This is to certify that the thesis entitled

AN OBSERVATIONAL STUDY OF THE RELATIONSHIP BETWEEN DIAGNOSIS AND REMEDIATION IN READING

presented by Annette B. Weinshank has been accepted towards fulfillment of the requirements for the DOCTOR OF PHILOSOPHY degree in EDUCATIONAL PSYCHOLOGY.

Major professor
Date November 19, 1979

AN OBSERVATIONAL STUDY OF THE RELATIONSHIP BETWEEN DIAGNOSIS AND REMEDIATION IN READING

By
Annette Barshefsky Weinshank

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree

DOCTOR OF PHILOSOPHY

Department of Educational Psychology

1980

ABSTRACT

AN OBSERVATIONAL STUDY OF THE RELATIONSHIP BETWEEN DIAGNOSIS AND REMEDIATION IN READING

By
Annette Barshefsky Weinshank

Purpose

This study empirically investigated the clinical problem solving behavior of reading specialists during a total of twenty-four observational sessions as they (1) diagnosed simulated cases of reading difficulty, (2) prepared an initial remediation plan, and (3) associated given remedial statements with diagnostic statements.
The central purpose of the study was to assess whether reading specialists' diagnoses lead directly to remedial recommendations within and across sessions for given simulated cases.

Procedures

Eight experienced, credentialled, practicing reading specialists participated in the study. Four received their training entirely in Michigan, and four entirely in Illinois. Each clinician performed a series of tasks using three simulated cases of reading difficulty over a period of at least one week. Two of the cases were thinly disguised versions of the same reading problem. The third represented a different reading problem.

During each of her three sessions, each subject was asked to:

1. reach a judgment about a case and write a diagnosis;
2. write an initial remediation plan;
3. number and code all key diagnostic statements and transfer them to a standardized diagnostic checklist;
4. number and code all key remedial statements and transfer them to a standardized remedial checklist;
5. give the number and code of the diagnostic statement or statements (if any) associated with a given remedial statement;
6. respond to questions dealing with associations and non-associations between remedial and diagnostic statements; and
7. comment on three free-response questions.

Major Findings

A small proportion of the total diagnostic and remedial statements/associations made accounted for agreement across two or more of the six sessions for a given case. The bulk of the statements/associations for a case were idiosyncratic, i.e., the statements/associations were made in only one session.

Across all cases a relatively small number of categories of diagnostic and remedial statements accounted for all statements made in two or more sessions.

Examination of common case information led neither to common diagnoses, common remediations, nor common associations between remediation and diagnosis.

Agreement between and within clinicians ranged from very little to none whatever.
Only by aggregating diagnostic and remedial statements/associations across the six sessions could the outlines of a meager consensus on each case be demonstrated.

At the individual clinician level, there was essentially no correlation between diagnosis and remediation.

At the group level, diagnosis and remediation showed a modest level of association.

Clinicians never followed their stated plans regarding information collection procedures and the writing of the diagnosis and remediation.

There was no difference in performance between the Michigan and Illinois subjects.

© Copyright by
ANNETTE BARSHEFSKY WEINSHANK
1980

Ben Zoma said, "Who is wise? He who learns from all men; as it is said, from all my teachers I have gotten understanding."
Avoth: Sayings of the Fathers, IV, 1.

To teachers from whom I continue to learn, Professors
Lee Shulman
John Vinsonhaler
Arthur Elstein
George Sherman,
this dissertation is dedicated.

ACKNOWLEDGEMENTS

I am indebted to a diverse band of colleagues and associates. The members of my doctoral committee, Professors Lee Shulman, John Vinsonhaler, Arthur Elstein, and George Sherman are exceptional. They delight in the work of scholar and teacher and have shared both roles freely. My graduate years have been a joy and delight thanks, in large measure, to their example and company.

The computer systems staff of the Clinical Studies Group of the Institute for Research on Teaching at Michigan State University, Chris Wagner and Francine Kitchen, were always ready to create or modify the computer programs necessary for my analysis of the data presented in this study. Ruth Polin, office manager for the group, and Debbie Salters, secretary, were always game for one more editing of computer printout or yet another round of data entry. The final thesis typing, that long-awaited event, was efficiently and competently done by Linda Raab.
This dissertation reports the results of an observational study, which means that people had to agree to act as subjects and be observed in the performance of certain tasks. It was my great good fortune to be able to work with eight reading specialists who were not only highly professional and competent practitioners, but who also were gracious enough to express interest in, and encouragement of, the work being done in the study.

A key requirement in the design of this investigation was the replication of the study using subjects who had received all their graduate training outside of Michigan. I was able to carry out this phase of the dissertation thanks to the generous help and cooperation of Professor Ahmed Fareed, chairperson of the Department of Reading, Northeastern Illinois University, Chicago, Illinois. Professor Fareed extended every courtesy to me during my data-collection weeks at Northeastern in May and June of 1979, making my stay both profitable and pleasurable. Very special thanks are in order, too, for members of his staff: Dr. Wayne Berridge, who let me take over his office for a number of observational sessions, and Ms. Dorothy Bacon, whose superb organizational skills contributed so much to the success of my visit.

Finally, I salute my family. My sons, Joshua and Joel, provided a fine corrective to the rigors of my doctoral program by remaining steadfastly more interested in their lives than in mine these past years. To my husband, Don--teacher, scholar, and boon companion--I offer singular thanks for always sharing, but never appropriating, my times of doubt and worry and my moments of success and excitement.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER ONE: THE PROBLEM
    Introduction
    Background of the Study
    Overview of the Study
CHAPTER TWO: AN HISTORICAL PERSPECTIVE ON THE ASSOCIATION BETWEEN DIAGNOSIS AND REMEDIATION IN READING: 1900-1979
    Introduction
    1890-1910
    1910-1925
    1925-1935
    1935-1950
    1950-1965
    1965-1979
    Background of the Current Study
CHAPTER THREE: DESIGN AND METHOD
    Introduction
    Research Setting
    Design of the Study
    The Subjects
    Method
    Data Analysis
    Agreement Statistics
    Additional Analytic Procedures
    Summary
CHAPTER FOUR: RESULTS
    Introduction
    Proportional Agreement
        (1) Diagnosis
        (2) Remediation
        (3) Cues
        (4) Remediated Diagnosis
        (5) Specific Remedial/Diagnostic Associations
    Characteristics of Unassociated Remedial and Diagnostic Statements
    Characteristics of Associated Remedial and Diagnostic Statements
    Commonality
    Intercorrelations
    Comparison of Michigan and Illinois Subjects on All Products
    Correlational Data
        (1) By Individual
        (2) By Group
    Clinician Self-Reports
    Comments About the Sessions
CHAPTER FIVE: DISCUSSION
CHAPTER SIX: SUMMARY AND RECOMMENDATIONS
    Introduction
    Summary of Procedures
    Summary of Results
        Proportional Agreement
        Commonality
        Intercorrelations
        Comparison of Subjects
        Correlational Data: Individual and Group
        Clinician Self-Reports
    Conclusions
    Recommendations
APPENDICES
    A. CASE INFORMATION INVENTORY
    B. OBSERVATIONAL STUDY INSTRUCTIONS
    C. CUE COLLECTION FORM
    D. SAMPLE DIAGNOSIS
    E. DIAGNOSIS CHECKLIST
    F. SAMPLE REMEDIATION
    G. REMEDIAL CHECKLIST
    H. REMEDIATION/DIAGNOSIS ELEMENT FORM: SESSIONS ONE AND TWO ONLY
    I. DIAGNOSTIC/REMEDIAL ELEMENT FORM: SESSION THREE ONLY
    J. DIAGNOSIS ONLY/REMEDIATION ONLY ELEMENT FORM: SESSIONS ONE AND TWO
    K. REMEDIATION AND DIAGNOSIS GUIDE SHEET
    L. REMEDIATION ONLY GUIDE SHEET AND RESPONSE FORM
    M. DIAGNOSIS ONLY GUIDE SHEET AND RESPONSE FORM
    N. CALCULATION OF PHI CORRELATION
    O. MASTER DIAGNOSIS AND REMEDIATION
    P. PROPORTIONAL AGREEMENT FIGURES
    Q. INTERCORRELATION RANGES
BIBLIOGRAPHY

LIST OF TABLES

Table
1. Overall Design of the Study
2. Summary of Contents of Sessions
3. Profile of Participating Subjects
4. Proportion of Diagnostic Statements Mentioned in Two or More Sessions
5. Proportion of Weakness Statements in Set of Diagnostic Statements
6. Agreed-Upon Diagnostic Statements Across Cases
7. Proportion of Remedial Statements Mentioned in Two or More Sessions
8. Proportion of Treatment Statements in Set of Remedial Statements
9. Agreed-Upon Remedial Statements Across Cases
10. Proportion of "Other/OTHER" Statements in Set of Checklist Statements
11. Proportion of Cues Mentioned in Two or More Sessions
12. Proportion of Requests for Test Booklets/Audio Tapes
13. Proportion of Remediated Diagnoses to Total Diagnostic Statements
14. Proportion of Remediated Diagnoses Mentioned in Two or More Sessions
15. Proportion of Remedial/Diagnostic Associations (Two or More Sessions)
16. Proportion of Treatment/Weakness Associations in Set of All Remedial/Diagnostic Associations
17. Proportions of Idiosyncratic and Agreed-Upon Statements Across Products
18. Summary Table: Proportion of Statements for Each Session Across Products
19. Total Number of Statements for Each Session Across Products
20. Unmatched Diagnostic Statements
21. Mean Commonality Scores for All Products
22. Mean Inter-Clinician Correlation
23. Partial Intercorrelation Matrix for Diagnosis
24. Mean Intra-Clinician Correlation
25. Partial Intracorrelation Matrix for Diagnosis

LIST OF FIGURES

Figure
1. Proportional Agreement: All Products

CHAPTER ONE
THE PROBLEM

Introduction

This dissertation was designed to answer one question: is remediation reliably related to diagnosis in reading? If the two were found to be closely related, then the almost constant assertion in the literature reviewed in this study, namely that diagnosis is an indispensable prerequisite to remediation, would be demonstrated empirically.
If, on the other hand, diagnosis and remediation were found to be unrelated, or minimally related, as has been suggested by a few contemporary researchers (Chapter Two), then it would be necessary to (1) determine in what ways the two activities were unrelated, (2) account for the findings, and (3) examine the implications of the findings for the continuation of current practices in the areas of both clinical care of children with reading difficulties and teacher education in reading.

Since its origin within the European medical community early this century (Chapter Two), the literature on remedial reading has persistently endorsed the necessity of obtaining two (dia) types of knowledge (gnosis) as a prerequisite for effective intervention: knowledge of symptoms and knowledge of underlying causes (Stein, 1973). Accordingly, even at present, training procedures for reading specialists continue to stress the necessity of gathering information from a variety of sources including information from schools, parents, agencies working with the family, health and developmental history, and analytic testing in a variety of areas including personality (CEMREL, 1979). Considerable resources of time and money are spent in reading clinics and schools for specialists to gather information about children, administer tests to them, and prepare diagnoses and suggested remediations. The major justification for all this activity would appear to be the stated conviction that effective remediation can proceed only after a thorough diagnosis has been prepared. The fundamental assumption, then, is that diagnosis and remediation are so strongly related that to proceed to treat in the absence of an eclectic and thorough diagnosis would be both ineffective and irresponsible.

What if this assumption is incorrect?
If diagnosis and remediation as presently executed are shown to be unrelated, serious issues would be raised about the usefulness of the approach taken by present programs devoted to training in the diagnosis and remediation of reading difficulties. The value of extended diagnostic testing sessions and data gathering activities in clinics would come under question. The role of the reading consultant in the schools would have to be reevaluated. In sum, demonstrating the inaccuracy of the assumption would provoke marked consequences throughout the educational system.

This study empirically investigated the clinical problem solving behavior of reading specialists as they examined information contained in simulated cases (SIMCASES) of reading difficulty in order to demonstrate whether or not any reliable relationship existed between diagnosis and remediation. Specifically, five questions were seen to be embedded within the major question posed:

1. How reliable, or consistent, were the diagnoses?
2. How reliable were the remediations?
3. How reliable were information collection procedures?
4. How reliable were diagnostic statements to which remediations had been attached?
5. How reliable were specific remedial/diagnostic associations?

Additionally, three related questions were to be addressed:

1. What was the strength of correlation between diagnostic and remedial performance?
2. Was there a discernible institutional training effect on performance?
3. How did the specialists themselves view the tasks of diagnosis and remediation as they engaged in those activities?

To obtain the necessary data, eight experienced, credentialled, practicing reading specialists, four from Michigan and four from Illinois, were asked to examine three simulated cases of reading difficulty in order to (1) make diagnostic judgments, (2) prepare initial remediation plans, and (3) associate given remediations with diagnostic statements. In all, 24 sessions were conducted.
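The reliability questions above turn on how often the same coded statement recurs across sessions for a case. As a minimal sketch only, assuming each session is reduced to a set of statement codes, the share of distinct statements mentioned in two or more sessions might be computed as below. The function name `proportional_agreement`, the codes such as "D01", and the sample data are hypothetical illustrations; the agreement statistics actually used in the study are those defined in Chapter Three.

```python
from collections import Counter


def proportional_agreement(sessions, min_sessions=2):
    """Share of distinct statements made in at least `min_sessions` sessions.

    `sessions` is an iterable of collections of statement codes, one
    collection per session. This is an illustrative definition, not
    necessarily the statistic used in the study.
    """
    counts = Counter()
    for statements in sessions:
        # Count each statement at most once per session.
        counts.update(set(statements))
    if not counts:
        return 0.0
    agreed = sum(1 for n in counts.values() if n >= min_sessions)
    return agreed / len(counts)


# Hypothetical coded diagnostic statements from three sessions on one case.
example_sessions = [
    {"D01", "D07", "D12"},
    {"D01", "D03"},
    {"D07", "D01", "D20"},
]

print(proportional_agreement(example_sessions))
```

In this made-up example, two of the five distinct statements recur in at least two sessions; the remaining three would be idiosyncratic in the sense used in the abstract, that is, made in only one session.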
Background of the Study

Studies conducted over the past four years by the Clinical Studies Group of the Institute for Research on Teaching have developed techniques for studying and measuring the diagnostic clinical problem solving behavior of (1) reading specialists (Vinsonhaler, 1979; Weinshank, 1978; Hoffmeyer, 1979), (2) learning disabilities specialists (Van Roekel, 1979), and (3) classroom teachers (Gil, 1979; Stratoudakis, 1979). These investigations were based on the Inquiry Project studies in medical problem solving conducted at Michigan State University (Elstein, Shulman, and Sprafka, 1978; Vinsonhaler, Wagner, and Molidor, 1975). The results of the medical studies led to the formulation of the Inquiry Theory of clinical problem solving (Vinsonhaler, Wagner, and Elstein, 1978).

The Inquiry Theory proposed that clinical diagnosis is determined by the interaction of (1) clinical memory (consisting of sets of problems, cues, cue values, diagnoses, treatment and the relationships among them), (2) clinical strategy (the sequencing of the mental tasks performed by the clinician), and (3) the case (Gil, Hoffmeyer, Van Roekel, and Weinshank, p. 7, 8).

According to the theory, adequate knowledge of the relationship between cues (case information) and problems is central to correct diagnosis, and adequate knowledge of the relationship between the same problems and their treatments is central to effective remediation. In other words, the case information gathered by the clinician should suggest likely problems and these, in turn, should be associated in memory with effective treatments. In short, according to this formulation, treatments are dependent upon the identification of problems. Put another way, remediation is dependent on prior diagnosis.
The central question of this dissertation was whether or not reading specialists' diagnoses did, in fact, lead directly and reliably to remedial recommendations as the Inquiry Theory predicts they should, and as the literature contends they must.

Overview of the Study

In Chapter Two, the growth of the field of remedial reading is traced from its origins early this century within the European medical community to the present. The variations on the still dominant diagnosis-treatment model are examined. The chapter concludes with a report of a pilot study (Weinshank, 1979), the results of which call into question the basic assumptions of the diagnosis-remediation model and establish the guidelines for the present study.

In Chapter Three, the research questions posed by the dissertation are reviewed. The research setting in which the data were collected is described, together with the design and the methods used for data gathering and analysis.

The results reported in Chapter Four speak directly to the sequence of questions posed at the beginning of Chapter One.

Speculations on some alternative explanations for the study's results are discussed in Chapter Five. Methodological limitations are also discussed.

A summary of the study and recommendations for further work are presented in Chapter Six.

CHAPTER TWO
AN HISTORICAL PERSPECTIVE ON THE ASSOCIATION BETWEEN DIAGNOSIS AND REMEDIATION IN READING: 1900-1979

Introduction

The idea that children having difficulty in reading must first be diagnosed before being given special teaching is a twentieth century phenomenon in the United States. The field of remedial reading as reflected in the literature has had at its base the stated belief that diagnosis and remediation are bound together. Deficiencies within the learner can be detected, their causes ascertained, and corrective action taken in order that a more desirable level of proficiency results.
The diagnosis-treatment model has directed both thought and practice in the field since its origin early this century within the European medical profession, when congenital alexia, or 'word blindness', was described as being the cause of retardation in reading (Nila Smith, 1967). Remediation consisted of training children in a technique called the alphabetic-spelling method.

The diagnosis-treatment model has undergone a number of permutations during the course of this century but the basic assumption underlying the model has remained intact: diagnosis (whatever the scope) and treatment (however conducted) must be related.

Nila Smith and Harris (1967, 1977) present a chronology of instructional and diagnostic-remedial eras in the teaching of reading during the twentieth century. The review of the literature presented in this chapter will be organized according to their chronologies.

1890-1910

The cultural influences during this period mandated that reading serve two main purposes: (a) that it enable the reader to gather knowledge from the author as printed, and (b) that the reader be able to reproduce and impart this knowledge to others in a clear and pleasing manner (N. Smith, Chapter 5). Expressive oral reading was taken as an indicator that the reader was accomplishing both those goals and by so doing was developing literary appreciation.

Children taught by the word method during initial reading instruction were judged as not being able to read well in the upper grades primarily because, it was argued, they had not become independent word callers. The result was the adoption by some writers of an extremely synthetic phonics system which began not with the whole word but rather with the sounds of the letters. The program of instruction was hierarchically arranged from letters to syllables and from phonetically regular to phonetically irregular words.
1910-1925

A startlingly abrupt reversal from oral to almost exclusively silent reading occurred during this period. The new emphasis was on meaning-getting in all phases of education. For the first time, research results were instrumental in changing reading instruction: laboratory studies indicated that silent reading was superior to oral reading both in speed and comprehension. Thus, the emphasis on oral reading as the indicator of accurate acquisition of knowledge was swept aside along with the extensive phonics training which was believed necessary for achieving expressive oral reading. Experience charts were introduced and instruction focused on reading sentences, phrases, and words. The overwhelming emphasis was on skillful silent reading for meaning, with comprehension being measured in a variety of ways.

During this period of 1910-1925, the term "remedial reading" first appeared. A paper by Uhl (1916) was apparently the first dealing with the idea that children having difficulty in reading could be diagnosed and given special teaching in a school setting. Uhl reported that each student received individual help 15 minutes daily for two to three weeks, with treatment "continued or modified during the training period according to the apparent needs of the pupils...While the number of pupils in each of the grades is small, yet the presence of a large relative gain in each grade upon the part of those drilled seems significant."

Uhl drew a number of conclusions from the results of the study: (1) an accurate diagnosis is such that the specific defects of poor readers can be detected; (2) carefully planned individual treatment will produce as rapid growth as is produced in the case of apparently brighter students by class instruction; and (3) drill should be carried on during the school day by those who are experienced in teaching the subject.
The Twenty-Fourth Yearbook of the National Society for the Study of Education (Whipple, 1925) reflected the concern with reading deficiency uncovered by the Army's tests of recruits during World War I. The Yearbook firmly endorsed the position that, "Silent reading is of largest social value...It is evident that there is need of vigorous emphasis in any program of instruction of habits of intelligent silent reading." Nevertheless, the authors cautioned that in beginning reading instruction and in diagnostic and remedial cases, certain "habits" must be given specific attention. These include recognition of sentences as units of thought, recognition of words and groups of words, and recognition and interpretation of typographical devices such as punctuation, paragraphing, etc.

The objective of remedial work in the opinion of the Yearbook's authors was the removal of a deficiency by first analyzing the particular causes of that deficiency. Their aim was to encourage teachers to make a systematic attack on the problem cases in their own classes. Individual examination via standardized tests, informal tests, and personal interview would hopefully reveal fundamental attitudes and causes of deficiency. Specific remedial measures attacking the cause of deficiency would be formulated and initiated, followed by measurement of progress and finally adjustment of work to changing needs until the deficiency was removed.

In the field generally during this period diagnosis usually involved the compilation of a case history (school, home, medical), the administration of standardized reading tests, and observations of such motor factors as eye-movements, vocalization, extraneous bodily movements, and breathing.
Remedial measures were of three different types: (1) psychologists were using special devices, including the kinesthetic method for teaching phonics; (2) medical people were using the alphabet-spelling method for "dyslexiacs"; and (3) educators concerned with instruction for retarded readers in the classroom were using a variety of procedures to improve oral reading, silent reading, word recognition, and rate. They also emphasized methods designed to remedy inadequate eye movements, extraneous bodily movements, and improper breathing. However, authors were criticized for making only general suggestions for diagnostic problems and not reporting remedial techniques in enough detail to make them useful. The overall conclusion was that remedial methods had not kept pace with diagnostic methods (Harris, 1967).

Nevertheless, the diagnosis-treatment model itself remained undisputed. The question of which, if any, diagnostic findings actually impacted the course of treatment was never raised.

1925-1935

Remedial reading was a chief subject of study during this period. (1) Educational clinics began to be established because it was perceived that the needs of many poor readers could not be uncovered using classroom diagnosis. (2) Batteries of tests for use in diagnosing reading difficulties were developed. These included the Gates Reading Diagnosis Tests and the Durrell Analysis of Reading Difficulty. (3) Theories of organic causes of reading disability emerged including visual, auditory, and lateral dominance factors, heredity, mixed dominance of cerebral hemispheres, inadequate mental ability, and abnormal emotional tendencies (Harris, 1967).

Interestingly, during this period of continued emphasis on meaning and subordination of phonics, several major research reports dealing with severely disabled readers (Chall, 1967) concluded that the one characteristic common to almost all poor readers was word recognition and analysis errors.
Thus, although theories of causation differed, the recommended remedial training focused on systematic phonics programs, often augmented by kinesthetic and motor aids.

1935-1950

While the cultural emphasis in reading shifted during this period (reading was now seen as necessary for living effectively in a democracy), the use of varied instructional approaches in reading continued, as did the interest in reading disability.

A major effect of World War II on the field was the (re)discovery that thousands of military recruits could not read well enough to follow even simple printed instructions. Concurrently, a number of investigations revealed reading deficiencies in large numbers of high school and college students.

Diagnostically, the major change according to Harris (1967) was the change from an organic-causation to a multiple-causation theory of reading disability. This pluralistic view of causation placed an emphasis primarily on visual, social, and emotional problems and secondarily on inappropriate teaching methods, neurological difficulties, and speech and auditory difficulties.

Developments in remediation included: (1) the use of informal diagnosis with basal readers; (2) establishment of large scale remedial programs in public schools, spreading from elementary schools to secondary schools and universities; (3) increase in the number of remedial reading teachers; (4) increase in the number of new books on the prevention and correction of reading difficulties; (5) development of mechanical reading pacers; and (6) experimental support for the view that motivated practice resulted in as much improvement as extensive use of special mechanical devices (Harris, 1967).

As was the case for disabled readers during the silent reading period, research published during this period of limited reintroduction of phonics also reported that backward readers were unable to identify and analyze words (Chall, 1967).
In these reported cases training was designed to overcome this specific instructional deficiency.

Robinson (1946) paid little attention to phonics, or decoding, versus meaning issues. She used a team approach to diagnosis and remediation, incorporating the findings of social worker, psychiatrist, speech correctionist, otolaryngologist, endocrinologist, and reading specialist. The findings on seriously disabled readers emphasized causal factors relating to maladjusted homes, visual anomalies, and social/emotional problems. Incorrect reading methods were fourth in importance. Other causal factors considered to be of lesser importance were speech/auditory functioning, endocrine problems, neurologic impairment, and general physical condition.

Robinson's classic study summarized the treatment and follow-up provided for each student in the study based on what were considered to be probable causes of the disability. Almost all these seriously deficient students were reported to have made gains in social and emotional adjustment. Progress in reading was variable, ranging from no progress to excellent progress. Robinson commented that:

Many of the anomalies in allied fields may be discovered and remedied without appreciable growth in reading. This is because such remediation only prepares the child for learning to read and does not teach him the skill. A direct and vigorous reading program must follow correction of causal factors...In many instances, it is probable that an enthusiastic, capable teacher can motivate a child to learn to read even though he has inhibiting difficulties, although much less time and effort might be needed if the inhibiting factors were corrected prior to learning (pp. 236-237).

Robinson's study provided powerful support for the idea of multiple causality, and chiefly social-emotional causality, of reading difficulty.
Instruction, particularly in severe cases, was seen as necessarily following the removal of deficits in allied areas. While the usefulness of this approach for most children in most classrooms was highly debatable, preference for emphasis on treatment of reading problems by some form of therapy continued for a decade after the study was published.

1950-1965

Smith saw two major influences as shaping reading instruction during this period: expanding knowledge in all areas of inquiry and the technological revolution. Extraordinary concern about the teaching of reading emerged. Reading was seen as the crucial skill needed to survive in a technologically complex world. No single method of instruction held sway, but over-emphasis on phonics continued to be discouraged and emphasis on reading for meaning was reiterated. Also, systematic instruction in comprehension skills was advocated, as well as instruction in reading to suit the requirements of unique content areas.

Interest in reading disability persisted during this period and numerous changes took place. Rudolph Flesch (1955) stated flatly that children couldn't read because they were being taught initially by the whole word method instead of the phonic method. He argued that the prerequisite for mature reading was mastery first of the mechanics of decoding and claimed that all research studies to that time supported that view. Flesch argued, in effect, that mature reading and learning to read were two different activities and that reading instruction was a fiasco because the word method forced children to grasp words as if they were experienced readers.

Flesch claimed that strict attention to phonics was the only sensible way to conduct remedial as well as initial reading instruction and devoted the second half of his book to exercises that presented one, two and three syllable words in a carefully graded hierarchy, with each new level deriving from sound-symbol associations learned earlier.
Harris believed that Flesch's book was beneficial to the field of remedial reading. Flesch had convinced hundreds of thousands of parents that when Johnny had trouble with reading he was not necessarily stupid, and this led to public pressure both for improved developmental reading programs and for more diagnostic and remedial facilities (Harris, 1967).

Nevertheless, the vast majority of clinicians and reading specialists continued to subscribe to the multiple-causation theory of reading disability, using a variety of diagnostic tests, materials, and methods. Many investigators specializing in disciplines other than teaching (sociologists, psychologists, physiologists, etc.) made increased efforts to obtain information related to reading disability. Medical researchers studied the effects of hormonal imbalance and the results of administering certain medication to disabled readers, and the congenital word blindness theory re-emerged as one possible cause of reading disability (not, as earlier, the only one). The emphasis on treating reading disability by some form of psychotherapy was gradually replaced by renewed emphasis on physiological, neurological, and constitutional factors.

The renewed emphasis on phonics, however, also resulted in an outpouring of new remedial (as well as developmental) programs, including the initial teaching alphabet, materials based on the same principles advocated by Flesch, programmed materials, programmed tutoring, and specific phonics systems. Overall, however, remedial measures again lagged behind diagnostic techniques. Additional training in perception, more attention to vocabulary and phonics, and better parental guidance accounted for most of the newer emphases.
1965-1979

Amid the unresolved controversies concerning (1) the efficacy of various modes of instruction, (2) the validity of diverse causal factors, and (3) the long-term effectiveness of various remedial approaches, the Federal Government entered the scene. In 1965 Congress passed the Elementary and Secondary Education Act (ESEA), through which millions of dollars were earmarked specifically for remedial reading for disadvantaged children. The new cultural influence held that the ability to read provided the key to economic survival. Thousands of new positions in remedial reading were created, and an immense program of diagnostic and remedial services was inaugurated.

The instructional mode since 1965 could best be described as eclectic: basal, phonics, and language experience systems, individualized programs, trade books using controlled vocabularies, etc. So diverse was the instructional scene that Natchez (1968) edited a large volume specifically excluding discussion of any instructional systems, believing that, "There is no one method with which to deal best with the variety of problems pertaining to reading difficulties, but that effective treatment stems from the teacher's own interpretations of his chosen theoretical framework." The section on causation of reading disorder centered once again on emotional, neurophysiological, and cultural factors alone. The chapters dealing with diagnostic considerations presented some divergent positions, but all agreed that a differential diagnosis must take into account the possibility of emotional, neurophysiological, language, motor, or perceptual disorders as having possible bearing on the reading disability.

In the final section of Natchez' book, the authors dealt with classroom treatment. Without exception they had nothing whatsoever to say about the myriad causal factors and diagnostic considerations presented earlier in the volume.
Rather, aside from routine statements about the necessity of attending to the motivation and personal interests of the child, the authors described techniques or presented rationales for improving word attack and comprehension skills so that (1) reading could become pleasurable, and (2) subject matter areas could be mastered. Even those chapters which dealt with severe disability did little more than recommend some elaboration of basic classroom techniques: (1) use of certain kinesthetic methods; (2) allotment of more time-to-mastery for word analysis and blending skills; and (3) provision of adjunctive services in psychotherapy.

The blatant discrepancy between diagnostic categories and treatment options escaped any mention whatever. The ritual of stating that thorough diagnosis must precede treatment was followed, no matter that even on the face of it the remedial suggestions presented made vanishingly little use of any of the information in the prior diagnosis.

At the same time that Natchez was urging teachers to approach diagnosis and remediation from the point of view of their own theoretical framework, a new theoretic framework for approaching reading instruction was being proposed, a theory derived from the study of the interrelationships between thought and language: psycholinguistics. Goodman (1969) wrote that:

The reader is viewed as a user of language who processes three kinds of information, grapho-phonic, syntactic, and semantic, as he reacts to the graphic display on the page. In comparing unexpected responses, the psycholinguistic reading process is revealed.

In practice this meant that if an oral reading error preserved the sense of the message, even if the word was a variant of the actual graphic display on the page, the reader had not really made an error. Reading for meaning was paramount; code-breaking was secondary.

By 1973, F.
Smith presented what he called "three radical insights": (1) only a small part of the information necessary for reading comprehension comes from the printed page; (2) comprehension must precede the identification of individual words; and (3) reading is not decoding to spoken language.

The instructional implications of this view were elaborated by Goodman (in F. Smith, 1973). Children were not to be put through a prescribed sequence of objectives or materials.

The child is already programmed to learn to read. He needs written language that is both interesting and comprehensible, and teachers who understand language-learning and who appreciate his competence as a language-learner.

Holmes (in F. Smith, 1973) put the matter succinctly: "Text can be comprehended only if it is read for meaning in the first place; reading to identify words is both unnecessary and inefficient."

Smith (1978) was disinclined to make any specific recommendations about reading instruction.

The question of the best method for teaching reading has, I think, not been answered because it is not the appropriate question; children do not learn because of reading programs but because teachers succeed in preventing programs from standing in children's way. It is the wisdom and intuition of teachers that must be trusted, provided that teachers have the theoretical foundation to reflect upon the decisions that only they can make.

Despite Smith's (and Natchez') reliance on the wisdom of the properly trained practitioner, and despite the addition of psycholinguistics to the armamentarium of approaches to reading instruction, the number of reading disabled children continued to climb during the period 1965 to the present (Satz, 1976; Guszak, 1978).

Once again in the remedial reading field there continued to be a disinclination, among many writers, to regard quality of instruction as a causal factor.
It should be kept in mind that numerous specialists outside the field of reading had become involved in the search for causal factors believed to originate within children and their environment. The field had defined and populated itself in a way that made other causes of reading difficulty (such as quality of instruction) appear as perhaps less fruitful areas of inquiry. Accounts of the causes of reading disabilities (Harris, 1977) continued to focus on multiple causation, including physiological, social, and neurological components, while the learning disabilities movement emphasized deficiencies in one or more of the basic psychological processes involved in understanding or using spoken or written language in children of normal intelligence, physical condition, environment, and emotional make-up.

Satz (1976) wrote, "The incidence of learning disorders is not decreasing, in fact, it may be increasing. Moreover, the effectiveness of current remediation or intervention programs is quite variable, if not discouraging." Satz believed there was an urgent need for early detection and intervention within a longitudinal context that incorporated both short and long-term gains. However, Harris (1977) pointed out that it is not yet known whether any of the screening procedures used specifically to predict reading failure is more accurate in predicting failure than a good group reading test. He further considered that the number of false positives (children incorrectly labelled as needing help at a future date) would be unacceptably high.

Yet another set of psychological processes that may cause reading disability was advanced by Downing (1973). He saw the motivation of the literacy learner as being of such importance that arguments could be made for making motivation the chief factor in success and failure in learning to read. "Clearly the place of literacy in a culture's priorities in education and the different ways in which these are implemented are bound to exert different motivational pressures on the learner" (p. 81). He pointed out that in the United States and France referrals of boys far exceed those of girls, but in Germany, Nigeria, and India the reverse is true. Downing attributed these disparities to cultural expectations for younger children.

At the very least, it can be concluded that if there are any innate constitutional differences between girls and boys that affect their development of language and reading skills, they can be outweighed by other factors, as they must have been in countries like Germany, Nigeria, and India (p. 110).

Finally, Downing made reference to the disagreement among experts internationally as to what is meant by the terms "reading" and "literacy." Elgin (1978) maintained that literacy is an attitude toward reading which sees the absence of books as being an unhappy event in one's life. By that criterion, she argued, the United States is an illiterate society. If Downing's stress on motivation is accurate, and the purpose of reading in this country is not for personal enhancement, pleasure, or solace, then this would constitute an important causal influence on the attitude toward reading of students and teachers alike.

The current statistics on reading disability in the United States are of awesome proportions: 15 percent of American schoolchildren (eight million) are classified as having severe reading problems (Satz, 1976). Twenty percent of the adult population (23 million) are functionally illiterate, unable, for example, to read newspaper help-wanted ads. Less than half of the nation's total adult population aged 18 to 65 (55 million) are really proficient in reading skills (Guszak, 1978).

"The present state of affairs is such that there can be no assurance that a diagnosis will be accurate or that remedial instruction will be sufficient to meet a child's needs" (Satz, 1976).

In a statement that could have been inserted into Uhl's article or the twenty-fourth NSSE Yearbook, Rutherford (1972) wrote:

When a teacher explicates a child's reading problem in terms of reading skills that the child does and does not possess, and types of reading activities that he can and cannot perform, then the teacher has attained the desired diagnostic level--the prescriptive level.

Rutherford did not discount the possibility that other causal factors (psychodynamic, sensory, neurological, etc.) were involved. He stressed, though, that, "Diagnosis of reading problems is useful to the classroom teacher only to the extent that it suggests what he can do to improve the child's reading performance." This reappearance of diagnostic and prescriptive reading instruction represents the most recent attempt to preserve the assumptions of the diagnostic-prescriptive model, namely that diagnosis directly informs the remedial process.

For the most part, the recent texts and periodic literature that have dealt with the diagnostic-prescriptive model have finessed the problem of which approach to instruction might be "best." They have emphasized instead that, whatever the approach, there will be children who don't progress adequately and that it is the classroom teacher who is responsible for correcting the situation. For example, Harvey (1974) and Bond (1976) both emphasized the role of the teacher and the teacher's knowledge about how to modify instruction as the key elements in the successful teaching of reading.
Guszak (1978) assumed that if the diagnostic reading teacher knew the various reading skills (word recognition, comprehension, fluency), had a command of motivational techniques, and could determine accurately pupil possession of such skills, he or she had arrived at the point of prescribing instruction that would develop new behaviors. Guszak considered diagnosis without specific prescription to be a wasted experience.

Other writers continued to emphasize the role of the teacher and to de-emphasize the prominence of extended diagnosis per se. Rosenshine (1970) and Mallaragno (1974) made very limited use of diagnostic evaluations. They both described programs in which it was assumed that children will improve in reading if given systematic, intensive instruction in skill development in both decoding and comprehension. Similarly, Stauffer, Abrams, and Pikulski (1978) believed that only when a well designed instructional plan was unsuccessful should a more comprehensive diagnostic evaluation be considered.

A more moderate position vis-a-vis the usefulness of diagnosis was found in literature that referred to levels of diagnosis. Bond and Tinker (1967) and Ekwall (1976) believed that diagnosis must be efficient, going only as far as necessary. The remedial teacher studies the diagnostic findings and arranges a learning situation that will enable the student to progress more rapidly than had been the case in the past.

Dechant (1969) described therapeutic diagnosis as being concerned with the present state of the child in order to determine correctable conditions in the child's reading. On the other hand, he was more skeptical of the usefulness of etiological diagnoses, believing that searches for causes of present difficulty were often useless for purposes of formulating a remedial program.

Carter (1970) saw two major approaches to diagnosis and treatment: clinical and corrective.
Both are matched to treatment, but the corrective approach is classroom based and does not take into account causal factors. The clinical approach includes causal factors. This dichotomy is similar to Dechant's but is more sympathetic to the usefulness of uncovering etiological factors in reading disability.

Harris (1977) pointed out that skill development via the diagnostic-prescriptive model has brought with it a variety of management schemes. Objectives are stated, pretests and posttests are developed for each objective, and instructional materials are prepared. A child's assignments are based on pretest performance. If posttest mastery is not demonstrated, additional assignments are given. Harris cautioned that:

The temptation to stress highly specific goals which are easy to test sometimes distorts the program into a great over-emphasis on decoding. Much time is probably wasted in rigidly following the idea of a pretest for every objective. There is a strong temptation to go by the number of correct answers and not inquire how or why the child made his errors, so that diagnostic thinking is at a very low level.

Nevertheless, Harris appeared to be convinced that an effective diagnostic-prescriptive program would result in fewer children requiring outside help. The reading specialist would act primarily as a resource person and helper to the teachers and would also work with the few children who required intensive and expert help.

During this same time of 1965 to the present, however, a few authors questioned the existence of any significant relationship between diagnosis and remediation, whether the causal factors were seen as being neurological, psychological, instructional, etc. Strang (1969) wrote, "Research on the diagnostic process itself is practically non-existent. There is little evidence that the recommended remedial techniques, if used, would be effective, or that techniques not recommended would get equally good results."
Spache (in Newman, 1968) contended that there was still widespread lack of integration between the two processes of diagnosis and remediation and stated that, "Numerous reports of remedial work give evidence that the procedures used are not directly related to the detailed diagnostic findings."

Bateman (1971) asked:

Was the diagnosis a necessary and sufficient prerequisite to the remediation? Might other remediation, not derived from the diagnosis, have been equally successful? A child is diagnosed but the remediation is not successful. Was the diagnosis inadequate, or was an error made in deriving the remediation? In any case, it is difficult to establish a valid relationship between diagnosis and remediation.

She concluded:

If the diagnostic-remedial approach is found valid, what then is the relationship between diagnosis and remediation? Is effective remediation geared to the child's strengths, weaknesses, neither or both? Almost all have tacitly agreed that diagnosis and remediation ought to be related, either directly (teach to the strengths) or inversely (teach to the weakness). Unfortunately, there is as yet no direct evidence to support the efficacy of such a matching procedure.

Robinson's study still remains the only long-term investigation in which diagnosis of causal factors in reading disability (primarily emotional) was associated with specific treatment (primarily psychotherapeutic) and the results monitored over a long period of time. She demonstrated, in my view, that treating people with reading problems by providing professional psychological support resulted chiefly in enabling these people to cope with their lives more adequately but did not necessarily result in their becoming better readers.
Similarly, when 25 years later auditory/visual perceptual deficiencies were seen as causing reading disabilities, children remediated with direct teaching of perceptual skills increased their scores on tests that measured this ability but did not increase their reading scores over those of a control group (Guthrie and Siefert, 1978).

Not only in the field of reading was the tacit agreement that diagnosis and remediation are closely related sometimes questioned. Kendell (1975) wrote that:

In most branches of medicine the value of diagnosis is never questioned. Its importance is self-evident because treatment and prognosis are largely determined by it...Where mental illness is concerned, the situation is rather different. The therapeutic and prognostic implications of psychiatric diagnoses are relatively weak, and the diagnoses themselves are relatively unreliable.

Background of the Current Study

The diagnostic-treatment model in remedial reading has been the dominant one since the beginning of the field. The explanations of what the two tasks ought to encompass (e.g., multiple causality/treatment in allied fields; instructional causes/skill training, etc.) have varied over the years, but the underlying assumption has not: diagnosis is an absolute prerequisite to treatment. There have been researchers who questioned this assumption, but until now no empirical evidence has been available either to support or to challenge it.

A first empirical attempt to examine the extent of the relationship between diagnosis and remediation was a pilot study (Weinshank, 1979). It utilized data gathered during an observational study (Vinsonhaler, J., 1979) in which experienced clinicians were asked to write a diagnosis and remediation after examining three simulated cases (SIMCASES) of reading difficulty over a three week period. Two of the cases were superficially disguised replicates of one another.
The major finding was that there was, on the average, a very slight positive association between problems listed in diagnosing the two essentially identical cases. The average correlation for replicated cases was 0.14 (range of 0.00 to 0.30). This measure of consistency represented the degree of reliability with which a clinician arrived at the same diagnosis for the same case.

The major results of the pilot study were:

1. There was a 0.40 correlation between problems stated in a given diagnosis and treatments proposed in a given remediation. The two acts were considerably more independent than had been expected.

2. A very small number of remediations was used repeatedly across cases, seemingly irrespective of prior diagnosis.

3. Clinicians who were more reliable diagnostically also tended to be more reliable in terms of remediation plans.

Closer examination of the written diagnoses from the 1977 observational study revealed that not all diagnostic statements were statements of problems. Diagnoses could be broken down into three sub-categories: (1) problems (or weaknesses), e.g., cannot apply phonetic analysis independently; (2) strengths, e.g., uses context well; and (3) observations, e.g., he's the middle child in the family.

Closer examination of the written remediations revealed that not all remediations were statements of treatments, just as not all diagnoses were statements of problems. Remediation plans were composed of:

1. Problem (re)statements, e.g., his verbal ability indicates he will have difficulty in learning to read.

2. Statements of strength, e.g., has high average ability and some basic skills.

3. Observations, e.g., he seems extremely interested in sports.

4. Treatment statements, e.g., help the student become familiar with cv/vc syllabication rule.

5. Problem-treatment statements, e.g., help reversal problem by pointing out differences in similar words.

6. Treatment-prescription statements, e.g., build sight vocabulary by using flashcards, word blanks, beat-the-clock, and by incorporation into a short story.

7. Prescription statements, e.g., use language experience stories.

The four major remediation categories, then, were (1) problems, (2) strengths, (3) observations, and (4) treatments.

A number of hypotheses were generated by the results of the pilot study:

1. Diagnostic statements of weakness (problems) are repeated more reliably across sessions for a case than are statements of strength and observation.

2. Statements of treatment are more reliably associated with diagnostic problems than with strengths or observations.

3. Diagnostic statements to which remediations are attached are repeated more reliably across sessions for a case than are the unremediated diagnostic statements.

Inconsistency in the association of treatments in the initial remediation plan with problems in the diagnosis was hypothesized to arise from two sources:

1. The clinician does not associate a problem in the diagnosis with any remediation. Possible explanations are that the clinician forgot to include a treatment for the problem and/or the clinician doesn't know of any appropriate treatment and/or treatment is not called for.

2. The clinician does not associate a treatment in the remediation with any problem in the diagnosis. Possible explanations are that the clinician never articulated the problem in the diagnosis and/or always uses this treatment irrespective of diagnosis.

The present study was designed to test the observations and hypotheses generated by the preliminary results of the pilot study and to provide more definitive information about the connections clinicians make between remediation and diagnosis.

CHAPTER THREE

DESIGN AND METHOD

Introduction

The aim of this study was to determine whether there was a reliable relationship between diagnosis and remediation in reading.
Preliminary data collected during a pilot study (Chapter Two) suggested that only a modest relationship existed between the two tasks. In order to replicate and account for the observed lack of correspondence, a series of precise questions and hypotheses was formulated for this investigation.

1. How reliable, or consistent, are clinicians' diagnoses? Data from earlier studies conducted by the Clinical Studies Group (Chapter One) showed that the reliability of reading clinicians' diagnoses was very low.

Hypothesis: Diagnostic statements of weakness (problems) are repeated more reliably across sessions for a case than are statements of strength and observation.

2. How reliable are their remediations? The pilot study indicated that only a small number of treatments was used repeatedly. Might remediation be somewhat more reliable than diagnosis?

3. How reliable are information collection procedures? That is, did the clinicians tend to choose the same case information repeatedly?

4. How reliable are diagnostic statements to which remediations are attached? That is, even if over-all diagnostic reliability were shown to be low, was increased reliability demonstrated for those particular diagnostic statements which were labeled as requiring remediation?

Hypothesis: Diagnostic statements to which remediations are attached are repeated more reliably across sessions for a case than are the unremediated diagnostic statements.

5. How reliable are specific remedial-diagnostic associations? In other words, are similar diagnoses paired repeatedly with similar treatments?

Hypothesis: Statements of treatment are more reliably associated with diagnostic problems than with strengths or observations.

6. What is the correlation between diagnostic and remedial performance? The pilot study indicated that those clinicians who exhibited relatively higher reliability in their diagnostic performance also tended to have higher remedial reliability.

7. Does institution attended affect performance? Do clinicians trained in institutions widely separated geographically perform the task of diagnosing and remediating reading difficulties in significantly different ways?

8. How do the reading specialists themselves view the tasks of diagnosis and remediation? Does their performance on the tasks presented to them during the study conform to their own descriptions of what they do?

Specifically, the procedures were as follows:

1. Categorization and standardization of diagnostic statements were obtained by having the clinicians transfer their written diagnoses to a standardized diagnostic check list.

2. Categorization and standardization of remedial statements were obtained by having the clinicians transfer their written remediations to a standardized remedial check list.

3. Associations between remedial and diagnostic statements were obtained from the clinicians.

4. Specific information was obtained about: (a) matched diagnostic-remedial statements; (b) unmatched remedial statements; and (c) unmatched diagnostic statements.

Research Setting

The observational sessions were conducted in a small room containing chairs and a table. The session involved the experimenter and the reading specialist (clinical subject) together with the simulated case of reading difficulty and attendant data-gathering materials. The experimenter's major tasks were to oversee the entire session, including providing necessary directions, presenting all simulated case materials requested by the subject as well as providing other material required for data collection, audio recording selected portions of the sessions, and asking questions of the clinical subject at specified times during the session. The clinical subject's major tasks were to collect information concerning the case as she normally would, prepare a written diagnosis and remediation, and explicitly pair remediation with prior diagnoses.
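The pairing task described above yields the three kinds of records named in the procedures: matched diagnostic-remedial statements, unmatched remedial statements, and unmatched diagnostic statements. The following is a minimal sketch of that sorting step, not the study's actual data-handling procedure. It assumes that check list items can be represented by short string codes and that each association is recorded as a (remedial, diagnostic) pair; the codes, the function name summarize_pairings, and the PairingSummary record are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairingSummary:
    """Hypothetical record of one session's remedial-diagnostic associations."""
    matched: frozenset               # (remedial, diagnostic) pairs stated by the clinician
    unmatched_remedial: frozenset    # remedial items paired with no diagnostic item
    unmatched_diagnostic: frozenset  # diagnostic items paired with no remedial item


def summarize_pairings(diagnostic_items, remedial_items, pairs):
    """Sort one session's check list items into matched and unmatched groups."""
    diagnostic_items = set(diagnostic_items)
    remedial_items = set(remedial_items)
    # Keep only pairs whose two ends both appear on the session's check lists.
    valid = {
        (r, d) for r, d in pairs
        if r in remedial_items and d in diagnostic_items
    }
    paired_remedial = {r for r, _ in valid}
    paired_diagnostic = {d for _, d in valid}
    return PairingSummary(
        matched=frozenset(valid),
        unmatched_remedial=frozenset(remedial_items - paired_remedial),
        unmatched_diagnostic=frozenset(diagnostic_items - paired_diagnostic),
    )


# Illustrative codes only: D* are diagnostic items, R* are remedial items.
summary = summarize_pairings(
    diagnostic_items=["D1", "D2", "D3"],
    remedial_items=["R1", "R2"],
    pairs=[("R1", "D1")],
)
print(summary)
```

In this illustration R2 would count as a treatment attached to no stated problem, and D2 and D3 as problems left without a treatment, which correspond to the two hypothesized sources of inconsistency.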
Design of the Study

The design of the study called for random assignment of subjects to SIMCASES. Each simulated case was based on a real child who was served by the Reading Clinic at Michigan State University. The problems represented in the four cases (all four children were male) were considered to be representative of those commonly encountered in the public schools. For each SIMCASE a cue inventory was provided listing the data (cues) available for that case (Appendix A). Each SIMCASE had its equivalent form--a superficially disguised replicate of the original prepared by making minor changes in the original data base and randomly re-ordering the cue inventory (Lee and Weinshank, 1978).

The data in any given SIMCASE included information about achievement tests, family background, cognitive ability, group and individual reading diagnostic measures, classroom information, etc. The information was presented in a variety of formats: test booklets, audio tapes, examiner's comments, and test scores. All cases contained a taped audio interview with the student, a brief statement of the reason for referral to the Reading Clinic, and an artist's sketch of the child based on hearing a taped interview.

The overall design of the study--random assignment of subjects to cases and cue inventory forms--is shown in Table 1.

Table 1: Overall Design of the Study

Teacher   Session 1     Session 2      Session 3
A         Case 1, F.    Case 4, F.1    Case 1 (Replicate) F.2
B         Case 2, F.    Case 3, F.1    Case 2 (Replicate) F.2
C         Case 3, F.    Case 1, F.2    Case 3 (Replicate) F.1
D         Case 4, F.    Case 2, F.2    Case 4 (Replicate) F.1
E         Case 4, F.    Case 2, F.2    Case 4 (Replicate) F.1
F         Case 3, F.    Case 1, F.2    Case 3 (Replicate) F.1
G         Case 2, F.    Case 3, F.1    Case 2 (Replicate) F.2
H         Case 1, F.    Case 4, F.1    Case 1 (Replicate) F.2

F.1 and F.2 indicate which of the two forms of cue inventory was used.

Table 2 summarizes the content of each session. A minimum of one week separated sessions one and three.
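The balance of the design in Table 1 can be checked with a short script. This is only a sketch: the dictionary transcribes the case numbers from Table 1 (cue inventory forms are left out), and the names SCHEDULE and sessions_per_case are illustrative, not part of the study's materials.

```python
from collections import Counter

# Cases worked on in Sessions 1 and 2 by each teacher, transcribed from Table 1.
# Session 3 always uses the replicate of the Session 1 case, so it is not stored.
SCHEDULE = {
    "A": (1, 4), "B": (2, 3), "C": (3, 1), "D": (4, 2),
    "E": (4, 2), "F": (3, 1), "G": (2, 3), "H": (1, 4),
}


def sessions_per_case(schedule):
    """Count how many of the sessions are devoted to each case."""
    counts = Counter()
    for first_case, second_case in schedule.values():
        counts[first_case] += 2   # Session 1 plus its replicate in Session 3
        counts[second_case] += 1  # Session 2
    return dict(sorted(counts.items()))


print(sessions_per_case(SCHEDULE))  # {1: 6, 2: 6, 3: 6, 4: 6}
```

Under this transcription each case receives six of the 24 sessions, which agrees with the case-by-case analysis (six sessions for each case) described under Data Analysis.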
Table 2: Summary of Contents of Sessions

Session 1   Work with one of the four original SIMCASES
Session 2   Work with one of the remaining three original SIMCASES
Session 3   Work with equivalent form (replicate) of the Session 1 SIMCASE

The Subjects

The eight subjects in the study (all females) were teachers who had an earned master's degree in reading and a minimum of four years of reading-related teaching experience.

In order to account for the possible influence of academic training on diagnostic/remedial performance, two groups of specialists were chosen: those who received their entire undergraduate and graduate training at universities in Michigan and those who received their entire undergraduate and graduate training at universities in the Chicago, Illinois area. There were four reading specialists in each group.

Since data collection was a lengthy procedure requiring three three-hour sessions for each subject, all teachers received payment of $90.00 for their participation in the study.

The 12 sessions with the Chicago-area subjects were conducted over a three-week period at the Department of Reading, Northeastern Illinois University, Chicago. The 12 sessions with the greater Lansing area subjects were conducted at the College of Education, Michigan State University, East Lansing.

None of the clinicians was familiar with the previous work conducted by the Clinical Studies Group of the Institute for Research on Teaching.

A profile of the teachers who participated in the study is presented in Table 3.

Procedures

The procedures for sessions one and two were identical. Session three differed only in that a final debriefing was added during which the subjects were asked to reflect on some of their decisions. The task of the subjects in all sessions was to diagnose the SIMCASE, propose an initial remediation plan, and associate remedial and diagnostic statements.

Table 3: Profile of Participating Subjects

Teacher   Yrs. of Teaching   Most Recent Degree   Grades Taught     Current Responsibilities
1         10                 Ed. Specialist       Elem./Jr. High    Reading Consultant
2         18                 Masters              Elem./Jr. High    Reading Specialist
3         7                  Masters              Elementary        Supportive Reading
4         27                 Masters              Elementary        Remedial Reading
5         9                  Masters              Elem./Jr. High    Reading/Lang. Arts
6         4                  Masters              High School       Remedial Reading
7         8                  Masters              Primary           Reading Resource
8         9                  Masters              Primary           All Subjects

The sequence of instructions is detailed in Appendix B. Briefly:

The teacher was given an introduction about the study, followed by instructions regarding payment and granting informed consent. During the first session only, a background questionnaire was completed by the subject.

An overview of the three sessions was read and the subject, using a training SIMCASE, practiced using the cue inventory to request information about the case.

After the brief practice session, the teacher was shown the referral statement and sketch for her assigned case and listened to the taped interview. She was then allowed up to thirty minutes to request information using the cue inventory, taking notes if she wished. The cue requests were entered on a standard form together with time of request (Appendix C). The session was audio taped as well.

When the thirty minutes were over, the teacher was given up to twenty minutes to write her diagnosis, and up to another twenty minutes to write an initial remediation plan. Wide-lined paper was provided, and all notes made and cues collected were at hand while the diagnosis and remediation were being written.

After a short break, the subject worked with her written diagnosis according to the following protocol:

(a) Using a sample diagnosis as a guide (Appendix D), the subject was given an opportunity to practice circling, numbering, and coding diagnostic statements. The codes were: S for strength, W for weakness, and Obs. for observation.
(An observation was defined as a neutral statement which characterized the case, e.g., the child has two siblings.)

(b) After completing the practice sample, the subject repeated the task with her own diagnosis.

(c) Using a sample diagnostic checklist as a guide, the subject practiced the procedure for transferring the sample numbered, coded diagnostic statements to the checklist.

(d) After completing the practice, the subject repeated the task with her own diagnosis using a fresh diagnostic checklist (Appendix E).

The protocol for working with the written remediation paralleled that for the diagnosis with two differences:

(a) A sample remediation was used to practice circling, numbering, and coding remedial statements (Appendix F). The codes were: S for strength, W for weakness, Obs. for observation, and T for treatment.

(b) Numbered, coded remedial statements were transferred to a remedial checklist (Appendix G).

After the diagnosis and remediation checklists were completed, the clinician directed her attention to her written diagnosis while the experimenter read aloud to the subject each numbered remedial element in turn. For each one, the subject indicated which (if any) diagnostic elements she associated with the remedial element. She then read the number(s) and code(s) of each associated diagnostic element. The experimenter wrote down all associations on a standard form (Appendix H), which was formatted differently for use in Session 3 only (Appendix I). The session was audio taped. Unmatched remedial element numbers and unmatched diagnostic element numbers were entered on a separate form (Appendix J).

Sessions 1 and 2 ended at this point. Session 3, however, continued from this point, following a short break.

To complete session 3 the clinician was asked to reflect on the recommendation/diagnosis associations just completed.
Two questions were asked about each association: (1) Why did you choose this remediation? and (2) What is the association between the remediation and the diagnosis? The questions and alternative responses were listed on a guide sheet (Appendix K), and the experimenter noted responses on the remediation/diagnosis association form used in Session 3.

If any remedial elements were left unassociated, the subject was asked to look at the written remediation. The following statement was made by the experimenter for each remediation:

Remedial element ..... is not associated with a diagnostic element because .....

The statement and alternative responses were listed on a guide sheet and the experimenter noted responses on a standard form (Appendix L).

If any diagnostic elements were left unassociated, the subject was asked to look at her written diagnosis. The following statement was made by the experimenter for each diagnosis:

Diagnostic element number ..... is not associated with a remedial element because .....

The statement and alternative responses were listed on a guide sheet and the experimenter noted responses on a standard form (Appendix M).

The entire debriefing was audio taped and the subject was given the opportunity to comment on all responses made.

To close the session, the subject was asked to respond to three open-ended questions.

(a) What guides you in gathering information about a child and in determining what should be written in the diagnosis?

(b) What do you see as being the relationship between diagnosis and initial treatment plan?

(c) Do you have any comments about the session?

All responses were audio-taped.

Data Analysis

During the course of each of the 24 observational sessions, subjects collected case information (cues), wrote a diagnosis and remediation and transferred both to standardized checklists, and associated remedial with diagnostic statements.
These procedures resulted in five products being collected for each session: a cue list, diagnosis, remediation, remediated diagnoses, and specific remedial/diagnostic associations. These products were analyzed on a case-by-case basis (six sessions for each case).

The data were to be analyzed in order to determine:

1. Agreement among and between clinicians on
   (a) Diagnostic statements
   (b) Remedial statements
   (c) Information collected
   (d) Remediated diagnostic statements, and
   (e) Specific remedial/diagnostic associations;

2. Strength of correlations between various products; and

3. Any differences in performance between clinicians trained in different institutions.

The standardized checklists were the source of data for the analyses. They provided a common metric for making comparisons across sessions. The diagnostic and remedial checklists were derived from diagnostic and remedial statements made by reading specialists during the course of previous observational studies conducted by the Clinical Studies Group. These statements were categorized and redundancies eliminated. There were 101 statements in both lists, grouped into a number of categories. Each diagnostic statement could be classified as a strength, weakness, or observation. Each remedial statement could be classified as a strength, weakness, observation, or treatment.

Statements from the written diagnosis and remediation which subjects felt could not be matched to domain statements were written in, either under "Other" within one of the existing categories, if possible, or under a separate major category labelled "OTHER." Subjects were asked to make as limited use as possible of "Other/OTHER" statements. The formal analyses did not include these statements since they were not a part of the fixed diagnosis/remediation domains. "Other/OTHER" statements were analyzed separately.

The cue domain for each case was simply a sequentially numbered list of all the cues present in that case.
Each cue requested during a session was noted. In preparing data sheets, the appropriate domain number was placed next to the requested cue.

The agreement statistics were calculated using (1) a computer statistical analysis system (Observational Study Data Analysis System) developed and maintained by the Institute for Research on Teaching, Michigan State University, and (2) an interactive statistical analysis system (MIDAS) at the University of Michigan. A random sampling of data was computed by hand across numerous observational studies, and the system was found to be operating accurately.

Agreement Statistics

(1) Proportional Agreement

In order to determine the extent of agreement for each diagnosis/remediation/cue for a given case, a proportion was computed for each statement in the domain for diagnosis, remediation, and cues. If a statement received a value of zero (0.00), it was not mentioned during any of the six sessions for the case. If a statement received a value of one (1.00), it was mentioned during all six sessions.

For example, if the statement, "Basic Sight Words, Weakness," was mentioned in three of the six sessions, the proportional agreement would be:

P.A. = (No. of sessions during which the statement was made) / (Total no. of sessions)
P.A. = 3/6
P.A. = 0.50

(2) Commonality

The clinician's products in a session were compared with those from the other five sessions devoted to the case in order to assess how similar one clinician's performance was to that of the total group. Any statement made in two or more sessions (out of six possible) was included in the domain used for calculating commonality. For example, assume there were twenty statements in a diagnostic domain for a given case. A clinician who had mentioned fifteen of those statements in her individual diagnosis would have a higher commonality score than one who had mentioned only six of those twenty domain statements.
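The two statistics just described can be sketched in a few lines of code. This is a minimal illustration, not the Observational Study Data Analysis System itself: the function names are hypothetical, the session data are invented to match the worked examples in the text, and the commonality score is assumed to be the fraction of domain statements a clinician mentioned, since the text gives no explicit formula for it.

```python
def proportional_agreement(sessions, statement):
    """Share of sessions in which `statement` was mentioned."""
    return sum(statement in s for s in sessions) / len(sessions)


def commonality_domain(sessions, min_sessions=2):
    """Statements mentioned in at least `min_sessions` of the sessions."""
    all_statements = set().union(*sessions)
    return {st for st in all_statements
            if sum(st in s for s in sessions) >= min_sessions}


def commonality(session, domain):
    """Assumed score: fraction of the domain mentioned in one session."""
    return len(session & domain) / len(domain)


# Proportional agreement: the statement appears in three of six sessions.
target = "Basic Sight Words, Weakness"
sessions = [{target}, {target}, {target}, set(), set(), set()]
print(proportional_agreement(sessions, target))  # 0.5
print(commonality_domain(sessions) == {target})  # True

# Commonality: a twenty-statement domain, as in the example in the text.
domain = set(range(1, 21))
first_clinician = set(range(1, 16))   # mentions fifteen domain statements
second_clinician = set(range(1, 7))   # mentions six domain statements
print(commonality(first_clinician, domain))   # 0.75
print(commonality(second_clinician, domain))  # 0.3
```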
The performance of the first clinician would be more similar to that of the group than that of the second clinician.

(3) Intercorrelations

a) Inter-Clinician Correlation. The diagnostic/remedial/cue (Dx/Rx/Cx) categories mentioned by one clinician were compared with those mentioned by a second clinician for the same case. A Phi correlation, a measure of inter-clinician agreement, was computed for each pair of clinicians (Appendix N). The comparison is summarized in the grid below.

                               Clinician A, SIMCASE Q
                     PRESENT (+)                   ABSENT (-)

Clinician B,         A: Frequency count of         B: Frequency count of
SIMCASE Q            statements in the domain      statements in the domain
PRESENT (+)          present in both               present in clinician B's
                     clinicians' sessions          session but not in
                     (Diagnosis/Remediation/       clinician A's (Diagnosis/
                     Cues)                         Remediation/Cues)

Clinician B,         C: Frequency count of         D: Frequency count of
SIMCASE Q            statements in the domain      statements in the domain
ABSENT (-)           present in clinician A's      absent in both
                     session but not in B's        clinicians' sessions
                     (Diagnosis/Remediation/       (Diagnosis/Remediation/
                     Cues)                         Cues)

b) Intra-Clinician Correlation. The diagnostic/remedial/cue categories mentioned by the clinician at time one (T1) were compared with those mentioned by the same clinician for the replicate form of the same case at time two (T2). A Phi correlation, a measure of intra-clinician agreement, was computed for each pair of sessions for given clinicians (Appendix N). The comparison is summarized in the grid below.

The presence of a large percentage of statements (more than 85%) in the "D" cell (the statement is absent in both sessions) artificially inflated the intercorrelations since it represented, in effect, agreeing to disagree. A statistic developed by Professor A.
Porter (Institute for Research on Teaching, Michigan State University) was designed to correct for this occurrence by including in the computation only the values in the A, B, and C cells. All intercorrelations were calculated twice (including and excluding the D cell) so that the impact of cell size could be seen.

                               Clinician i, SIMCASE Q, Form One
                     PRESENT (+)                   ABSENT (-)

Clinician i,         A: Frequency count of         B: Frequency count of
SIMCASE Q,           statements in the domain      statements in the domain
Form Two             present in both sessions      present in SIMCASE form
PRESENT (+)          for form one and form two     two but not in SIMCASE
                     of SIMCASE                    form one

Clinician i,         C: Frequency count of         D: Absent in both
SIMCASE Q,           statements in the domain      sessions for form one
Form Two             present in the session        and form two of SIMCASE
ABSENT (-)           for SIMCASE form one but
                     not SIMCASE form two

Additional Analytic Procedures

Pearson Product Moment Correlations were computed for various pairs of products in order to determine whether performance on one product could be considered to be predictive of performance on another.

Finally, t-tests for differences between means were computed to establish whether the two groups of subjects, Illinois and Michigan, performed significantly differently.

Summary

The primary objective of this study was to establish the nature and extent of the relationship between diagnosis and remediation in reading. To obtain the necessary data, eight experienced, credentialled, practicing reading specialists, four from Illinois and four from Michigan, each participated in three sessions during which they performed certain tasks using simulated cases of reading difficulty. Two of the cases were equivalent forms of the same problem. The third represented a different problem.
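The intercorrelation statistics described under Agreement Statistics can be illustrated with a short sketch. Several things here are assumptions rather than the study's own procedure: the Phi coefficient is written in its standard 2x2 form, the corrected statistic is assumed to be A / (A + B + C) because the text says only that it uses the A, B, and C cells, and the two sessions are invented data over a 101-statement domain. All function names are hypothetical.

```python
import math


def two_by_two(first, second, domain):
    """Cell counts A, B, C, D for two sessions, following the grids above."""
    a = len(first & second & domain)    # present in both sessions
    b = len((second - first) & domain)  # present in the second session only
    c = len((first - second) & domain)  # present in the first session only
    d = len(domain - first - second)    # absent in both sessions
    return a, b, c, d


def phi(a, b, c, d):
    """Standard Phi coefficient for a 2x2 table of counts."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else float("nan")


def agreement_excluding_d(a, b, c):
    """Assumed form of the corrected statistic: A / (A + B + C)."""
    total = a + b + c
    return a / total if total else float("nan")


# Hypothetical sessions over a 101-statement checklist domain.
domain = set(range(1, 102))
first = set(range(1, 11))    # statements 1-10
second = set(range(6, 14))   # statements 6-13

a, b, c, d = two_by_two(first, second, domain)
print(a, b, c, d)                                # 5 3 5 88
print(round(phi(a, b, c, d), 2))                 # 0.52
print(round(agreement_excluding_d(a, b, c), 2))  # 0.38
```

In this invented example 88 of the 101 statements fall in the D cell, and the statistic that ignores the D cell is noticeably lower than Phi, which is the kind of inflation the text describes.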
The products that resulted from each session were analyzed to determine (1) extent of agreement among and between clinicians (proportional agreement, commonality, intercorrelations); (2) strength of correlations between products; and (3) any differences in performance between Illinois-trained and Michigan-trained subjects.

CHAPTER FOUR

RESULTS

Introduction

The major purpose of this study was to provide an objective measure of the nature and extent of the association between diagnosis and remediation in reading. Eight reading specialists working in a controlled setting provided the data to be described in this chapter. Specifically, in a total of twenty-four sessions they requested information from a simulated case of reading difficulty and:

1. wrote a diagnosis for a SIMCASE of reading difficulty;

2. wrote an initial remediation plan;

3. transferred diagnostic statements to a standardized diagnostic checklist and remedial statements to a standardized remedial checklist;

4. made associations between remedial and diagnostic statements;

5. reported on reasons for making the associations;

6. reported on reasons for the presence of unassociated diagnostic elements (if any) and unassociated remedial elements (if any); and

7. discussed their conduct of the diagnosis/remediation procedure.

The data analytic results will be reported first for the three kinds of agreement described in Chapter Three:

1. Proportional agreement (the number of times a statement was mentioned across the six sessions for each case);

2. Commonality (the extent to which an individual mentions the same statements as did the group for a given case); and

3. Intercorrelations (the extent of agreement on statements characterizing a given case by a clinician with herself and with another clinician).

Further results to be reported are Pearson Product Moment Correlations between products (e.g., between diagnosis and remediation, cues and diagnoses, etc.)
and t-tests for differences in mean performance between Illinois and Michigan subjects.

The reading specialists in this study examined four simulated cases of reading difficulty which presented information about real children with reading difficulties. The children's grade placement ranged from grades three to seven.

Case One was a third grader with (1) a significantly depressed sight vocabulary, (2) a serious problem with decoded word recognition skills, (3) inadequate fluent message segmentation, and (4) no significant problem comprehending anything he could read or listen to.

Case Two presented a sixth grader who exhibited (1) significantly retarded sight vocabulary and word recognition behaviors, (2) a significant problem with analysis on multisyllabic words coupled with poor visual segmentation and high frequency hearing loss, (3) difficulty phrasing material of higher level semantic content, and (4) comprehension problems related to the demands of higher level content-related materials.

Case Three presented a seventh grader with (1) reasonably intact sight vocabulary and word recognition performance, (2) significant problems with higher level decoding skills and their application during reading, (3) a serious problem with fluent texting of material, and (4) a comprehension problem specifically related to the meaning demands of content-related materials.

Case Four was a fourth grader who showed (1) a significantly depressed sight vocabulary, (2) a significant problem with the learning and application of decoded word recognition skills, (3) no major problem with text segmenting, and (4) no language-based comprehension problem.

In order to give the reader a sense of the data used for comparing clinician performance, several diagnoses and remediations written by clinical subjects in the study are presented for Case Three.
Clinician A's reliability (her agreement with herself on the same case) was in the lowest third of the range; clinician B's was in the highest. For convenience, their numbered diagnostic and remedial statements are presented serially instead of in the prose form in which they were actually written. The code letters assigned to each statement are: (s) for strength, (w) for weakness, (obs) for observation, and (t) for treatment. A master diagnosis and remediation plan compiled by a senior reading clinician on the faculty of Michigan State University is presented in Appendix O.

Diagnosis: Clinician A

1. He has poor study skills. (w)
2. His lack of concentration and interest could have led to his poor study skills. (w)
3. He is not well liked by students or adults. (w)
4. He lacks motor control. (w)
5. He has a history of speech problems. (w)
6. His oral reading was performed word by word. (w)
7. He repeated phrases. (w)
8. He also had irregular pauses. (w)
9. His tone was a sing-song type and sometimes choppy. (w)
10. This made a very unpleasant auditory interpretation of the selection. (obs)
11. His attitude towards reading aloud was negative. (w)
12. His comprehension during his oral performance was good. (s)
13. He did not encounter much difficulty until selection 8. (s)
14. His I.Q. was average. (s)
15. His silent reading was lacking for his score was a high fourth grade. (w)
16. His listening comprehension was at grade level. (s)
17. His word recognition was almost at grade level. (s)
18. His word analysis was almost at grade level. (s)
19. His spelling was a low third grade level. (w)
20. His spelling problems were in all three areas: syllabication, (w)
21. vowels, (w)
22. and consonant substitutions and omissions. (w)
23. His auditory test indicated some hearing loss. (w)
24. This may have led to some of his speech and spelling problems. (obs)
25. Even though he has poor motor skills, his main interest is sports. (obs)
26. His poor motor control could be the cause of his poor and irregular handwriting. (w)
27. His word analysis was nearly at grade level. (s)
28. This indicated he was better at auditorily producing the sounds rather than being able to write the letter names for the sounds. (obs)
29. He used beginning sound clues when encountering unknown words, (s)
30. but often missed medial and ending syllables. (w)

Remediation: Clinician A

1. First we are going to have to work on his attitude towards reading. (t)
2. Providing him with purpose and worth for reading will help. (t)
3. Because he had a desire to lead and win friendship, maybe some instruction on reading for information so he could converse and be interesting to others will help. (t)
4. Reading topics of his interest and others who are close to him will also help in providing a purpose for reading. (t)
5. Because of the unpleasant auditory interpretation of selections, he should also work on oral reading skills. (w)
6. Reading selections silently and thinking about the subject matter they contain may help. (t)
7. Listening to tapes and then following the listening session trying to imitate their expressions and phrasing would also help. (t)
8. Practicing a selection so as to be able to perform it as a dramatic reading may also help prevent his uninterested tone while reading aloud. (t)
9. More practice with silent reading would help. (t)
10. Questions preceding the silent reading will motivate him to be reading for information. (t)
11. Spelling was a severe deficiency. (w)
12. Cards containing the correct letters in misspelled words and spaces where the incorrect letters were would get him to think about the unknown parts of words rather than studying the whole word. (t)

Diagnosis: Clinician B

1. He is of average I.Q. (s)
2. He comes from a good home. (s)
3. He appears to have a normal sibling relationship. (s)
4. He suffers from both an auditory (w)
5. and visual problem. (w)
6. His auditory test showed a marked acuity problem in the left ear. (w)
7. His visual test showed that he was farsighted. (w)
8. Emotionally he seems to suffer from a school problem. (w)
9. His teacher felt that no one liked him, and she had very negative comments about him. (w)
10. He was aggressive and was generally found to be not too pleasant a person. (w)
11. The parents didn't see a social problem for him. (obs)
12. They were more concerned with a speech (obs)
13. and a motor-coordination problem. (obs)
14. They claimed he was shy which doesn't quite agree with his teacher's report. (obs)
15. He tested on the Durrell at mid high fourth grade on the oral (w)
16. and a high fourth on the silent portion. (w)
17. His comprehension oral was good. (s)
18. His written comprehension was good, and the examiner noted good recall. (s)
19. He frustrated at the sixth grade level. (w)
20. His rate lagged behind on both the Durrell and the Gates. (w)
21. His word recognition was excellent. (s)
22. He had some problems on base words. (w)
23. Otherwise, his recognition of blends (s)
24. and syllables was good. (s)
25. His main problem seems to be one of comprehension (w)
26. and rate. (w)
27. He could be a non-context reader as evidenced by the high vocabulary score on the Gates. (w)
28. He is not phrasing well. (w)
29. This could mean he is not reading for context. (w)

Remediation: Clinician B

1. In remediating this problem first get a complete visual (t)
2. and hearing test readministered. (t)
3. Check into his school situation (t)
4. and determine how he is doing socially with his peers. (obs)
5. School relationships are very important when we look at academic progress. (obs)
6. Also check the home situation and discuss the school problem with the parents. (t)
7. Remediation should center on comprehension. (t)
8. He is approximately two years below level (w)
9. but is not too deficient in word recognition skills. (s)
10. Since his recall was good at fourth grade level, (s)
11. I would work on higher level comprehension skills. (t)
12. Give him books of interest and at a readability he can cope with. (t)
13. Do not ignore word recognition but do not focus on it. (t)
14. He had some problems in structural analysis (w)
15. and in recognizing base words, (w)
16. so I would give him work with word parts. (t)
17. Blends could be worked on using various phonics exercises. (t)
18. I would teach him to use context. (t)
19. Give him exercises on reading for a purpose. (t)
20. His rate should improve as he begins reading for a purpose. (obs)
21. Rate must only improve with comprehension. (obs)
22. Various timed exercises with comprehension checks can be useful here. (t)

Proportional Agreement

(1) Diagnosis

Across the four cases, a total of 304 diagnostic statements was made. Analysis of these statements revealed that barely one-third were mentioned in two or more sessions, as shown in Table 4.

Table 4: Proportion of Diagnostic Statements Mentioned in Two or More Sessions

Case One     0.32
Case Two     0.42
Case Three   0.35
Case Four    0.33

The two sets of categories--those mentioned in two or more sessions and those mentioned only once--had different characteristics. The proportion of statements coded as being a weakness (as opposed to strength or observation) was substantially higher for the categories mentioned in two or more sessions than for those mentioned only once. Thus, the more idiosyncratic statements which made up the bulk of the diagnostic categories for all the cases tended to revolve more around statements of strengths and observations. Those statements made more often across sessions tended to focus more sharply on problematic aspects of the child's reading performance.
Table 5: Proportion of Weakness Statements in Set of Diagnostic Statements

             Mentioned Once   Mentioned in Two or More Sessions
Case One     0.32             0.66
Case Two     0.24             0.44
Case Three   0.49             0.69
Case Four    0.34             0.44

Looking across the four cases, 33 specific diagnostic statements were mentioned in two or more sessions. They clustered within nine categories as follows:

1. INTELLECTUAL POTENTIAL (verbal, nonverbal);

2. WORD RECOGNITION (general, basic sight words, use of letter order);

3. ORAL READING (general, rate, fluency, phrasing, punctuation, substitutions);

4. WORD ANALYSIS (general, use of letter/sound associations, syllables, root words, suffixes, ability to blend, use of blends, initial consonants);

5. HEARING (general, acuity, discrimination);

6. VISION (acuity);

7. SILENT READING (rate, comprehension);

8. COMPREHENSION (general, silent, oral, listening); and

9. AFFECTIVE (general, attitude, relationship with peers, emotional adjustment, home environment).

To summarize, about one-third of the diagnostic statements made were agreed upon as being characteristic of a given case. Those statements were more apt to be coded as being weaknesses rather than strengths or observations. Diagnostic statements mentioned only once were drawn from across the entire domain. They were more apt to be coded as strengths or observations (Appendix P). Across all four cases, combinations of 33 diagnostic statements were mentioned more than once (Table 6).

(2) Remediation

A total of 270 remedial statements was made across the four cases. Analysis of these statements showed that the number of remedial statements mentioned in two or more sessions as compared to the total number mentioned for each case was lower than that obtained for diagnostic categories.
Table 7: Proportion of Remedial Statements Mentioned in Two or More Sessions

Case One     0.34
Case Two     0.40
Case Three   0.15
Case Four    0.12

Table 6: Agreed-Upon Diagnostic Statements Across Cases

[The X marks showing the cases to which each statement applied are not legible in the source; only the statements are listed.]

Intellectual potential
Word recognition:   general; sight; reversals
Oral reading:       general; rate; punctuation; substitutions contextually acceptable; phrasing; fluency
Word analysis:      general; suffixes; letter/sound associations; initial consonants; blending ability; syllables; use of blends; use of root words
Hearing:            discrimination; general; acuity
Vision:             acuity
Silent reading:     rate; comprehension
Comprehension:      oral; general; silent; listening
Affective:          general; attitude toward reading; home environment; relationship with peers; adjustment

Again, as with the diagnoses, the two sets of categories--those mentioned in two or more sessions and those mentioned only once--had different characteristics. There were more treatment statements in the set of remedial statements made in two or more sessions than in the set of remedial statements made only once.

Table 8: Proportion of Treatment Statements in Set of Remedial Statements

             Mentioned Once   Mentioned in Two or More Sessions
Case One     0.34             0.63
Case Two     0.40             0.63
Case Three   0.49             1.00
Case Four    0.43             0.63

The idiosyncratic statements contained a large number of statements which were essentially diagnostic rather than remedial. That is, they were seen as strengths, weaknesses, or observations rather than as treatments. However, as discussed in Chapter Three, the remedial checklist reflects the way in which clinicians with similar training and experience wrote their remediations in previous studies. Those statements made in two or more sessions focused almost exclusively on treatment actions that needed to be taken.

Across the four cases, 27 specific remedial statements were mentioned in two or more sessions.
They clustered within seven categories:

1. SIGHT WORDS (identify high frequency sight words in context/isolation; build sight vocabulary; use in language experience stories; use sight words, phrases);
2. STRUCTURAL ANALYSIS (syllables general; count number of syllables seen and heard; add prefixes/suffixes to root words; identify root words);
3. PHONETIC ANALYSIS (general; drill on short/long vowel generalizations; blend sound units into words; apply phonics to attack unknown words);
4. ORAL READING (improve rate/phrasing);
5. VISUAL (memory general; discriminate visually similar words);
6. COMPREHENSION (general; information gathering general; use context);
7. MOTIVATION (provide high interest materials; involve student in setting own purpose).

To summarize, a small number of remedial statements was agreed upon as being appropriate to a given case. These statements were almost always coded as being treatments rather than strengths, weaknesses, or observations. In contrast, those remedial statements mentioned only once were drawn from across the entire domain and were more likely to be coded as strengths, weaknesses, or observations, rather than as treatments (Appendix P). Across all four cases, combinations of 27 remedial statements were mentioned more than once (Table 9).

Table 9: Agreed-Upon Remedial Statements Across Cases
[The X marks showing which cases each statement applied to are not legible in the source; the statement rows follow.]

Sight words: identify high frequency (context); build sight vocabulary; general; use in language experience; use sight words/phrases
Oral reading: improve rate; general; improve fluency
Phonetic analysis: general; apply phonics to attack new words; drill on short/long vowel generalizations; blending, general; blend sound units
Structural analysis: general; recognize common word parts; add prefixes/suffixes; count syllables; identify root words; teach syllables
Vision: discriminate visually similar words
Comprehension: general; information gathering; demonstrate factual recall; use context as aid in comprehension
Motivation: general; setting own purpose; provide high interest materials

Impact of additional "Other/OTHER Statements" in the remedial and diagnostic checklists. Both the diagnostic and remedial checklists allowed for the addition of "Other/OTHER" statements. These statements were not part of the standardized domains for diagnosis and remediation and were not included in any of the product analyses reported. The proportion of these statements to those that fit the checklist is shown in Table 10.

Table 10: Proportion of "Other/OTHER" Statements in Set of Checklist Statements

            Diagnosis    Remediation
Case 1        0.04          0.11
Case 2        0.01          0.07
Case 3        0.09          0.25
Case 4        0.06          0.15

The additional statements in the remedial checklists were examined first in order to determine if they differed in any substantive way from the standardized statements. They did not, as the following analysis indicates.

1. The proportion of treatment statements to statements of strength, weakness, and observation was the same for both the additional and standardized statements;
2. The additional statements were distributed across the same categories as the standardized statements; and
3. All the additional statements were mentioned only once, reflecting the largely idiosyncratic nature of the remediations in general.

The additional statements in the diagnostic checklist also did not differ in any substantive way from the standardized statements.

1. The proportion of weakness statements to statements of strength and observation was the same for the additional and standardized statements;
2. The additional statements were distributed across the same categories as the standardized statements; and
3. The additional statements were all idiosyncratic.
It should be noted also that the existing major categories in both checklists accommodated almost all of the additional statements. Only a very small number fell wholly outside the framework of either checklist's categories. "OTHER" statements made for the diagnostic checklist were (1) spelling, (2) handwriting, (3) writing vowel sounds, (4) writing consonant sounds, and (5) writing syllables. All were coded as a weakness. "OTHER" statements made for the remedial checklist were (1) vocabulary meaning in context, (2) write daily log, (3) silent reading, (4) emotional, (5) finger pointing, (6) lip reading, (7) peer relationships, and (8) parent educational background. All but the last two statements were coded as treatment.

(3) Cues

The domains for diagnosis and remediation are all-inclusive, i.e., each domain serves all four cases. In contrast, each case has its own cue domain, with the size of the domains ranging from 83 to 104 items. The number of cues requested in two or more sessions as compared to the total number requested for each case was relatively high.

Table 11: Proportion of Cues Mentioned in Two or More Sessions

Case 1    0.79
Case 2    0.71
Case 3    0.63
Case 4    0.68

As with the diagnoses and remediations, the two sets of cue categories--those requested in two or more sessions and those requested only once--had different characteristics. The least agreed-upon cue forms (mentioned only once) were those that presented information in the relatively high inference form of test booklets and audio tapes, which required that the subjects make judgments about the child's problems based on an examination of the child's performance on the protocols.
Table 12: Proportion of Requests for Test Booklets/Audio Tapes

            Requested Once    Requested in Two or More Sessions
Case 1          0.54                      0.14
Case 2          0.50                      0.27
Case 3          0.31                      0.11
Case 4          0.37                      0.20

The most agreed-upon cue forms were those that provided information in the relatively low inference form of test scores and comments (Appendix P). The subjects simply had to note the score or read the comment. The test scores and comments requested in two or more sessions dealt with:

1. Background (home/school);
2. Cognitive ability (Wechsler, Slosson, Peabody);
3. Diagnostic tests of reading ability including oral reading, word analysis, silent reading, visual memory, etc. (Durrell, Gates-McKillop, Individual Reading Analysis);
4. Graded word lists (Slosson Oral Reading Test);
5. Basic sight vocabulary (Dolch);
6. Reading achievement tests (Gates-MacGinitie);
7. Visual/auditory reports; and
8. Auditory discrimination (Wepman).

Some clinicians collected many items of information within each category; others collected very few items. Thoroughness of collection had minimal effect on diagnostic reliability, however. The clinician who collected the fewest items of information (mean = 15) had zero diagnostic reliability. The clinician who collected the most items (mean = 44) had a diagnostic reliability of only 0.12.

The categories of cues were collected consistently across cases. However, agreement within cases on specific diagnostic and remedial statements which characterized that case was extremely limited. For a given case, then, the interpretation of commonly selected cues appeared to be highly idiosyncratic, with little agreement as to what to do with the data once it had been collected.

In summary, the most frequently requested cues were in the low inference form (i.e., the child's work itself was not available for inspection) of test scores and comments dealing with word recognition and analysis, cognitive ability, oral and silent reading, hearing, vision, and background.
The most frequently mentioned diagnoses and remediations mirrored these cues, with the interesting exception of background information, which did not seem to have any impact across sessions. It appears, then, that across sessions, common cues resulted in largely non-common diagnoses (primarily problems) and non-common remediations (primarily treatments), indicating a lack of mutually agreed-upon interpretations of common information sources.

(4) Remediated Diagnoses

Not all diagnostic statements were remediated, as Table 13 illustrates. On the average, roughly two-thirds of the diagnostic statements made across cases were remediated. Diagnostic statements coded as weaknesses were more likely to be given remediation than those coded as either strengths or observations. This held both for remediated diagnoses mentioned once and for those mentioned in two or more sessions (Appendix P). As with the diagnostic and remedial statements, those diagnoses seen as requiring treatment were usually not agreed upon across two or more sessions.

Table 13: Proportion of Remediated Diagnoses to Total Diagnostic Statements

            Total Number of    Number of Diagnoses    Proportion of Diagnostic
            Diagnostic         that Were Linked to    Statements that Were
            Statements Made    a Remediation          Remediated
Case 1           74                  45                     0.60
Case 2           64                  45                     0.70
Case 3           91                  58                     0.50
Case 4           75                  51                     0.66

It should be noted that, for all cases, it was a subset of the more agreed-upon diagnoses that were chosen for remediation. That is, the clinicians did not remediate idiosyncratic diagnostic statements. Rather, they remediated only those diagnostic statements that analysis showed to have been repeated in two or more sessions and which were shown to be grouped under nine categories.
Table 14: Proportion of Remediated Diagnoses Mentioned in Two or More Sessions

Case 1    0.29
Case 2    0.37
Case 3    0.17
Case 4    0.41

(5) Specific Remedial/Diagnostic Associations

Across the four cases, only 31 specific remedial/diagnostic associations (out of a total of 654 made) were repeated across two or more sessions. The proportion of agreed-upon associations to all associations made is shown in Table 15.

Table 15: Proportion of Agreed-Upon Remedial/Diagnostic Associations (Two or More Sessions)

Case 1    0.02
Case 2    0.06
Case 3    0.02
Case 4    0.05

Put another way, from 94 percent to 98 percent of all specific remedial/diagnostic associations were idiosyncratic to a given session, that is, they were mentioned only once. The few associations that were made more than once were more likely to pair treatment with weakness than were the idiosyncratic remedial/diagnostic associations (Appendix P). Table 16 shows the proportion of treatment/weakness associations for single sessions and for two or more sessions.

Table 16: Proportion of Treatment/Weakness Associations in Set of All Remedial/Diagnostic Associations

            Proportion of Treatment/      Proportion of Treatment/
            Weakness Associations         Weakness Associations
            Mentioned During One          Mentioned During Two
            Session Only                  or More Sessions
Case 1           0.40                          1.00
Case 2           0.31                          0.50
Case 3           0.43                          1.00
Case 4           0.41                          0.66

An average of 17 percent of the associations made during a given session paired a single remediation with more than one diagnostic statement. A treatment would be paired with different weaknesses or with strengths and observations as well as weaknesses. For example, one remedial statement coded T (the student requires instruction in a systematic phonics program) was associated with three diagnostic statements coded, respectively, W, S, and Obs. (The student is practically a non-speller. He can hear sounds in words. He needs instruction in word analysis.)
The effect of this multiple pairing was to generate a large number of associations for each session. As was shown, virtually none of them was ever repeated in another session.

The results of the analyses for proportional agreement can now be summarized.

1. A small proportion of the total diagnostic and remedial statements mentioned for a case were repeated across two or more sessions. Most statements were idiosyncratic, that is, they were mentioned in only one session.
2. Diagnostic statements coded as "Weakness" and remedial statements coded as "Treatment" were more likely to be agreed upon across sessions than were diagnostic statements of strength and observation and remedial statements of strength, weakness, and observation.
3. Roughly two-thirds of all diagnostic statements were remediated. However, of these, only a small percentage were agreed upon across two or more sessions. Diagnoses that were remediated were almost always coded as a weakness.
4. There was considerably more agreement on which cues should be collected than on the diagnoses and remediations which resulted from examination of these cues.
5. Associations between remediation and diagnosis mentioned two or more times, though very few in number, were more likely to pair treatment with weakness than were remediation/diagnosis associations mentioned only once.

The data partially supported two hypotheses:

1. Statements of weakness were repeated more reliably across sessions than were all diagnostic categories; and
2. Statements of weakness were more often associated with treatments across sessions.

The hypothesis dealing with remediated diagnostic statements--that they are repeated more reliably across sessions than are all diagnostic statements--was not supported by the data. While remediated diagnoses were more likely to be categorized as a weakness, agreement as to specifically which ones would be remediated was lower than agreement for all diagnostic categories.
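The proportional agreement measure summarized above can be sketched in a few lines. This is only an illustrative sketch, not the procedure actually used in the study: it assumes each session's product is a set of coded statements and that the denominator is the number of distinct statements mentioned for the case. The function name and the sample statements are hypothetical.

```python
from collections import Counter


def proportional_agreement(sessions, min_sessions=2):
    """Share of distinct statements mentioned in at least `min_sessions` sessions.

    `sessions` is a list of collections of statements, one per session.
    Assumption: a statement counts once per session, however often it recurs.
    """
    # Count the number of sessions in which each distinct statement appears.
    counts = Counter(stmt for session in sessions for stmt in set(session))
    if not counts:
        return 0.0
    agreed = sum(1 for n in counts.values() if n >= min_sessions)
    return agreed / len(counts)


# Hypothetical products from three sessions on the same case.
sessions = [
    {"oral reading: rate (W)", "hearing: acuity (W)", "vision: acuity (Obs)"},
    {"oral reading: rate (W)", "word analysis: general (W)"},
    {"oral reading: rate (W)", "hearing: acuity (W)", "home environment (Obs)"},
]

two_or_more = proportional_agreement(sessions)        # 2 of 5 distinct statements
three_or_more = proportional_agreement(sessions, 3)   # 1 of 5 distinct statements
print(two_or_more, three_or_more)
```

Raising `min_sessions` from two to three shows, on a small scale, how quickly agreement shrinks when the standard for reliability is tightened.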
As the following summary figure and tables show, most statements were made only once. Even setting the most generous standard for reliability (statement is mentioned in two of the six sessions) resulted in agreement only on one-third or fewer of the total statements made. When the standard for reliability is changed to reflect those statements made in three or more of the sessions, agreement plunges precipitously (Tables 18, 19). Idiosyncratic statements, therefore, resulted in products that were much longer than was necessary for reliable judgments.

A number of possible sources for the length of individual products was hypothesized: (1) diagnostic statements of strength and observation tended to be more idiosyncratic than were statements of weakness; (2) remedial statements of strength, weakness, and observation tended to be more idiosyncratic than were statements of treatment; (3) redundant associations, and associations which paired diagnostic statements other than "weakness" with remedial statements other than "treatment," combined to cause an inflated number of associations; (4) highly idiosyncratic and inconsistent interpretations of background information, test booklets, and audio tapes acted as distractors to the main task of isolating weaknesses and specifying treatments. Use of these materials was shown to be strongly related to both (1) increased use of strength and observation statements within sessions and (2) decreased agreement of statements across sessions.

The data on proportional agreement are summarized in Figure 1 and Tables 17, 18, and 19.

Figure 1: Proportional Agreement, All Products. Axis label: Proportion of Statements in Each Frequency. Legend: Diagnosis = square; Remediation = X; Cue = triangle; Remediated Diagnoses = Z; Remediation/Diagnosis Associations = octagon.
[Plot not reproduced.]

Table 17: Proportions of Idiosyncratic and Agreed-Upon Statements Across Products
[Table body not legible in source.]

Table 18: Summary Table: Proportion of Statements for Each Session Across Products
[Table body not legible in source.]

Table 19: Total Number of Statements for Each Session Across Products
[Table body not legible in source.]

Characteristics of Unassociated Remedial and Diagnostic Statements

The results of the pilot study (Chapter Two) indicated that clinicians did not always make associations between diagnostic and remedial statements. The inconsistency was hypothesized to arise from two sources:

1. If a problem in the diagnosis was not associated with any remediation, some possible explanations were that the clinician forgot to include a treatment for the problem and/or the clinician did not know of any appropriate treatment and/or treatment was not called for.
2. If a treatment in the remediation was not associated with any problem in the diagnosis, some possible explanations were that the clinician never articulated the problem in the diagnosis and/or always used this treatment irrespective of diagnosis.
The clinicians in this study had unassociated diagnostic and remedial statements. Across all cases, unmatched diagnostic statements were much more likely to be coded as strengths and observations rather than as weaknesses. This complemented earlier findings that it was statements of weaknesses that were repeated more frequently across sessions, were remediated more frequently, and were associated more frequently with treatments than were either statements of strengths or observations.

Table 20: Unmatched Diagnostic Statements

            Number of Unmatched        Number of Unmatched
            Strength/Observation       Weakness Statements
            Statements
Case 1           39                         13
Case 2           32                         [illegible]
Case 3           40                         37
Case 4           35                          6

During the debriefing at the end of session three, the subjects were asked about the diagnostic statements which were not associated with a remediation. They were given a guide sheet to help them reflect on the question as follows:

This diagnostic element is not associated with a remediation because:
a. The diagnostic element cannot be remediated;
b. The diagnostic element does not require remediations;
c. I don't know a reason for this diagnostic element;
d. I forgot to include the remediation in my write up;
e. Other.

Across all the cases, strengths were not associated with remediations almost entirely because of reasons stated in statements a and b above: the diagnostic statement couldn't be remediated or didn't need one. Only twice was an unassociated strength attributed to d above, that is, the clinician had intended to associate it with a remediation but forgot to write the remediation in the initial plan.

Statements of observations also were not remediated because of reasons stated in statements b and d above. Only once was an unassociated observation attributed to c above: that is, a remediation was seen as necessary but the subject did not know of one.
Weaknesses were not associated with remediations almost exclusively because of reasons stated in c and d above: the subject knew no remediation or forgot to include the remediation in the write-up. In two cases the subject felt that the diagnostic element could not be remediated (a above).

In summary, some diagnostic statements of strength, weakness, and observation in this study were always left unassociated with remediation. Statements of strength and observation were unassociated primarily because they could not be remediated or did not require remediation. Unassociated statements of weakness were left unmatched primarily because the subject meant to provide a remediation but forgot to write it down, and secondarily because the subject simply did not know a remediation for the diagnostic element.

Turning to the characteristics of unassociated remedial statements, the data showed that treatment statements made up by far the major portion of all remediations. Not surprisingly, then, almost all unassociated remediations were statements of treatment. Additionally, the design of the study was such that diagnoses had to be matched to each remediation in turn, making it less likely that any given remediation would go unnoticed.

Total Unmatched Remedial Statements (Out of 260 Statements)

Treatment   = 26
Strength    = 2
Weakness    = 3
Observation = [not legible in source]

During the debriefing at the end of session three, the subjects were asked to consider why these remedial statements were not associated with a diagnostic statement. They were given a guide sheet as follows:

a. This remediation should always be included no matter what the specific diagnosis may be;
b. The remediation is always appropriate for this type of student;
c. I forgot to include the diagnostic element in the write-up;
d. Other.

Treatments were not associated with diagnoses for only two reported reasons: (a) the remediation should always be included and (b) it is always appropriate.
In these relatively few instances of unassociated treatments, then, a diagnosis would actually be unnecessary. Irrespective of diagnostic findings, these treatments were seen as always being necessary and appropriate for children with reading difficulty.

The two sources hypothesized to account for inconsistency in associating initial remediation plan with diagnosis (Chapter One) were supported by the data.

1. Statements in the diagnosis were not associated with any treatment because
   a. they could not be remediated,
   b. they did not require remediation,
   c. the clinician meant to provide a remediation but forgot to write it down, and
   d. the clinician knew no remediation for the diagnostic statement.
2. Statements in the remediation were not associated with any diagnostic statements because the clinician believed that the remediation should always be included.

Characteristics of Associated Remedial and Diagnostic Statements

Most remedial statements were associated with diagnostic statements. During the debriefing at the end of session three, the subjects were asked (1) why they chose the particular remediation they had written down, and (2) what they saw as the association between that remediation and the diagnoses which they associated with it. They used a guide sheet as follows:

1. Why did you choose this remediation?
   a. I am most familiar with it
   b. It is usually available
   c. I usually have success with it
   d. Other
2. What is the association between remediation and the diagnosis?
   a. The remediation is effective for this weakness
   b. The remediation is effective for this strength
   c. The remediation is effective for observations of nature
   d. Other

In answering question one ("Why did you choose this remediation?") the subjects indicated that for those remediations coded as treatments, they chose the remediation in question primarily because it was usually available and usually effective, and secondarily because of familiarity.
In other words, the teachers may have been familiar with a wide range of remedial materials and techniques, but they reported usage on the basis of availability and successful prior use.

For non-treatment remediations (strength, weakness, or observation) the subjects had to use the "other" category to explain their choice. Typical comments concerning a non-treatment remediation were, "It's not a remediation, it's an observation (or strength, or weakness)." Some variations were, "I'm not suggesting a treatment, I'm commenting on a weakness;" or, "I'm just noting scores (or comments, or background data, etc.)."

It appeared from these limited debriefing data that the subjects saw a remedial plan as having two categories of statements: remediations (i.e., treatments) and other statements whose purpose seemed to be to reiterate statements made in the diagnosis. These non-treatment statements, as had been demonstrated earlier, were almost always idiosyncratic in nature and served to isolate each session's product from all the others.

In answering question two ("What is the association between remediation and the diagnosis?") on the guide sheet, the subjects indicated that certain treatments were chosen because they were believed to be effective for certain weaknesses. Non-treatment statements fell into the "other" category, as was the case for the first question.

Commonality

For all statements mentioned in two or more sessions, a given clinician's product for a given case included, on the average, roughly half of the statements/associations mentioned two or more times for the same case by the other clinicians (Table 21). There was considerable variation among clinicians for a given product, with some sharing only a few statements or associations with the group and others sharing more.
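One way to picture a commonality score is sketched below. The exact formula used in the study is not given in this section, so the definition here is an assumption: the agreed-upon pool for a case is taken to be every statement mentioned in two or more sessions, and a session's score is the share of that pool its product contains. The function name and sample statements are hypothetical.

```python
from collections import Counter


def commonality_scores(sessions, min_sessions=2):
    """Per-session share of the agreed-upon pool contained in that session.

    Assumption: the pool is built from all sessions for the case, and a
    statement enters it when it appears in at least `min_sessions` sessions.
    """
    counts = Counter(stmt for session in sessions for stmt in set(session))
    agreed = {stmt for stmt, n in counts.items() if n >= min_sessions}
    if not agreed:
        # No statement was repeated, so no session shares anything.
        return [0.0 for _ in sessions]
    return [len(set(session) & agreed) / len(agreed) for session in sessions]


# Hypothetical products: statements A and B are repeated, C/D/E are idiosyncratic.
case_sessions = [
    {"A", "B", "C"},
    {"A", "D"},
    {"A", "B", "E"},
]

scores = commonality_scores(case_sessions)
print(scores)                      # one score per session
print(sum(scores) / len(scores))   # mean commonality for the case
```

Because the score is computed only over the repeated statements, a session can score well here while most of its product is idiosyncratic.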
The key to putting commonality outcomes into proper perspective was the fact that, with the exception of cues, idiosyncratic statements (statements made during one session) for all products far outnumbered statements mentioned two or more times. Thus, similarity of performance, while appearing to be relatively high, was calculated on the basis of a restricted set of statements (those mentioned in two or more sessions). What was obscured was the fact that at most only one-third of a given product (excluding cues) was even eligible for inclusion in the commonality measurement.

The commonality statistic indicated once again that only by aggregating products across sessions could the outlines of even a meager consensus on each case be revealed. The individual session products were not only largely idiosyncratic in content, they were also idiosyncratic in terms of format. There was no standard method used to present the information. Direct verbal comparison was thus virtually impossible.

Table 21: Mean Commonality Scores for All Products
[Table body not legible in source.]

Intercorrelations

According to the data shown in Table 22, the likelihood of any two clinicians agreeing on anything about a given case (excluding cues) decreased across products. At best, mean agreement was 14 percent of the combined statements (see Diagnosis column). At worst, mean agreement was one percent (see Remedial/Diagnostic column). The effect of idiosyncratic statements (which was somewhat controlled for when products were aggregated across sessions) was highlighted in the results of the intercorrelations.

A partial intercorrelation grid for the diagnoses of clinicians A and B is presented in Table 23. The clinicians agreed on seven percent of their combined statements.

Table 24 shows that the likelihood of clinicians agreeing with themselves on statements made about a case and its replicate (excluding cues) was only slightly more favorable than for two separate clinicians. At best, mean agreement was 23 percent of their combined statements (see Diagnostic column and Remedial column). At worst, mean agreement, again, was one percent (Remedial/Diagnostic column). That is, they made wholly dissimilar statements during the two sessions. The impact of the idiosyncratic statements virtually overwhelmed the few points of agreement that did exist.

A partial intracorrelation grid for the diagnoses of clinician A is presented in Table 25. The clinician agreed with herself on 20 percent (13 percent by the Porter statistic) of the combined statements made in sessions one and three.

The sample matrices for inter- and intra-correlations illustrate sharply the very limited agreement within and between clinicians on a case. Individual differences in performance did exist, however (Appendix P), with the widest variation shown for cues (0.07 to 0.77).
However, wide individual differences in cues did not lead to equally wide ranges in performance on other products. Using the Porter statistic, the variability of diagnostic agreement and remedial agreement ranged from 0.00 to 0.29, of agreement on remediated diagnoses from 0.00 to 0.22, and on specific remedial/diagnostic associations from 0.00 to 0.08. Clinicians who were highly reliable in cue collection, then, were not able to translate that into high reliability on other products. At best, the clinicians agreed (two or more sessions) on 29 percent of the total statements for the same case. At worst, on none of the statements. Therefore, the low mean intercorrelations shown in Tables 22 and 24 were not an artifact of some very high agreement levels being masked by low ones. Even at its best, intra- and inter-clinician reliability was markedly inadequate.

Table 22: Mean Inter-clinician Correlation
[Table body not legible in source. Numbers in parentheses were calculated using the Porter statistic.]

Table 23: Partial Intercorrelation Matrix for Diagnosis
Case 3, Clinicians A and B
Phi = .07    Porter = .07

STATEMENT ++ (Present in Both Clinicians A and B)
   Hearing: Acuity (W)
   Speech articulation (W)
   Intellectual potential (S)

STATEMENT +- (Present in Clinician A, Absent in Clinician B)
   Vision: General (Obs)
   Vision: Acuity (W)
   Hearing: General (Obs)
   Affective: General (Obs)
   Emotional adjust (W)
   Parent education background (Obs)
   Parent attitude toward school (W)
   Parent-child relation (Obs)
   Sibling relationship (S)
   Comprehension: General (W)
   Comprehension: Oral (W)
   Comprehension: Silent (W)
   Comprehension: Vocabulary (W)
   Relationship with peers (W)
   Use of blends (W, Obs)
   Hearing acuity (Obs)

STATEMENT -+ (Absent in Clinician A, Present in Clinician B)
   Speech articulation (Obs)
   Motor coordination (W, Obs)
   Attitude toward reading instructional (W)
   Attitude toward reading independent (W)
   Ability to apply reading skills (W)
   Oral reading: General (Obs)
   Oral reading: Rate (W)
   Oral reading: Self correction (W)
   Oral reading: Phrasing (W)
   Oral reading: Intonation (W)
   Silent reading: General (W)
   Word analysis: General (W)
   Phonetic analysis: General (Obs)
   Use of initial consonant sounds (S)
   Use of syllables (W)
   Comprehension: Listening (S)
   Relationship with peers (W)
   Comprehension: Oral (S)
   Word recognition: General (S)

STATEMENT -- (Absent in Both Clinicians A and B)
   263 Domain statements excluded

Table 24: Mean Intra-clinician Correlation
[Table body not legible in source. Numbers in parentheses were calculated using the Porter statistic.]

Table 25: Partial Intracorrelation Matrix for Diagnosis
Case 3, Clinician A, Time One, Clinician A, Time Two
Phi = .20    Porter = .13

STATEMENT ++ (Present at Time One and Time Two)
   Motor coordination (W)
   Intellectual potential: General (S)
   Oral reading: Phrasing (W)
   Oral reading: Intonation (W)

STATEMENT +- (Present at Time One, Absent at Time Two)
   Attitude toward reading: Independent (Obs)
   Motivation for reading (Obs)
   Emotional adjustment (W)
   Substitutions contextually acceptable (W)
   Silent reading comprehension (W)
   Word analysis (W)
   Phonetic analysis (W)
   Comprehension vocabulary (Obs)

STATEMENT -+ (Absent at Time One, Present at Time Two)
   Motor coordination (Obs)
   Hearing acuity (W)
   Speech articulation (W, Obs)
   Attitude toward reading: Independent (W)
   Attitude toward reading: Instructional (W)
   Relationship to peers (W)
   Ability to apply reading skills (W)
   Oral reading: General (Obs)
   Oral reading: Rate (W)
   Oral reading: Self-correction (W)
   Silent reading: General (W)
   Word analysis: General (S)
   Phonetic analysis: General (Obs)
   Use of initial consonant sounds (S)
   Use of syllables (S)
   Word recognition: General (S)
   Comprehension: Oral (S)
   Comprehension: Listening (S)

STATEMENT -- (Absent Both at Time One and Time Two)
   272 Domain statements excluded

Comparison of Michigan and Illinois Subjects on All Products

The design of the study allowed for the comparison of the two groups of subjects: those who received their training entirely in Michigan and those who received their training entirely in Illinois. Five t-tests for differences between mean performance on a case and its replicate (intra-clinician correlation) were computed to test the hypothesis that there was no difference between the groups in their performance on the five products. Using an alpha of .05, none of the five tests reached significance:

Diagnosis = 0.84
Remediation = 0.18
Cues = 0.07
Remediated Diagnoses = 0.38
Remediation/Diagnosis Associations = 0.31

There was no statistically significant difference in performance between subjects trained in Michigan and those trained in Illinois. However, the results were not uniform. The performance on cues approached significance. The two groups appeared to have been trained differentially with respect to sensitivity to certain cue forms: lower inference forms (test scores and comments) and higher inference forms (test protocols and audio recordings).
The performance on diagnostic statements, conversely, indicated that there was almost no difference whatever between the two groups on that dimension.

The Michigan subjects requested a total of 231 cues, 99 of which (0.42) were in the form of booklets and tapes. The Illinois subjects requested a total of 419 cues, only 38 of which (0.09) were in the form of booklets and tapes. The two groups appeared to have been trained differently with respect to what was considered the most fruitful way to gather diagnostic information, but the differing methods did not translate into differing diagnostic performance. (Both groups examined Background Information to a similar extent: Michigan = 32 requests; Illinois = 44 requests.)

As had already been demonstrated, the most agreed-upon cues (two or more sessions for a given case) were those that presented information as test scores and comments. This implied that, for the Michigan group, the examination of test booklets and audio tapes resulted in a high degree of idiosyncratic interpretation. For the Illinois group, the implication was that a large volume of information dealing with scores and comments was collected, also resulting in idiosyncratic interpretation.

Even though the Illinois subjects had a substantially higher level of intra-clinician agreement on which cues should be collected (Illinois: mean = 0.56; Michigan: mean = 0.27), they still were not able to translate that agreement into diagnostic performance.

Correlational Data

(1) By Individual

The correlations were computed using mean intra-clinician figures compiled for each of the clinicians on each of the five products. The square of the value for the correlation (r²), not the correlation coefficient per se (positive or negative), gives us information as to how well one variable can be predicted from knowledge of the other. Following are the correlational values computed for various sets of products.

(1) Cues and diagnosis (r = 0.27, r² = 0.07). We have seen that diagnoses were largely idiosyncratic, that large numbers of cues were collected, and that cues were interpreted in a variety of ways. Therefore, very little of a diagnosis could be predicted on the basis of knowledge of cues collected.

(2) Diagnosis and remediation (r = 0.10, r² = 0.01). Very simply, for a given clinician, on the average, the remediation could not be predicted on the basis of knowledge of the diagnosis. This refuted a major conclusion of the pilot study; namely, that diagnosis and remediation at the clinician level were moderately correlated.

(3) Cues and remediation (r = 0.25, r² = 0.06). As with diagnosis, very little of the remediation could be predicted on the basis of knowledge of cues collected.

(4) Remediated diagnosis and remediation (r = 0.25, r² = 0.002). Knowledge of the diagnostic statements that were designated as requiring remediation provided no predictive power as to the specific remediations that would be suggested when performance was averaged across two sessions for the same case.

(2) By Group

The correlations were computed using mean commonality figures compiled for each case (Table 22). Commonality scores were used to approximate group performance on all products in order to see whether group performance was more predictable than individual performance. Predictive capability was found to be low even at this aggregate level.

1. Cues and diagnosis: (r = 0.14, r² = 0.02)
2. Diagnosis and remediation: (r = -0.58, r² = 0.33)
3. Cues and remediation: (r = -0.18, r² = 0.03)
4. Remediated diagnosis and remediation: (r = -0.43, r² = 0.18)

Again, knowledge of cues collected provided no grounds for predicting either diagnosis or remediation. However, some of the variation in agreed-upon remediation could be accounted for by knowledge of the diagnosis and the remediated diagnosis.
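Since the argument rests on the squared correlation rather than on r itself, the arithmetic for the group-level figures can be sketched as follows. The dictionary and function names are illustrative and not taken from the study; the r values are the four group-level correlations listed above, and the squared value is read, as in the text, as the share of variation in one product that knowledge of the other accounts for.

```python
def variance_explained(r):
    """Squared correlation: the proportion of variance accounted for."""
    return r * r


# Group-level correlations as reported in the text above.
group_level_r = {
    "cues and diagnosis": 0.14,
    "diagnosis and remediation": -0.58,
    "cues and remediation": -0.18,
    "remediated diagnosis and remediation": -0.43,
}

group_level_r2 = {
    pair: variance_explained(r) for pair, r in group_level_r.items()
}

for pair, r2 in group_level_r2.items():
    print(f"{pair}: r = {group_level_r[pair]:+.2f}, r squared = {r2:.2f}")
```

The sign of r drops out on squaring, which is why a correlation of -0.58 between diagnosis and remediation still accounts for roughly a third of the variation, while the cue correlations account for almost none.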
Clinician Self-Reports

At the close of the third session, each clinician was asked to self-report on how she normally went about gathering information, writing a diagnosis, and relating the initial remediation plan to the diagnosis. The clinicians reported collecting many of the same types of information, e.g., on intelligence scores, comprehension, word attack and recognition skills, etc. Further, they shared very similar procedures for writing their diagnoses. They said that they organized their diagnoses (1) in the order in which they collected information from the children, or (2) by grouping their findings by strengths and weaknesses, or (3) by listing findings according to areas of need, together with supporting information. Finally, they held similar views as to the relationship between diagnosis and remediation. They asserted that (1) there should be a direct one-to-one relationship between the two, although there will be some things that cannot be remediated; (2) weaknesses should be treated by using strengths; and (3) weaknesses should be prescribed for directly.

The subjects' responses to all three questions were prompt and very similar, suggesting that they had internalized a standard rhetoric for describing the task of diagnosis and remediation.

Following is an example of clinician C's stated plan. The mean intracorrelation for this clinician's diagnosis was .18, which was slightly above the group mean. Eighteen percent of the statements made for the case at time one were repeated at time two.

#1: What guides you in determining what information to collect and then what guides you in writing up the diagnosis?

First I want to get a general impression of how he's doing, first in comprehension. If he's doing poorly in comprehension, I try to find some reasons for it. First I look at word recognition and then I look at phonics. Then I try to group the test scores that have to do with word recognition and those that have to do with phonics.
If two phonics scores didn't seem to match I would look at the test to see what exactly the kid had to do, what the task was. If one score was much lower than another, what was it about the first test that was so much harder for the kid?

If the comprehension is low I usually look for an I.Q. score or a listening score to see if they have the potential for higher comprehension. If a kid had a high I.Q. and they were doing poorly in reading, then I would look at hearing, talk to the parents, ask for physical stuff. Right now I don't go into a kid's emotional state, for example, unless it's very obvious, like the kid is aggressive or very shy. I just try to deal with the skills themselves.

In the write-up I listed an observation and then the scores that supported it. First I look at comprehension, then vocabulary, then phonics. But if there was something in the scores that didn't seem pertinent to the diagnosis then I just left it out. Usually I try to identify just weaknesses. An observation that he did well in arithmetic doesn't make a big difference so I just left it out of the write-up.

#2: What do you see as being the relationship between a diagnosis and initial remediation plan?

It should be one to one. If you notice that something is a weakness then you should write steps to overcome this weakness. If there's a strength in reading you should try and use that strength and maybe apply some of them. You wouldn't give a kid exercises to practice something he knows, though. This kid seemed to be better in sight words than in comprehension but he wasn't very good on suffixes so I figured that if he had sight words that contained some suffixes you could say, "look at the suffix."

So the remediation should be based almost completely on the diagnosis. But once you start the remediation if it doesn't work then you have to revise it. Working with the kid you find more than what the test scores tell you.
If clinician C followed her plan, she would collect information in the following order: comprehension, word recognition, phonics, I.Q., and physical information. According to the plan, the diagnostic write-ups would present the findings in the order indicated. Finally, there would be a one-to-one relationship between weaknesses stated and remediations offered.

The actual performance of the clinician was at variance with the stated plan. With respect to cue collection, information was collected on comprehension, word recognition, phonics, I.Q., and physical factors. However, the order was different for each of the three cases. In addition, considerable information not mentioned in the plan was collected, including background information, arithmetic and spelling samples, informal oral reading, individual achievement tests, and writing samples. There was no discernible pattern of collection for this additional information.

Across the eight clinicians, two overall observations were made regarding cue collection.

1. The order in which cues were actually collected across sessions did not follow the order given in the stated plan. Further, whatever order was used, it was never followed a second time, either for the middle (distractor) case or for the replicate case.

2. A substantial amount of information was collected that was never mentioned in the stated plan at all. Six of the eight clinicians requested this additional information to the extent that it equalled or exceeded the information that was collected according to plan.

While it was possible that this information might have been requested merely because it was available, the data warrant the interpretation that the clinicians (1) had no ready framework within which to fit the new data, and (2) had no commonly shared mode of interpreting what they did request.

In sum, the data indicated that the cues led the clinicians, not the other way around.
Without an explicit heuristic to guide data collection, each clinician essentially was at the mercy of whatever information she chose to collect. The resulting information load had to be simplified and made manageable, but again, in the absence of an agreed-upon model of consistent data collection and interpretation, that process became idiosyncratic and therefore unpredictable across sessions.

With respect to writing the diagnosis, the statements in the sample were not ordered in any consistent way, whether comparing the diagnosis for the case and its replicate or comparing either diagnosis with the plan. Following are the first paragraphs from clinician C's diagnostic write-ups for the case and its replicate (sessions one and three). Clinician C did follow the stated procedure of supporting observations with scores.

Session 1. "Poor auditory acuity may be an inhibiting factor in his reading. His test in auditory acuity was below normal. He missed 5 of the 30 pairs on the Wepman. He scored only 2.5 grade level on the 'Hearing Sounds in Words' test on the Durrell."

Session 3. "His verbal skills seem to be below average for his age. His verbal WISC score was 86, and his listening comprehension was 4.5. He does better in areas not requiring reading. His performance I.Q. was 106."

Across the eight clinicians, three overall observations were made regarding diagnostic write-ups.

1. In no instance did the diagnostic statements follow the order of cues collected.

2. Whatever the order in which diagnostic statements were presented, no clinician repeated that order a second time.

3. In those cases where strengths and weaknesses were loosely grouped and supporting information was provided, the pattern was never the same for all three sessions for a clinician, and the order of presentation always varied.

Finally, clinician C's remedial suggestions were not presented in the same order as diagnostic statements.
Further, in addition to one-to-one associations, there were often multiple diagnostic statements associated with a given remediation. A sample of some of clinician C's associations follows.

Remediation
  Session 1: To increase comprehension level use exercises such as the cloze or modified cloze. (T)
  Session 3: Ask questions after each selection to train him to attend to meaning. (T)

Diagnosis
  His comprehension scores are lower than his sight words. (W)
  His comprehension was only 71 percent. (W)
  In most cases his errors resulted in loss of meaning. (W)
  His silent reading scores are very low. (W)
  His rate is low. (W)

Across the eight clinicians, three overall observations were made regarding the associations made between remediation and diagnosis.

1. In no instance did the remedial statements follow the order of presentation of the diagnosis. Typically, remedial statement number one, for example, might be paired with diagnostic statement numbers 12, 13, and 24.

2. There was no evidence of systematic use of weaknesses being treated by using strengths.

3. No clinician had a one-to-one match between diagnosis and remediation. There were always unmatched diagnostic statements.

Comments About the Sessions

The subjects' responses were highly positive. Almost all clinicians characterized the sessions as being both enjoyable and educational. Several hoped that the study might have an impact on the design of training of diagnosticians. Others felt that the sessions had sharpened their understanding of what they were doing.

CHAPTER FIVE
DISCUSSION

The assumption in the literature that diagnosis and remediation are, or ought to be, routinely and reliably associated is not remotely substantiated by any of the findings of this study except at the case level. Even at that aggregate level, the association is only modest.
Across SIMCASES, clinicians examined information about background and cognitive ability and the results of performance on diagnostic tests, graded word lists, reading achievement tests, basic sight vocabulary and visual/auditory tests. Thus they all began with essentially the same categories of information for all the cases (although the order of collection was entirely unpredictable). From this common information base they went on to generate diagnoses and remediations. Only one-quarter to one-third of the statements comprising these two products were mentioned in two or more sessions. The proportion of agreed-upon statements fell precipitously when the standard for examining reliability changed to agreement in three or more sessions.

As the data reported have shown, there was little association between diagnosis and remediation. A telling example could be seen in Case 3. There were 17 diagnostic statements which were agreed upon as characterizing the case, but only six remediations were agreed upon as constituting an appropriate action plan. Two of those were unrelated to any of the agreed-upon diagnoses (Table 6, Table 9).

Additionally, with the exception of Case 1, the group diagnoses were at variance with the "ideal" diagnosis (Appendix 0) compiled by the senior reading clinician: omission of weaknesses in fluent texting and comprehension in Case 2; omission of weaknesses in sight vocabulary and phonetic analysis in Case 3; and omission of weaknesses in multisyllabic analysis, fluent texting, and auditory problems in Case 4. Thus, the clinicians may have agreed as a group that certain problems characterized a case but, as a group, they may have been largely incorrect in their judgments. Conversely, they may have compiled a correct group diagnosis and the judgments of the senior clinician may be in error.
The question of the validity of the diagnoses (as distinguished from their reliability--one can be reliably wrong) was beyond the scope of this investigation. However, it is clearly a major unresolved issue and is discussed further in Chapter Six.

What the data in this study unambiguously demonstrated was that the clinicians studied were not reliable diagnosticians. They believed they were acting consistently but in fact collected a great deal of information in unpredictable fashion, most of which was interpreted idiosyncratically.

Why did the clinicians in this study behave as they did?

Speculation #1: task and external validity contributed to low reliability. The subjects' performance might have been an artifact of the experimental setting, including the use of SIMCASES. That is, the task itself was crafted so that while the clinicians could perform comfortably as required, they weren't doing what they actually do on the job. Perhaps a naturalistic setting using real children would have yielded different results. However, any unreliability demonstrated in a naturalistic setting could be attributed to the differing behaviors of the children themselves. Both settings contribute to problems of task validity.

Nevertheless, a major assumption in our observation studies, as well as in training and research studies in medicine, is that important problem-solving behaviors of clinicians can be elicited through simulated cases. "At least some of the important cognitive skills are elicited by sets of data gathered about real cases with real problems and stored for presentation as simulated cases" (Vinsonhaler, 1979).

The question of external validity, or generalizability to a larger population, is complicated by the small sample size. However, a major trade-off for small sample size is the power of replication.
As discussed earlier, a number of studies of experienced reading specialists conducted under the same conditions as the present study and using the same SIMCASES reported diagnostic results almost identical to those in this study (Weinshank, 1980). This is the first study to investigate remedial reliability and that of remedial/diagnostic associations, but there is no reason, in principle, to think that these results would not be replicated with another sample.

A second issue in external validity is the nature of the sample itself. Was this a representative sample of experienced, credentialled reading specialists, or did chance dictate selection of an unrepresentatively incompetent group? These subjects met or exceeded the criteria set by the study. They were all employed in good standing in five different school districts in two states. They were involved in professional reading organizations and/or had been recommended by their academic mentors as being highly competent students. If anything, one might argue that they were an unrepresentatively able group, thus making the results all the more disquieting.

Speculation #2: training influenced performance. Contemporary teacher training programs are committed to a position that calls for attending to individual differences ("meeting individual needs"). This view is not compatible with the proposition that a small number of diagnostic and remedial categories account for a large proportion of the problems exhibited by deficient readers of a given age and grade range.

Training that embodies the individual differences paradigm has no need of predictive capability. By definition, if everyone is different in significant ways that must be uncovered, student A's performance cannot and should not have any relationship to student B's performance.
If each student's reading problem is approached from the standpoint that the problem is different in significant ways from everyone else's, those differences must somehow be divined from the cues. Which cues are the key ones? There is no evidence in this study that the training process has addressed itself to this question, as evidenced by the almost complete lack of diagnostic and remedial agreement between clinicians and by each clinician with herself on the same case.

If the individual differences hypothesis is correct, then the clinicians in this study are performing precisely as they have been trained. Making between 20 and 30 diagnostic statements and between 25 and 35 remedial statements per session (as almost all did) would surely lead a clinician to believe that individual needs had been taken fully into consideration. When coupled with unsystematic cue collection, the results of any cross-session comparison would be almost guaranteed to show no association, even when the case being examined is the replicate of an earlier case.

Speculation #3: the clinical setting itself does not provide a context for the specialist's learning from experience. Diagnostic judgments are reached and remedial suggestions made, but there is no way to get feedback about the consequences of recommended actions. The clinician simply does what the profession dictates with no opportunity to confirm or disconfirm hypotheses about problems and their treatment.

Speculation #4: it is in the nature of the profession to mystify the process so that it merits the respect it claims for itself. Perhaps the painful reality is that most diagnoses are embarrassingly inflated and clinicians know that such diagnoses ought not to be taken seriously when it comes to remediation. Similarly, clinicians may understand all too well the limited range of action alternatives open to them, but professional mystique demands that they once again engage in clinical overkill.
Speculation #5: the clinicians had a plan but they didn't have a satisfactory method. According to Newell and Simon (1972), "A method is simply a plan that you use twice" (p. 835). The plans they believed they were following to organize their behavior were never actually carried out twice in the same way; there was no way to predict behavior from case to case. The method actually used may be the answer to the question, "How can you make a straightforward instructional problem look very complicated?" Answer: disguise the fact that, in general, a very limited number of diagnostic and remedial judgments reliably characterize most cases of reading difficulty. Produce a verbal jungle of strengths, observations, and weaknesses derived from a needlessly exhaustive examination of case information collected unsystematically. This procedure is guaranteed to take quite a bit of time, look substantive, and impart a feeling of professionalism.

Speculation #6: the clinicians have not mapped onto the task any model of the process of acquisition and mastery of reading behaviors. Lacking such a model, their search for and interpretation of information is unordered and the resulting instructional strategies are unsequenced.

None of these speculations is flattering to the field of remedial reading. But the results of this study are such that unless one were to assert that the subjects were "lemons" and the experimental conditions spurious (charges that have been examined and refuted), the near-chaotic state of diagnosis and remediation cannot go unchallenged.

CHAPTER SIX
SUMMARY AND RECOMMENDATIONS

Introduction

The purpose of this study was to establish the nature and extent of the relationship between diagnosis and remediation in reading.
The study empirically investigated the clinical problem solving behavior of eight reading specialists as they (1) diagnosed simulated cases of reading difficulty, (2) prepared an initial remediation plan, and (3) associated remedial statements with diagnostic statements.

Summary of Procedures

The eight subjects in the study were practicing, experienced, credentialled reading specialists. Four received their training entirely in Michigan and four entirely in Illinois. Each subject participated in three sessions in a controlled setting, each session lasting from two and one-half to three hours. Subjects were randomly assigned to simulated cases (SIMCASES) of reading difficulty. Each simulated case was based on a real child who was served by the Reading Clinic at Michigan State University.

In a total of 24 sessions they (1) wrote a diagnosis for a SIMCASE of reading difficulty; (2) wrote an initial treatment plan; (3) transferred diagnostic statements to a standardized diagnostic checklist and remedial statements to a standardized remedial checklist; (4) made associations between remedial and diagnostic statements; (5) reported on reasons for making the associations; (6) reported on reasons for the presence of unassociated diagnostic elements (if any) and unassociated remedial elements (if any); and (7) answered three free-response questions.

The products that resulted from these procedures were:

1. Diagnostic statements;
2. Remedial statements;
3. Cues;
4. Remediated diagnoses; and
5. Specific remedial/diagnostic associations.

Summary of Results

The products that resulted from each session were analyzed to determine (1) the extent of agreement among and between clinicians (proportional agreement, commonality, intercorrelations); (2) strength of correlations between products; and (3) extent of differences in performance between Illinois-trained and Michigan-trained subjects.
Proportional Agreement

A very small number of statements/associations accounted for agreement across two or more sessions for a given case. The bulk of each product was composed of idiosyncratic statements, i.e., statements made in only one session.

Both redundant diagnostic and remediated diagnostic statements were more likely to be characterized as a weakness than as a strength or observation.

Redundant remedial statements were more likely to be characterized as a treatment than as a strength, weakness, or observation.

Redundant associations of remedial and diagnostic statements were more likely to pair treatment with weakness than with any other category.

Across all cases, a limited number of categories of diagnostic and remedial statements accounted for all statements made two or more times.

There was substantially higher agreement for cues requested across sessions than for the other products. Cues requested two or more times were much more likely to be in the lower inference form of test scores and comments than in the higher inference form of test booklets and audio tapes.

Examination of common cues led neither to common diagnoses, nor to common remediation, nor to common associations between remediation and diagnosis.

Commonality

The results of comparing an individual's product with those of the group (commonality) were consistent across all products. Considering only statements mentioned two or more times, a given clinician's product for a given case included, on the average, roughly half of the statements/associations mentioned during the other sessions. While this may appear to be a relatively high level of agreement between individual and group, it must be remembered that commonality was calculated on the basis of a restricted set of statements, i.e., those mentioned in two or more sessions. At most, only one-third of any product (excluding cues) would be included in the calculation.
Both the proportional agreement and commonality results indicated that only by aggregating products across sessions could the outlines of a meager consensus on each case be demonstrated.

Intercorrelations

With the exception of cues, the results across products followed a consistent pattern. The likelihood of any two clinicians, or one clinician over two sessions, agreeing on a statement/association for a given case ranged from small to non-existent.

Comparison of Subjects

Five t-tests for differences between means were computed to test the hypothesis that there was no difference between the Michigan and Illinois subjects in their performance on the five products. None of the tests reached statistical significance at the .05 level, but the difference in cues approached significance. Michigan subjects requested fewer cues and proportionally more high inference cue forms than did the Illinois subjects (0.42 vs. 0.09). Nevertheless, the differing methods did not translate into differing diagnostic performance. Statistically, the groups were virtually identical in performance on that dimension, indicating that irrespective of number or form of cues collected, there was no agreed-upon heuristic to guide information gathering and interpretation.

Correlational Data: Individual and Group

At the individual clinician level, performance on one product could not be used to predict performance on another when performance was averaged across two sessions for the case and its replicate. Cues were predictors neither for the diagnosis nor for the remediation. Neither was the diagnosis a predictor for the remediation.

At the group level, only diagnosis and remediation showed a modest level of association. However, cues did not predict either diagnosis or remediation even at this aggregated level.

Clinician Self-Reports

Each clinician reported on plans of action that she used to guide her information gathering and the writing of her diagnosis and initial remediation.
An examination of teacher performance on three products (cues, written diagnoses, written remediations) revealed that the subjects never followed their stated plans.

a. Cues never followed a predictable order of collection.

b. Diagnostic statements never followed the order of cue collection. Whatever the order in which diagnostic statements were presented, no clinician repeated that order a second time.

c. Remedial statements never followed the order of presentation of diagnostic statements.

In sum, no clinician followed a model or heuristic powerful enough to allow for predictable collection of data and its consistent interpretation.

Conclusions

Two major hypotheses which guided the design of this study were weakly supported by the data.

1. Statements of weakness were repeated more reliably across sessions for a given case than were all diagnostic categories.

2. Statements of weakness were more often associated with treatments across sessions.

Two additional hypotheses dealing with the nature of unassociated diagnostic and remedial statements were strongly supported by the data.

1. Those statements in the diagnosis that were not associated with any treatments remained unassociated because:
   a. they could not be remediated,
   b. they did not require remediation,
   c. the clinician meant to provide a remediation but forgot to write it down, and
   d. the clinician knew no remediations for those diagnoses.

2. Those statements in the remediation that were not associated with any diagnosis remained unassociated because the clinician believed that the particular remediations should always be included.

A final hypothesis, that remediated diagnoses would be repeated more reliably across sessions than all diagnostic categories, was not supported. There is almost no agreement on this dimension.
Assuming that the subjects in this study were representative of a larger group of similarly qualified and experienced reading specialists, what can be concluded from the data with regard to the efficacy of diagnostic procedures?

First, as currently practiced, the compilation and presentation of diagnostic findings is an empty, random exercise. There are as many different conclusions for a case as there are points of entry into the data. All statements beyond those few which fall within the seven general diagnostic categories are entirely idiosyncratic, hence unpredictable.

Second, all remedial suggestions beyond those few reflected by the seven general remedial categories are idiosyncratic and may or may not (as chance dictates) have any impact on the child's performance.

Given the very low level of diagnostic predictability demonstrated in this study, it can be argued that diagnostically derived remediation is neither time efficient nor economically justifiable. Any combination of remediations chosen from the seven categories might do equally well in addressing the actual reading problem(s) of a student.

Recommendations

A number of options present themselves as policy alternatives.

(1) The case for the ultimate desirability of the current diagnosis-treatment model could continue to be pressed. The data in this study unambiguously refuted the effectiveness of this approach. A plan of instruction based on an ad hoc diagnosis cannot be defended. It speaks neither to the child's legitimate educational needs nor to the issues of the time and cost of preparing the diagnosis to begin with.

(2) Diagnosis as a necessary prerequisite to remediation could be eliminated in favor of simply choosing those remediations that work most of the time for most children at a given level of reading proficiency.
Unfortunately, we have no long-term empirical data describing the problems that characterize most problem readers at a given age and grade level, nor the remediations of choice for those problems. We are, therefore, still left with less than optimal control over which remediation or combination of remediations is the best starting point for a student.

(3) A reading-performance diagnostic model could be developed that concerns itself first and foremost with evaluating reading performance against a commonly accepted metric which would include, in a standard sequence, the tasks embedded in the diagnostic categories. Instead of collecting information for clues that might shed some light on the particular difficulties that characterize an individual student, the procedure would be reversed. The reading-performance diagnostic model would dictate what information should be collected and in what sequence. By assuming that most children's deficiencies can be grouped within a small number of categories, cue collection becomes routinized and initial remediation follows directly from the information collected. Those anomalous children who do not respond (within specified time boundaries) to standard remediation(s) would be re-evaluated.

Which of the three models would lead to significantly better outcomes in improved reading proficiency? This is an empirical question which is beyond the scope of this investigation. Nevertheless, it can be suggested that the use of controlled clinical trials, so common in other disciplines, could be established. University clinics could be used as sites for such long-term investigation, with incoming students being randomly assigned to remediation:

a. that derives from the traditional two or three hour diagnostic work-up, or
b. that is not preceded by any individualized diagnosis but that progressively clarifies the problem, thereby producing the diagnosis, or
c. that derives from a reading-performance model of diagnosis.
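The random assignment proposed above could be carried out in several ways, and the text does not specify one. The following is a minimal sketch only, assuming a simple balanced scheme (shuffle the incoming students, then deal them into the three conditions in turn). The condition labels, the `assign_conditions` function, and the student identifiers are hypothetical names introduced for illustration.

```python
import random

# Hypothetical labels for the three remediation conditions (a, b, c) listed above.
CONDITIONS = (
    "traditional_diagnostic_workup",
    "no_prior_individualized_diagnosis",
    "reading_performance_model",
)


def assign_conditions(student_ids, seed=0):
    """Randomly assign students to the three conditions in near-equal numbers.

    Assumption: balanced assignment is wanted. Students are shuffled with a
    seeded generator (so the allocation can be reproduced and audited) and
    then dealt into the conditions in rotation.
    """
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    return {
        student: CONDITIONS[position % len(CONDITIONS)]
        for position, student in enumerate(shuffled)
    }


if __name__ == "__main__":
    allocation = assign_conditions(range(1, 13), seed=1980)
    for student in sorted(allocation):
        print(student, allocation[student])
```

Keeping the allocation table separate from the outcome records is one way to let outcome data be gathered by people who do not know a child's condition.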
Outcome data would be collected by people blind to the experimental condition of a given child.

In "The Structure of Scientific Revolutions" (in Neurath, ed.), Thomas Kuhn says:

    ...the emergence of new theories is generally preceded by a period of pronounced professional insecurity...That insecurity is generated by the persistent failure of the puzzles of normal science to come out as they should. Failure of existing rules is the prelude to a search for new ones.

The paradigm in the field of reading diagnosis and remediation has been that diagnosis and remediation ought to be integrally related. The major conclusion of this study is that they are not. Kuhn says:

    Just because the emergence of a new theory breaks with one tradition of scientific practice and introduces a new one conducted under different rules and within a different universe of discourse, it is likely to occur when the first tradition is felt to have gone badly astray.

The results of this study directly challenge the reigning paradigm in the field of reading diagnosis/remediation. If these results are substantiated, they would invalidate a large part of the training of reading diagnosticians together with the textbooks which they study.

"Paradigm-testing," Kuhn argues, "occurs only after persistent failure to solve a noteworthy puzzle has given rise to crisis." In this thesis, the puzzle is that the reading diagnosis/remediation paradigm does not give rise to any generalizable predictive metric exemplified in the work of the "community of practitioners" (Kuhn's description), among whom are the reading clinicians in this study.
APPENDICES

APPENDIX A
CUE INVENTORY - CASE 4

[Cue inventory table for Case 4: not recoverable from the scanned source.]

[The text between the cue inventory and the following passages is not recoverable from the scanned source.]

(Hopefully he will be able to discriminate among similar words.) Tim is relying too heavily on the beginnings of words.

Phi maximum (> 0) = maximum number of matches within cells a and d, holding marginals constant.

Phi maximum (< 0) = maximum number of matches within cells b and c, holding marginals constant.

Since the Phi coefficient is bounded by the marginal frequencies of the agreement table, correction is usually recommended.
Phi corrected = (observed value of Phi) / (absolute value of Phi maximum)

Inter-Clinician Correlation

An example of a completed table is as follows:

Domain Statements          Statements of      Statements of
(Diagnosis, Remediation    Clinician A,       Clinician B,
or Cue)                    SIMCASE Q          SIMCASE Q

S1                         S1                 S1
S2                         S2                 S2
S3                         S7                 S3
S4
S5
S6
S7

                           Clinician A, SIMCASE Q
                             +          -        Total
Clinician B,       +       2 (a)      1 (b)        3
SIMCASE Q          -       1 (c)      3 (d)        4

Intra-Clinician Correlation

An example of a completed table is as follows:

Domain Statements          Statements of      Statements of
(Diagnosis, Remediation    Clinician A,       Clinician B,
or Cue)                    SIMCASE Q          SIMCASE Q

S1                         S1                 S1
S2                         S2                 S2
S3                         S7                 S3
S4
S5
S6
S7

                           Clinician i, SIMCASE Q, Form One
                             +          -        Total
[row label         +       2 (a)      1 (b)        3
illegible]         -       1 (c)      3 (d)        4
                   Total     3          4          7

APPENDIX O
MASTER DIAGNOSIS AND REMEDIATION

Case 3

His sight vocabulary and word recognition behaviors appear reasonably intact for his present grade placement. The slightly depressed scores (6.8+) are probably caused by:

1) Insufficient independent reading. He is not a reader in the sense that he does very much of it. This alone might account for the slightly reduced sight vocabulary.

2) Visual screening suggests that he is to some degree farsighted. This acuity problem might in turn account for the lack of practice described in 1 above.

REMEDIATION: He should be referred to an optometrist/ophthalmologist for complete visual testing. Also, parents and school should create a support system (library card, reading time and place, feedback mechanism such as parent discussion, etc.) that would encourage him to develop a reading habit. Both quantity and type of reading should be carefully followed for purpose of feedback until such time as he is able to sustain the habit on his own.

He has a significant problem with the higher level decoding skills and their application during reading. This problem is caused by:

1) Some real confusion with multi-syllable - affixed words.
He is not secure with many vowel phonograms used (as syllables) and does not give reasonable approximations to these sound values.

REMEDIATION: These responses can be taught through daily analysis of the multi-syllable words using worksheets, tape recorders, or teacher cues in the sound values. Extended reading behavior (see #1 sight voc.) should also aid in learning to sound these confused letter patterns.

2) An even greater problem is his inability to modify or adjust a sounded word into its spoken representation. Even when he correctly segments and sounds a multi-syllable word, he will often misplace a primary accent or use a long vowel sound value when the actual word demands a short or r-controlled one. His final recognition is seldom adjusted in these areas, leaving the word unrecognized or a nonsense form. He does not seem to systematically demand that sounded words be read in terms of his own meaning - speaking - listening vocabulary.

He has a serious problem with fluent texting of contextual material. This is caused by:

1) Inappropriate responses to punctuation.

REMEDIATION: Model the use and purpose of punctuation in reading. Use simultaneous sight reading of appropriate material (5-6 grade equivalent) in which he attempts to match the rhythm, cadence, and intonation of the model. Five minutes a day of such practice should be sufficient.

2) A suspected deficit in language development. He does not speak well in that his flow of language is not smooth. This appears during initial language learning and probably has some impact on his reading fluency.

REMEDIATION: Encourage verbal discussion using play reading in which he can rehearse smooth and fluent language behavior.

He has a reading comprehension problem specifically related to the meaning demands of sophisticated content related material. This is caused by:

1) His apparent inability to schematize or structure information for potential inferences and memory.
He appears to process details (information) but does not easily frame these for inferred relationships or as a memory cue.

REMEDIATION: Preface reading assignments with self generated questions and/or insights that can serve as a guide and referent for the information and thinking that should take place during reading. This process might also be modeled, with the teacher demonstrating through "talking out" while reading the referent frames that are generated. Careful questioning with referral to material for answer cues can also create the appropriate thought processes. This again is a demonstrated technique.

APPENDIX P
PROPORTIONAL AGREEMENT FIGURES

(1) Diagnostic Statements
(2) Remediation Statements
(3) Cue Forms
(4) Remediated Diagnoses
(5) Remedial/Diagnostic Associations

The horizontal axis in all five figures represents, as proportions, the number of sessions at which types of statements were made. That is:

0.17 = made during one session
0.33 = made during two sessions
0.50 = made during three sessions
0.66 = made during four sessions
0.83 = made during five sessions
1.00 = made during all six sessions

The vertical axis represents the proportion of each of the various types of statements made during each session. Looking at the figure for diagnostic statements, for example, statements coded as observations accounted for 0.25 (25 percent) of the diagnostic statements made in only one session (0.17 on the horizontal axis). However, statements coded as observations accounted for only 0.02 (two percent) of the diagnostic statements which were repeated across four sessions (0.66 on the horizontal axis).
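The way the two axes relate can be illustrated with a short sketch. It assumes one reading of the description above: each statement is counted by the number of sessions (out of six) in which it appeared, and within each such level the share of each statement type is taken. The function name `type_shares_by_agreement`, the statement identifiers, and the sample data are hypothetical and do not come from the study.

```python
from collections import Counter, defaultdict


def type_shares_by_agreement(sessions):
    """Return {proportion of sessions: {statement type: share}}.

    `sessions` is a list with one dict per session, mapping a statement
    identifier to its coded type (for example "weakness"). A statement's
    agreement level is the fraction of sessions in which it was made.
    """
    n_sessions = len(sessions)
    times_made = Counter()
    coded_type = {}
    for session in sessions:
        for statement, kind in session.items():
            times_made[statement] += 1
            coded_type[statement] = kind

    # Tally statement types separately for each agreement level.
    tallies = defaultdict(Counter)
    for statement, count in times_made.items():
        tallies[count][coded_type[statement]] += 1

    shares = {}
    for count, tally in tallies.items():
        total = sum(tally.values())
        level = round(count / n_sessions, 2)
        shares[level] = {kind: n / total for kind, n in tally.items()}
    return shares


if __name__ == "__main__":
    # Hypothetical data: one statement repeated in all six sessions and
    # four statements that appear in the first session only.
    first = {"s1": "weakness", "s2": "observation",
             "s3": "weakness", "s4": "strength", "s5": "weakness"}
    sessions = [first] + [{"s1": "weakness"} for _ in range(5)]
    for level, by_type in sorted(type_shares_by_agreement(sessions).items()):
        print(level, by_type)
```

Note that rounding four sessions out of six gives 0.67, whereas the figures label that point 0.66.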
[Figure: Proportional Agreement: Diagnostic Statements. Horizontal axis: proportional agreement (0.00 to 1.00). Vertical axis: proportion of types of diagnostic statements. Legend: strength = square, weakness = X, observation = triangle.]

[Figure: Proportional Agreement: Remediation Statements. Horizontal axis: proportional agreement (0.00 to 1.00). Vertical axis: proportion of types of remediation statements. Legend: strength = square, weakness = X, observation = triangle, treatment = Z.]

[Figure: Proportional Agreement: Cue Forms. Horizontal axis: proportional agreement (0.00 to 1.00). Vertical axis: proportion of types of cue forms. Legend: scores/comments = square, booklets/tapes = X, background = triangle.]

[Figure: Proportional Agreement: Remediated Diagnoses. Horizontal axis: proportional agreement (0.00 to 1.00). Vertical axis: proportion of types of remediated diagnostic statements. Legend: strength = square, weakness = X, observation = triangle.]

[Figure: Proportional Agreement: RX/DX Associations. Horizontal axis: proportional agreement (0.00 to 1.00). Vertical axis: proportion of types of RX/DX associations. Legend: treatment/weakness = square, weakness/weakness = X, treatment/observation = triangle, observation/weakness = Z, observation/observation = octagon.]

APPENDIX Q
INTERCORRELATION RANGES

[Table of intercorrelation ranges (minimum and maximum values by case): not recoverable from the scanned source.]
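Assuming the intercorrelations reported here are Phi coefficients computed from 2 x 2 agreement tables, the calculation can be sketched as follows. The sketch uses the standard Phi formula and reads "Phi maximum" as the Phi of the most extreme table that has the same marginal totals; that reading, and the function names `phi`, `phi_extreme`, and `phi_corrected`, are assumptions made for illustration rather than the procedure actually used in the study.

```python
import math


def phi(a, b, c, d):
    """Phi coefficient of a 2 x 2 agreement table with cells a, b / c, d."""
    row_pos, row_neg = a + b, c + d
    col_pos, col_neg = a + c, b + d
    denominator = math.sqrt(row_pos * row_neg * col_pos * col_neg)
    if denominator == 0:
        raise ValueError("Phi is undefined when a marginal total is zero")
    return (a * d - b * c) / denominator


def phi_extreme(a, b, c, d, positive=True):
    """Largest positive (or most negative) Phi attainable with the same marginals."""
    n = a + b + c + d
    row_pos, col_pos = a + b, a + c
    # Push as many cases as the marginals allow into cells a and d
    # (positive bound) or into cells b and c (negative bound).
    a_ext = min(row_pos, col_pos) if positive else max(0, row_pos + col_pos - n)
    b_ext = row_pos - a_ext
    c_ext = col_pos - a_ext
    d_ext = n - a_ext - b_ext - c_ext
    return phi(a_ext, b_ext, c_ext, d_ext)


def phi_corrected(a, b, c, d):
    """Observed Phi divided by the absolute value of the matching Phi maximum."""
    observed = phi(a, b, c, d)
    if observed == 0:
        return 0.0
    bound = phi_extreme(a, b, c, d, positive=observed > 0)
    return observed / abs(bound)


if __name__ == "__main__":
    # Cells from the inter-clinician example table: a = 2, b = 1, c = 1, d = 3.
    print(round(phi(2, 1, 1, 3), 3))
    print(round(phi_corrected(2, 1, 1, 3), 3))
```

For the inter-clinician example table (a = 2, b = 1, c = 1, d = 3), the observed Phi is 5/12, or about 0.42. Because the row and column totals match (3 and 4), the maximum attainable Phi is 1 and the correction leaves the value unchanged. The correction matters when the marginals differ between the two sets of statements.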