ABSTRACT

OBJECTIVE MEASUREMENT OF CERTAIN FACIAL MOVEMENTS DURING PRODUCTION OF HOMOPHENOUS WORDS

by Lowell J. Sahlstrom

This study was concerned with the objective measurement of movements present on the surface area of the face as speakers of both sexes articulate homophenous words.

It was the purpose of this investigation to examine the effect of speaker sex upon the amount and pattern of movement occurring on certain areas of the face while speaking selected groups of so-called homophenous words. Secondly, it attempted to study differences in certain objectively measured facial movements among selected phonemes in the English language that are said to be homophenous. This investigation also examined the effect of the position of the phoneme within a word upon certain facial movements that accompany the production of the word.

Five female and five male speakers spoke a list of stimulus words constructed around the homophenous consonant clusters /p, b, m/, /t, d, n/, and /tʃ, dʒ, ʃ/. Each cluster contained three homophenous words for each consonant in the initial position of the word and three in the final position, for a total of 54 words.

A mercury-rubber strain gauge attached to the subject's face registered facial movement in that area of the face as a change in the electrical resistance of the gauge. This change in electrical resistance was transduced into a graphic tracing on oscillograph recording paper. Each resultant tracing of a word was analyzed in terms of six individual measures that attempted to characterize that curve.

Male speakers were found to present greater intensity of facial movement over the total duration of a word than female speakers across all consonant sounds and word positions.
The /p/ and /b/ consonants were found to be consistently associated with greater intensity of facial movement, more changes in pattern of facial movement, and greater elapsed time to changes in facial movement than was the /m/, especially when those consonants appeared in the final position of a word. The /p/ differed from the /b/ in number of changes in pattern of facial movement. The /tʃ/ and /dʒ/ were found to have greater elapsed time to changes in movement pattern than did the /ʃ/ in the final position of a word. No differences were found among the /t, d, n/ cluster.

It was concluded that differences in facial movement exist between male and female speakers that require the attention of those persons concerned with the teaching of lipreading and the testing of lipreading performance. It was also proposed that differences in facial movement found among certain so-called homophenous consonants indicate the need for a new definition of homophenous sounds, and that these differences could be expected to provide visual cues to the lipreader to aid in distinguishing among words using these sounds. It was suggested that a common element in the differences in facial movements found among these consonant sounds may be that the plosive element of certain sounds results in increased intensity of facial movement and/or changed pattern of facial movement.

Further research was suggested to identify more specifically what these differences are so that they may be taught to the student of lipreading, along with other recommendations for continued investigation of facial movements.
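As an illustrative aside, the size of the stimulus list can be checked by enumerating the design: three clusters of three consonants, with three words per consonant in each of two word positions, come to 54 words. The sketch below is a minimal check under that reading of the abstract. The cluster labels, the name `design_cells`, and the constant `WORDS_PER_CELL` are hypothetical names introduced here for illustration; the actual words of the list are not reproduced, only numbered slots.

```python
from itertools import product

# Consonant clusters named in the abstract. The dictionary keys are
# illustrative labels only and do not come from the thesis.
CLUSTERS = {
    "cluster_1": ["p", "b", "m"],
    "cluster_2": ["t", "d", "n"],
    "cluster_3": ["tʃ", "dʒ", "ʃ"],
}
POSITIONS = ["initial", "final"]

# Assumption: three words per consonant per word position.
WORDS_PER_CELL = 3


def design_cells():
    """Return one tuple per stimulus slot: (cluster, consonant, position, slot)."""
    cells = []
    for cluster, consonants in CLUSTERS.items():
        for consonant, position in product(consonants, POSITIONS):
            for slot in range(1, WORDS_PER_CELL + 1):
                cells.append((cluster, consonant, position, slot))
    return cells


cells = design_cells()
print(len(cells))  # 3 clusters x 3 consonants x 2 positions x 3 words = 54
```

The same enumeration gives 27 slots for each word position and 6 slots for each consonant, which is the layout implied by the consonant-by-sex-by-position analyses of variance listed later in the front matter.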
OBJECTIVE MEASUREMENT OF CERTAIN FACIAL MOVEMENTS DURING PRODUCTION OF HOMOPHENOUS WORDS

By

Lowell John Sahlstrom

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Speech

1967

ACKNOWLEDGMENTS

This research was supported in part by a predoctoral study traineeship from the office of Neurological and Sensory Diseases Service Program of the Public Health Service.

Recognition is given to Herbert J. Oyer, Professor of Speech and Chairman of the Department of Speech, for his interest and the guidance he gave as academic and thesis advisor.

Special gratitude is expressed to Thomas Adams, Associate Professor of Physiology, for his cooperation and assistance.

Sincere appreciation is extended to my wife, Lavonne, for her continuing encouragement and patience.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES

Chapter

I. INTRODUCTION
   Statement of the Problem and Purpose of the Study
   Importance of the Study
   Definition of Terms
   Organization of the Report

II. REVIEW OF THE LITERATURE
   The Code
   The Speaker
   Facial Movements
   Summary

III. SUBJECTS, EQUIPMENT, AND PROCEDURES
   Subjects
   Equipment
   Stimulus Material
   Pilot Study
   Experimental Procedures
   Measurements

IV. RESULTS AND DISCUSSION
   Discussion
   Summary

V. SUMMARY, CONCLUSIONS, AND RECOMMENDATIONS
   Conclusions
   Implications for Further Research

BIBLIOGRAPHY
APPENDIX A

LIST OF TABLES

Table

1. Homophenous Word List
2. Summary of analysis of variance comparing differences in temporal summation to inflection points among the consonants /p, b, m/ as a function of speaker sex and of word position
3. Summary of analysis of variance performed to determine whether summation of amplitudes at inflection points differed among the consonants /p, b, m/ as a function of speaker sex and of word position
4. Summary of analysis of variance performed to test differences in the integrated amplitude-duration measure among the consonants /p, b, m/ as a function of speaker sex and of word position
5. Summary of analysis of variance comparing differences in total duration among the consonants /p, b, m/ as a function of speaker sex and of word position
6. Summary of analysis of variance performed to determine whether area under the curve differed among the consonants /p, b, m/ as a function of speaker sex and of word position
7. Summary of analysis of variance performed to test differences in the number of inflection points among the consonants /p, b, m/ as a function of speaker sex and of word position
8. Summary of analysis of variance comparing differences in temporal summation to inflection points among the consonants /t, d, n/ as a function of speaker sex and of word position
9. Summary of analysis of variance performed to determine whether summation of amplitudes at inflection points differed among the consonants /t, d, n/ as a function of speaker sex and of word position
10. Summary of analysis of variance performed to test differences in the integrated amplitude-duration measure among the consonants /t, d, n/ as a function of speaker sex and of word position
11. Summary of analysis of variance performed to test differences in total duration among the consonants /t, d, n/ as a function of speaker sex and of word position
12. Summary of analysis of variance comparing differences in area under the curve among the consonants /t, d, n/ as a function of speaker sex and of word position
13. Summary of analysis of variance performed to determine whether number of inflection points differed among the consonants /t, d, n/ as a function of speaker sex and of word position
14. Summary of analysis of variance comparing differences in temporal summation to inflection points among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position
15. Summary of analysis of variance performed to test differences in summation of amplitudes at inflection points among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position
16. Summary of analysis of variance comparing differences in the integrated amplitude-duration measure among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position
17. Summary of analysis of variance comparing difference in total duration among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position
18. Summary of analysis of variance performed to test differences in area under the curve among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position
19. Summary of analysis of variance comparing difference in number of inflection points among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position

LIST OF FIGURES

Figure

1. Diagram of strain gauge placement

CHAPTER I

INTRODUCTION

In the process of normal communication, the articulatory organs of the speaker are moved into various positions to produce consonants and vowels. These consonants and vowels, when organized into established word patterns, reach the ear of the listener and are interpreted by him. These words, made up of established patterns of consonants and vowels, evoke meaning in the listener.
They are the means of conveying the speaker's message to the listener by way of auditory signals received by the listener's ear, which in this case is the primary sensory receptive organ for communication between persons. For the person with normal or nearly normal auditory sensitivity and acuity, such a system of oral communication serves quite adequately and very efficiently.

For the hearing-handicapped person, however, this system is not adequate. That is, some part of the auditory reception organ does not function properly, and the signs of the speaker are not received and/or interpreted. Normal communication cannot take place as a result. Because the auditory reception system of the listener may not receive or be able to utilize the full sound power level of the speaker's voice, a reduced intensity level is perceived and parts of the words, the stimuli which carry the message of the speaker, are lost. In addition, the auditory system of the listener may in some way distort the acoustic signal received from the speaker so that clear perception of the vowels and consonants is not accomplished. In this event, the distortion of the stimulus causes a part of the message to be misinterpreted, misunderstood, or not understood at all. Consequently, the thoughts of the speaker are not transmitted accurately to the brain of the listener because of some kind or degree of malfunction of the auditory reception organs of the listener.

A third possibility for interference with the normal system of communication consists of adverse conditions in the environment, the intervening medium between speaker and listener through which the message must travel. These conditions may produce a reduction of the acoustic stimulus intensity or a distortion of the stimulus. An example of such a situation would be one in which oral communication is attempted under conditions of excessive ambient noise, as in a factory, a loud cocktail party, etc.
Here the acoustic code may be partially masked by noise. It is possible that some elements of the words, such as the consonants of lesser acoustic power, are masked by the noise while the remaining phonemes are not masked. The result is a distortion of the code that causes the listener to perceive inaccurately the message transmitted by the speaker.

In each of the above conditions (the deaf, the hard-of-hearing, the adverse noise), the normal system of oral communication becomes less than adequate and, in some cases, totally inadequate. An additional or alternate system of communication is required for transmission of symbols from the speaker to the listener, or receiver. Lipreading, speech reading, visual hearing, and visual communication are all terms that have been used to refer to such an alternate system of communication. In this system of communication, the visual system of the receiver becomes the primary organ for the reception of the code or stimuli, with audition now functioning in a secondary role. In some cases, such as in conditions of noise or with the person with a mild hearing loss, audition remains the primary receptor and vision is a secondary receptor assisting in the accurate interpretation of symbols transmitted by the speaker.

The term lipreading is commonly used for this system of communication but is somewhat misleading and too restrictive, because information is obtained through visual means other than from the lips of the speaker. Information is also obtained from movements of the jaw, gestures, facial expressions, and so on. Now, the articulatory organs of the speaker modify the breath stream and vocal tone of the speaker into vowels and consonants that carry information; and the shape and movements of the articulators also carry information via the visual system. Thus, both auditory and visual stimuli are being transmitted by the speaker, and both kinds of stimuli have the potential of conveying symbols to the receiver.
Oyer suggests a revised definition of lipreading: "The correct identification of arbitrary symbols in a system common to a speech community, transmitted via the visual components of oral discourse."1

Morkovin2 has stated that the purpose of speech reading is to restore the ability of an aurally handicapped person to understand speech. The weakened or distorted auditory stimuli must be reinforced or compensated for by learning to attach meaning to visual stimuli in the light of the whole speech situation. O'Neill and Oyer note the similarity between auditory and visual performance. They suggest that:

     A subject views a stimulus and attributes organization or meaning to what he sees. In other words, he views lip movements with the intent to understand the thoughts of the speaker, and attention is directed toward comprehension rather than mere recognition.3

1 Interview with Herbert J. Oyer, Chairman, Department of Speech, Michigan State University, Jan. 9, 1967.
2 Boris Morkovin, "Rehabilitation of the Aurally Handicapped Through Study of Speech Reading in Life Situations," J. Speech and Hearing Disorders, XII (December, 1947), p. 363.
3 John J. O'Neill and Herbert J. Oyer, Visual Communication for the Hard of Hearing (Englewood Cliffs, N. J.: Prentice-Hall, Inc., 1961), p. 6.

There has been an increasing interest in the use of visual speech reception as an aid to, or as a substitute for, normal auditory speech reception in aurally handicapped persons. Lipreading, or visual speech reception, has been brought to the attention of the public and has found a place in the education and rehabilitation of the aurally handicapped. Many people have profited by training in lipreading; and most people, even with normal hearing, utilize lipreading to some extent in everyday oral communication. However, much remains to be discovered about the lipreading process itself.
By far the majority of the published research in lipreading has pertained to the variable of the lipreader, or receiver. Very little experimental work has been attempted regarding the speaker variable. The variable of the code, or stimulus material, has been of interest for many years but only recently has begun to be the subject of controlled research. O'Neill and Oyer have stated: "This area seems to offer the greatest possibility for future controlled research."4

4 O'Neill and Oyer, op. cit., p. 47.

Statement of the Problem and Purpose of the Study

This study was concerned with the objective measurement of facial movements of speakers when articulating homophenous words. It was anticipated that this project would demonstrate that objective measurement of facial movements during the process of speaking is possible and would thus be a measure of the visual cues transmitted by the speaker.

It was the purpose of this study to investigate the effect of speaker sex upon the amount of movement occurring on certain areas of the face while speaking selected groups of so-called homophenous words. Secondly, this study attempted to examine possible differences in certain objectively measured facial movements among selected phonemes in the English language that are said to be homophenous. The effect of the position of the phoneme within a word upon possible differences in certain facial movements was also examined. Finally, it was asked whether such differences in certain facial movements would be determined by six individual measures of those movements.

In order to examine these variables, the following null hypotheses have been formulated:

1. There are no significant differences in certain facial movements among the three homophenous consonants /p/, /b/ and /m/ as a function of speaker sex and of word position as determined by six individual measures.

2. There are no significant differences in certain facial movements among the three homophenous consonants /t/, /d/ and /n/ as a function of speaker sex and of word position as determined by six individual measures.

3. There are no significant differences in certain facial movements among the three homophenous consonants /tʃ/, /dʒ/ and /ʃ/ as a function of speaker sex and of word position as determined by six individual measures.

Importance of the Study

The experimental study of the lipreading process can be divided into four broad categories of variables including (1) the receiver, or lipreader; (2) the environment; (3) the code, or stimulus material; and (4) the speaker, or sender.

The receiver category would include such factors as visual perception and acuity, personality, intelligence, behavior characteristics, etc. Among the studies pertinent to this aspect of the lipreading process (the lipreader) are those reported by Heider and Heider,5 Brannon and Kodman,6 O'Neill and Davidson,7 and Simmons.8

Little research has been done on the second variable, the environment. Studies published by Neely9 and by Mulligan10 are relevant to this topic. Some investigations in the use of television for lipreading instruction, reported by Oyer,11 Smith,12 and Larr,13 are also representative of the experimental work in this area.

5 F. K. Heider and G. M. Heider, "An Experimental Investigation of Lipreading," Psychological Monographs, LXI (February, 1940), pp. 124-153.
6 John B. Brannon and Frank Kodman, "The Perceptual Process in Speech Reading," A.M.A. Archives of Otolaryngology, LXX (January, 1959), pp. 114-119.
7 John J. O'Neill and Jo Ann Davidson, "Relationship Between Lipreading Ability and Five Psychological Factors," J. Speech and Hearing Disorders, XXI (December, 1956), pp. 478-481.
8 Audrey Simmons, "Factors Related to Lipreading," J. Speech and Hearing Research, II (December, 1959), pp.
9 Keith Neely, "Effect of Visual Factors on the Intelligibility of Speech," J. Acoust. Soc. Am., XXVIII (June, 1956), pp. 1275-1277.
10 M. Mulligan, "Variables in the Reception of Visual Speech from Motion Pictures" (unpublished Master's thesis, Dept. of Speech, Ohio State University, 1954), cited in O'Neill and Oyer, op. cit., p. 43.

The third experimental variable stated is that of the code (the stimulus material) in the lipreading process. Included in this category are such factors as the difficulty and familiarity of the code units; the rate at which the code units are presented; the visibility and similarity or lack of similarity of the units; and the relative contribution of the various subunits (consonants, vowels) to the lipreadability of the unit. Representative of the research in this area are studies reported by Taafe and Wong,14 O'Neill,15 and Woodward.16

11 Herbert J. Oyer, "Teaching Lipreading by Television," Volta Review, LXIII (1961), pp. 131-132.
12 Robert Smith, "Let's Lipread: Television Production Criteria," Am. Annals of Deaf, CX (November, 1965), pp. 571-578.
13 Alfred Larr, "Speechreading Through Closed Circuit Television," Volta Review, LXI (January, 1959), pp. 19-21.
14 Gordon Taafe and Wilson Wong, "Studies of Variables in Lipreading Stimulus Material," John Tracy Clinic Research Papers, III (December, 1957).
15 John J. O'Neill, "Contributions of the Visual Components of Oral Symbols to Speech Comprehension," J. Speech and Hearing Disorders, XIX (December, 1954), pp. 429-439.
16 Mary Woodward, "Linguistic Methodology in Lipreading Research," John Tracy Clinic Research Papers, IV (December, 1957).

The final category in the lipreading process is that of the speaker (the sender) variable. Of interest to research in this category are such factors as the amount of gesture activity, facial expressiveness, rate of speaking, amount and place of facial movement during speaking, speaker variability, and differences between speakers.
Experimental work on this variable has been performed and reported by Stone,17 Fusfield,18 Byers and Lieberman,19 O'Neill,20 and others.

One of the major obstacles to the study of the lipreading process has been the lack of an adequate means of quantifying the amount of information available on the face of the speaker or, in other words, the amount of information transmitted by the speaker. Oyer,21 in a presentation before a seminar in aural rehabilitation at Michigan State University, raised the question of why so little research has been directed to the area of aural rehabilitation. He suggests that there are five factors operating: (1) lack of interest, (2) lack of constructs, (3) inadequate research preparation of many engaged in the field, (4) lack of adequate test instruments, and (5) difficulty in isolating and controlling variables.

17 Louis Stone, "Facial Cues of Context in Lipreading," John Tracy Clinic Research Papers, V (December, 1957).
18 Irving Fusfield, "Factors in Lipreading as Determined by the Lipreader," Am. Annals of Deaf, CIII (March, 1958), pp. 229-242.
19 V. W. Byers and L. Lieberman, "Lipreading Performance and the Rate of the Speaker," J. Speech and Hearing Research, II (September, 1959), pp. 271-276.
20 O'Neill, "Contributions . . .," loc. cit.
21 Herbert J. Oyer, "Research Needs in Aural Rehabilitation," Aural Rehabilitation of the Acoustically Handicapped, Department of Speech, Michigan State University, SHSLR-266 (East Lansing, Michigan, 1966), pp. 133-141.

Mason22 in the early 1940's suggested that one possible explanation for the lack of objective measurement may be found in the existence of individual differences in visible speech manifestations exhibited by various speakers. She stresses the need for objective and adequate measurement. Lowell lends support to the fourth point mentioned by Oyer when he states:

     What we need more than anything else at this time is the development of measuring instruments.
     Until we get the yardsticks comparable to those in the physical sciences we are not going to make the progress that they have.23

The latter two factors stated by Oyer are of special significance to the present study. The lack of adequate instrumentation has presented unusual difficulty in studying the speaker variable in the lipreading process. This lack of instrumentation has resulted in Oyer's final factor, the inability to isolate and control the variables to be studied.

22 Marie K. Mason, "A Cinematographic Technique for Testing Visual Speech Comprehension," J. Speech and Hearing Disorders, VIII (September, 1943), pp. 271-278.
23 Edgar Lowell, "Research: Needs and Goals," Auditory Rehabilitation in Adults, Cleveland Hearing and Speech Center, Western Reserve University (Cleveland, Ohio, 1964), pp. 173-179.

In normal speech, facial movements take place much too rapidly to be observed and studied by casual observation alone. A method has been needed which would, in effect, stop those facial movements while in process to allow the experimenter to study those movements in detail. Motion picture films have been used toward this end to some extent, but this is a somewhat artificial situation, and frame-by-frame analysis of facial movements is a very cumbersome and laborious procedure.

The present study was designed as a preliminary step toward the development of a more objective means of evaluating the facial movements of the speaker. This method, if successful, would allow measurement of movements of specific areas of the face to evaluate inter-speaker differences as a function of age, sex, race, etc. This would also allow objective measurement of these facial movements occurring as a function of the stimulus material, i.e., the code. O'Neill and Oyer24 have pointed out that the analysis of the stimulus materials used in lipreading is a very profitable research area and, in fact, seems to offer the greatest possibility for research.
The study of the code variable also has presented the difficulty of lack of adequate instrumentation to measure what actually happens on the speaker's face while he is speaking or producing various types of stimulus materials. In the present project, one aspect of the code was studied: the phenomenon of so-called homophenous words.

24 O'Neill and Oyer, op. cit., p. 47.

Deland25 states that Alexander G. Bell introduced the words homonym and homophone to educators of the deaf in America in a series of lectures in Worcester, Massachusetts, in 1874. The term homonym denoted words that, when spoken, appear alike to the eye: not, nod, tot, etc.; homophone denoted words that are pronounced in the same manner but spelled differently: rain, reign, rein. Later Bell used the word homophenous, a word that is now applied to both homonyms and homophones.

Bruhn26 defines homophenous words as words that look alike on the lips, and a homophene as a word that has the same appearance with respect to the visible organs of speech as another word. She listed eight homophenous groupings of consonants in the English language. Within each group, words can be distinguished only by the context in which they are used, according to her hypothesis.

Woodward,27 using a linguistic approach to the study of lipreading stimulus materials, suggested that English consonants could be classified into six sets of homophenous clusters. Roback28 investigated the ability of viewers to identify homophenous words correctly.

25 F. Deland, The Story of Lip-Reading (Washington, D. C.: The Volta Bureau, 1931), p. 120.
26 Martha Bruhn, The Muller-Walle Method of Lip Reading (Washington, D. C.: The Volta Bureau, 1949).
27 Woodward, "Linguistic Methodology . . .," loc. cit.
28 Ila Mae Roback, "Homophenous Words" (unpublished Master's thesis, Dept. of Speech, Michigan State University, 1961).
Her results indicated that correct selection of homophenous words as seen on a speaker's lips occurred more often than would be expected from chance alone.

Fisher29 challenged the widely accepted concept of homophenous words and attempted to test the validity of this concept. His results indicated a need for a new definition of homophenous words. He found that English consonants could be grouped into five clusters of homophenous sounds.

Joergenson,30 in a frame-by-frame analysis of motion picture film of speakers producing homophenous words, found that there appeared to be visible differences in mouth opening during the production of homophenous words. She also suggested that the temporal pattern of lip movements during actual production of homophenous words may be of assistance to the lipreader.

It is very likely true that many words look quite similar when spoken. However, this project challenged the idea that a viewer cannot discriminate at least some so-called homophenous words, on the ground that minute differences in temporal or spatial factors in facial movement may exist. Homophenous word lists have long been used in lipreading training and in lipreading tests. Also, as O'Neill and Oyer31 have pointed out, speakers of varying degrees of lipreadability should be included when teaching lipreading and when carrying out research. To do so, there is need for an objective means of measuring the lipreadability of speakers.

29 Cletus G. Fisher, "Confusion Within Six Types of Phonemes in an Oral-Visual System of Communication" (unpublished Ph.D. thesis, Dept. of Speech, The Ohio State University, 1963).
30 Ann Marie Joergenson, "The Measurement of Homophenous Words" (unpublished Master's thesis, Dept. of Speech, Michigan State University, 1961).

It was hoped that this study would demonstrate a means of accurately measuring and quantifying the amount of information transmitted by a speaker in terms of certain facial movements that take place during the process of speaking.
This would then yield a means of evaluating objectively the visual cues transmitted by the speaker and available to the lipreader.

There is a need for an objective measure of any differences that may exist in facial movements of speakers as a function of speaker sex. Such differences, should they exist, would have implications for the instruction of lipreading and for the choice of speakers to be used in the construction of a lipreading test.

There is also a need for objective measurement of differences in facial movements among speakers of the same sex as they may relate to the lipreadability of the speaker, i.e., inter-speaker differences. Such factors would affect the validity of a lipreading test and should be accounted for in the construction of such a test.

Finally, there is a need for an objective investigation of the stimulus material in the lipreading process. This study examined one aspect of the code variable which has to do with stimulus similarity, that of so-called homophenous words. Many authorities in the field have indicated the existence of many English words which cannot be differentiated on the basis of facial movement alone but must be distinguished on the basis of the context in which they are found. More recent research has begun to challenge the complete accuracy of this assumption. Should there be some differences between these so-called homophenous words, however minute, they need to be isolated and identified so that those differences can be labeled and utilized in the instruction of lipreading.

31 O'Neill and Oyer, op. cit., p. 32.

It is believed that an objective measurement of certain facial movements of the speaker, such as that attempted in this study, would make it possible to shed further light on the subject of so-called homophenous words, in terms of differences in the amount of information transmitted by the facial movements of speakers producing homophenous words within various selected homophenous clusters.
Definition of Terms

For the purpose of this study, the terms used are defined in the following manner:

Lipreading--the correct identification of arbitrary symbols in a system common to a speech community, transmitted via the visual components of oral discourse.32

Facial Movement--those movements or changes on the surface of the speaker's face as a result of the interplay of various facial muscles during the act of speaking, as measured by a mercury-rubber strain gauge in conjunction with a plethysmograph and polygraph.

Visual cues--the facial movements occurring on a speaker's face during the act of speaking that provide a clue, to the receiver, to the code being transmitted during the act of speaking.

Homophenous words--those words that, when produced by a speaker, appear to present the same set of visual cues, i.e., that appear to look alike on the speaker's face.

Code--the visual signals that convey information in language units to the receiver and have the potential of making sense to the receiver, i.e., of evoking meaning in the receiver.

32 O'Neill and Oyer, op. cit., p. 2.

Organization of the Report

Chapter I contains the statement of the problem that led to this study. It has included an introduction to the project and a statement of the purpose of the study. It has set forth the hypotheses to be considered, has noted the importance of the study, and has defined the terms to be used throughout.

Chapter II reviews the literature pertaining to the speaker variable in the lipreading process, homophenous words within the context of the code variable, and the measurement of facial movements.

Chapter III presents a description of the subjects and equipment used in the experiment and a discussion of the procedure followed in conducting the experiment.

Chapter IV is concerned with a presentation of the results of the experiment and a discussion of the results.
Chapter V consists of a summary statement and conclusions drawn from the results of the study, together with implications for future research.

CHAPTER II

REVIEW OF THE LITERATURE

O'Neill and Oyer have defined lipreading as: "the correct identification of thoughts transmitted by the visual components of oral discourse."33 The basic assumption of this definition is that facial movements, as they occur during and as a result of the process of speaking, transmit information. The use of the term 'lipreading' is misleading because the process of understanding speech by visual means takes in not only lip movements but also other facial movements as well. Historically, it was apparently believed that only lip movements were important; thus the term lipreading was used. As more became known of this process, other terms have been suggested and employed in attempts to describe more accurately the process of visual communication.

Bruhn defined lipreading as: "the art of understanding a speaker's thought by watching the movements of the lips and other organs of speech."34 She described the course of study as one which involves: "Training the eye to perceive, distinguish and combine the externally visible characteristic movements of the organs of speech."35 Nitchie described the process when he said:

One can watch the mouth of a speaker and see many clearly defined movements of the lips, and even the tongue which the eye must learn to associate with certain sounds to interpret these movements into words and sentences.36

33 O'Neill and Oyer, op. cit., p. 2.
34 Bruhn, Volta Bureau, loc. cit.

The Code

Oyer has outlined several aspects of the code to be considered when studying the process of oral language communication. The two basic aspects are said to be the auditory and the visual, the latter being of primary interest to research on the process of lipreading.
Factors that Oyer lists as common to both areas and thus available to investigation from the visual aspect are: redundancy and contextual influence; stimuli groupings in the sense of isolated words, sounds, phrases, or sentences; speed of presentation; the background noise against which the language units are presented; and the amount of information carried by the units.37

Black and Moore point out that speech has visual components that are important adjuncts to much speech.38 They label most visible action as gesture, including hand and lip movements. It is striking that sound (the auditory stimuli) and sight (the visual stimuli) are performing the same function--they reveal meaning. By watching the lips and other gestures, speech can be understood. Black and Moore emphasize that gesture, the physical accompaniment of word language, is linked to the essence, the substance, the content of talking. They go on to say: "Gestures--manual, facial, or vocal--the visible abstractions of concepts, may be studied and taught--they reduce the uncertainty of the listener by carrying meaning."39 Goldstein40 also supports this notion when he suggests that the pantomime of language is as vital to the speech reader as is the formation and movement of the lip configurations.

35 Ibid.
36 Elizabeth Nitchie, New Lessons in Lip-Reading (New York: J. B. Lippincott Co., 1930).
37 Herbert J. Oyer, "An Experimental Approach to the Study of Lipreading," Proceedings of the International Congress on Education of the Deaf (Washington: U. S. Government Printing Office, 1964), pp. 322-326.
38 J. W. Black and W. Moore, Speech: Code, Meaning and Communication (New York: McGraw-Hill Book Co., 1955).

Wyatt appears to present a somewhat opposing view of the lipreading process by separating lip movements from other visible action and the relative contribution of these different movements.
She states that the lipreader must become conscious of the shape and movements seen on the mouth in relation to their sound, concentrating first on the speaker's lips.41 Later, this author indicates, the lipreader will attempt to take in the general facial expression as a guide to the type of conversation and the general attitude of the speaker. In the total lipreading process, she implies that the literal meaning of words is received from the speaker's lips, while other movements and gestures indicate the mood and tenor of the speaker and his thoughts and meanings.

39 Ibid.
40 Max Goldstein, Problems of the Deaf (New York: The Laryngoscope Press, 1932).
41 Olive Wyatt, Lipreading (London: The English Universities Press, Ltd., 1960).

It would be expected that the visual accompaniments of speech, the facial movements and gestures, would vary from one language system to another. Different languages make use of the articulators in different ways. Lotz reports that Danish articulation is slack while English articulation is vigorous but largely limited to vertical movements; and in French there is a striking alternation between rounding and spreading of the lips.42 This author also points out that because of the absence of labial articulation in the American Iroquois Indian language, it has been said that a speaker can smoke his pipe while speaking without producing any distortion in his speech, that is, without reducing his intelligibility. Such a situation raises the interesting question of whether it would be possible to lipread the Iroquois language at all. Lotz43 also indicates that gesture activity accompanying speech is conventionalized to its respective language and differs from one area to another. These differences would also affect the lipreadability of various languages.

42 John Lotz, "Linguistics: Symbols Make Man," Psycholinguistics: A Book of Readings, ed. Sol Saporta (New York: Holt, Rinehart and Winston, Inc., 1961).
43 Ibid.
The idea that certain words presented the same set of visual cues when spoken as certain other words in the language was being discussed as early as the latter part of the nineteenth century. Alexander Bell used the term homophenous to include both homonyms and homophones--words that look alike on the lips and words that are pronounced alike, respectively.44 In 1903 Emma Snow published a long list of homophenous words in the Association Review.45 Later Nitchie also published such a list.46 These people were teachers of lipreading, and their purpose in publishing such lists was the belief that those who want to learn lipreading must be aware of the possible confusion and misunderstanding resulting from the fact that two or more words look alike on the speaker's lips. These writers and many others believed that the correct word of a homophenous group had to be distinguished on the basis of the context of the conversation in which it appeared.

The same year that Snow published her list of homophenous words, Davidson wrote an editorial in the Association Review expressing a divergent view.47 He stated that homophenous words are not exactly alike but that they are similar and their appearance on the lips is approximately the same. He believed, however, that homophenous words could be distinguished out of context. Thus it would seem that there have been controversy and disagreement regarding the phenomenon of so-called homophenous words since the concept was first introduced. From that time until recent years the majority of published material has supported the concept of homophenous words.

44 Deland, op. cit.
45 Emma Snow, "My List of Homophenous Words," Association Review, V (February, 1903), 29-40, 119-131, 241-253.
46 Edward Nitchie, "Homophenous Word Lists," Association Review, XVIII (July, 1916), 310-312.
47 S. G. Davidson, "Editorial: Homophenous Words," Association Review, V (1903), 92-93.
Stowell, Samuelson, and Lehman in 1928 stated that approximately 50% of the words in the English language have one or more words homophenous to them.48 They suggest that as there are a number of different sounds that are revealed by the same movement, we have many words that differ widely in meaning but look exactly alike as seen on the lips. These authors believed that the only way homophenous words could be distinguished one from another is by the context in which they appear.

Elizabeth Nitchie in 1930 stated that homophenous words were considered a valuable part of the training in lipreading.49 She presented guidelines for building homophenous word lists which stated that the vowel must be the same for each word, as there are no homophenous vowel sounds, and that all sound movements must appear alike on the lips. She presented the following classification of homophenous consonants of English:50

1. p, b, m, mb, mp
2. f, v, ph, gh
3. wh, w
4. s, z, soft c
5. sh, zh, ch, j, and soft g
6. t, d, n, nd, nt
7. k, hard c, hard g, ng, nk, ck

48 Agnes Stowell, Estelle Samuelson and Ann Lehman, Lipreading for the Deafened Child (New York: The MacMillan Co., 1928).
49 Nitchie, New Lessons . . ., op. cit.
50 Ibid.

In another publication, Nitchie stressed that the sounds in certain groups of words have the same visible facial movements, resulting in many homophenous words. Here it was suggested that "upwards of 40% of the sounds used in speech have some other sounds homophenous to them."51 Samuelson52 presented a paper before the Section on Otolaryngology of the Academy of Medicine in 1937 in which she also reported that 40% of the speech elements are homophenous. She pointed out how remarkable it is that people can learn to lipread as a substitute for, or supplement to, hearing, despite such extreme handicaps.

Goldstein, in discussing the problems of the deaf, also indicated that there are many words that appear alike to the eye of the speech reader.53 He, too, believed that these homophenous sounds and words "could not be differentiated by the eye alone unless brought into association with other words of a phrase or sentence."54 Bunger, in presenting the Jena method of speech reading, stated that homophenous words and syllables would always be confused when they are spoken alone and not heard: "they must always be distinguished by the context."55 Wyatt56 also indicated that the right homophenous word is selected by means of its context.

51 Elizabeth Nitchie, Lip-Reading Principles and Practices (New York: Fredrick A. Stokes, Co., 1930), 175-176.
52 Estelle Samuelson, "Fundamentals of Lip-Reading, Including Demonstrations with the Audience as Subjects," Laryngoscope, XXXXVII (April, 1937), 237-238.

Bruhn, who advocated the Muller-Walle method of lipreading, estimated that about 50% of the words in the English language have some word or words homophenous to them.57 She used the word 'homophene' in her course of study to signify a word that has the same appearance with respect to the organs of speech as another word. Here again, she wrote, "they [homophenous words] must be distinguished by the thought or context of the sentence in which they are used."58 Bruhn lists essentially the same homophenous consonant groups as did Nitchie.

53 Goldstein, loc. cit.
54 Ibid.
55 Anna Bunger, Speech Reading, Jena Method (Dansville, Illinois: The Interstate Press, 1932).
56 Wyatt, loc. cit.
57 Bruhn, The Volta Bureau, loc. cit.
58 Ibid.

The concept that a person with normal hearing can detect by ear alone the differences between pairs of consonants, but that these differences are often not discernible by the eye, is supported by Irene Ewing.59 She states that in visible speech certain consonants appear alike. In reporting a series of tests, Ewing found that every subject confused the pairs or groups of consonants that are called homophenous.
In connected speech the average number of mistakes due to confusion of consonants was negligible; but when the pairs or groups of consonants were presented as isolated sounds, the average number of mistakes was 65%. She believed that this indicated one of the main difficulties which must be met and overcome by the lipreader, that of homophenous words which can be distinguished only when they appear in context.

Most tests of lipreading performance employ homophenous words to some extent. Haspiel60 constructed a lipreading test based almost exclusively on homophenous words. Ten questions are asked which incorporate homophenous words into the answer.

59 Irene Ewing, Lipreading and Hearing Aids (Manchester: Manchester University Press, 1962).
60 George Haspiel, A Synthetic Approach to Lipreading (Magnolia, Massachusetts: Expression Co., 1964).

Marie Mason constructed motion picture films for instruction in lipreading. In describing the preparation of these films, she stated that with at least some speakers, "a discrimination can be made between the visual movement for an unvoiced sound and a voiced sound, such as the labial plosives in the words 'pat' and 'bat'."61 In addition, the objectives that Mason stated for certain segments of her film seem to indicate that vowels modify consonant sounds in the immediate phonetic environment and that consonants have an effect on other surrounding consonants. The films are presented together with script content and objectives in O'Neill and Oyer.62 There Mason discusses homophenous words as those having "identical visible characteristics." Yet the objectives given indicate that the purpose of certain segments of film is "to familiarize the student with the varying visible characteristics of homophenous sounds when they are preceded or followed by vowels of widely differing appearance."63 This would imply that the visible appearance of homophenous sounds changes as a function of the surrounding sounds.
This disagrees with Samuelson, who stated in 1937 that there is no difference in visibility between voiced and unvoiced sounds.64

61 Marie Mason, "A Laboratory Method of Measuring Visual Hearing Ability," Volta Review, XXXIV (Oct., 1932), 510-516.
62 O'Neill and Oyer, op. cit., pp. 147-153.
63 Ibid.
64 Samuelson, loc. cit.

A series of several studies of lipreading stimulus materials has been done at the John Tracy Clinic in California. The basic orientation in these studies was an attempt to apply the principles of structural linguistics to the investigation of the linguistic determinants of perception and learning in oral-visual communication or, in other words, to the study of the visual perceptibility of English speech sounds. Sets of stimuli were prepared consisting of syllable pairs made up of consonant-vowel combinations which were filmed while spoken by a female speaker. Normal hearing subjects judged whether the stimulus pairs were the same or different. Woodward65 reported in one of these studies that English consonants could be classified in the following homophenous clusters:

1. p-b-m
2. f-v
3. wh-w-r
4. ch-dz-sh-zh-y
5. t-d-n-l-s-z-th
6. k-g-h

It was reported by Woodward that lipreaders could distinguish between sounds in different clusters but could not distinguish between sounds within a given cluster, and that if lipreaders were to distinguish among the members of a set, it must be on the basis of phonetic, lexical, or grammatical redundancy because the articulatory differences among them are not noticeable in visual observation.

In a later study using the same procedure of alike-different judgment of pairs, Woodward and Barber66 reported that of those dimensions which define articulatory differences in English speech, almost all (including resonance, articulatory type, voice, affrication and prolongation) are virtually neutralized as factors for visual perception except for the labial area of articulation. This study reported that there were only four visually distinctive units in English speech:

1. p-b-m
2. wh-w-r
3. f-v
4. all others

These were categorized as bilabial, rounded labial, labiodental, and nonlabial respectively. Again, it was stated that while the units contrast visually, they are internally homophenous--they look alike to the lipreader. The authors indicate that it may be possible for a lipreader to discriminate between voiced and unvoiced sounds, as in 'pill' and 'bill,' when such words occur in sentences or phrases. They believe, however, that this discrimination is done on the basis of the context and not because of the voiced-unvoiced factor; that is, it does not mean he can see visible articulatory differences among them.

65 Woodward, Linguistic Methodology . . ., loc. cit.
66 Mary F. Woodward and Carroll G. Barber, "Phoneme Perception in Lipreading," Journal of Speech and Hearing Research, III (September, 1960), pp. 212-22.

Lowell, Woodward, and Barber reported still another study in this series that again was based on a linguistic approach to the study of lipreading.67 The stated purpose of this study was to develop a theoretical model of perception in lipreading, that is, a definition of the units of visual perception of oral-aural stimuli and the relationship of the visually perceived symbols to the underlying linguistic system. The linguistic levels of analysis were: phonological, composed of phonetic, phonemic, syllabic, and morphophonemic; grammatical, composed of morphological and syntactic; and lexical, composed of context and metaphorical extension. This experiment used a series of monosyllabic English nouns as stimuli.

67 Edgar Lowell, Mary Woodward and Carroll Barber, Education of the Aurally Handicapped: A Psycholinguistic Analysis of Visual Communication, Coop. Res. Proj. No. 502, Univ. So. Calif., John Tracy Clinic (Los Angeles: 1960).
The results indicated that the 22 initial consonants of English appeared to fall into seven visually contrastive units rather than the four which were derived from earlier data.

Another report by Woodward and Lowell68 distinguishes among articulatory homophenes of a lexical item as those words with which it might be confused. The authors suggest that the number of potential homophenes which are functional may be greatly reduced by the fact that some of these homophenous units will never occur in the same grammatical contexts and therefore will not be confused. As an example, 'fib' is said to be visually equivalent to 'vim.' However, the 'vim' homophene is a noun and does not appear as a verb in English. Consequently, when 'fib' operates as a verb, it cannot be confused with 'vim'; therefore, it is distinctive. The authors again point out that consonants that involve lip movement are more visible than non-labial consonants.

68 Mary Woodward and Edgar Lowell, A Linguistic Approach to the Education of Aurally Handicapped Children, Coop. Res. Proj. No. 907, Univ. So. Calif., John Tracy Clinic (Los Angeles: 1964).

The research done by the John Tracy Clinic might be summarized as an attempt to discover the basic units of lipreading stimulus material through the principles of structural linguistics. Experiments were set up to test the visual discriminability of phoneme contrasts as indicated by perception of differences between the members of pairs of minimally distinctive nonsense syllables or monosyllabic words. Consonants which involve lip movement were found to be more perceptible than non-labial consonants. In terms of absolute visibility of phonation, then, sounds such as /p/ and /b/ are easier to see than /t/; but /p/ and /b/ are said to be indistinguishable from each other on the basis of visual cues alone.
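The four-unit classification attributed above to Woodward and Barber can be written as a small lookup, which makes the notion of "visually contrastive" versus "internally homophenous" concrete. The following Python sketch is illustrative only: the function names, the plain-letter consonant labels, and the representation of a word as a list of its consonants (with vowels assumed identical) are assumptions made for this example and are not taken from the studies reviewed.

```python
# Illustrative sketch only. The three labial units follow the four-unit
# scheme reported in the text; every other consonant is treated as
# "nonlabial". Names and data layout are hypothetical.
VISUAL_UNITS = {
    "bilabial": {"p", "b", "m"},
    "rounded labial": {"wh", "w", "r"},
    "labiodental": {"f", "v"},
}


def visual_unit(consonant):
    """Return the visual unit a consonant label belongs to."""
    for unit, members in VISUAL_UNITS.items():
        if consonant in members:
            return unit
    return "nonlabial"


def visually_contrastive(a, b):
    """Two consonants contrast visually only if they fall in different units."""
    return visual_unit(a) != visual_unit(b)


def homophenous(consonants_a, consonants_b):
    """Compare two words given as consonant lists, assuming the same vowels.

    The words count as homophenous under this scheme when every pair of
    corresponding consonants falls in the same visual unit.
    """
    if len(consonants_a) != len(consonants_b):
        return False
    return all(
        not visually_contrastive(a, b)
        for a, b in zip(consonants_a, consonants_b)
    )


if __name__ == "__main__":
    print(visually_contrastive("p", "b"))        # 'pill' vs. 'bill' initial consonants
    print(visually_contrastive("p", "t"))        # bilabial vs. nonlabial
    print(homophenous(["f", "b"], ["v", "m"]))   # 'fib' vs. 'vim'
```

Under this scheme 'fib' and 'vim' come out as homophenous, which matches the example cited from Woodward and Lowell; the grammatical-context argument they make would have to be layered on top of such a lookup and is not modeled here.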
Another investigation into the validity of the widely held concept of homophenous words was done within a linguistic framework by Fisher.69 He studied three types of segmental phonemes and three types of suprasegmental phonemes for possible misidentification. Six speakers contributed equally to the presentation of 24 lists of nine groups of three words each. Each group of three words consisted of one- and two-syllable words. The phonemes were initial and final consonant, vowel, syllable, stress, and juncture. Eighteen subjects viewed the stimulus material presented at a film speed of 19 frames per second, which is slower than normal. They responded by selecting from a list of words that word which represented the stimulus for the type of phoneme tested. The author coined the term 'viseme' (short for visual phoneme) to mean a mutually exclusive and contrasting class of sounds visually perceived.

69 Fisher, loc. cit.

The following visemes of initial consonants were found: (1) p, b, /m, d/; (2) /k, g/; (3) f, v; (4) w, hw, /r/; and (5) all others. The visemes of final consonants presented as homophenous were: (1) p, b; (2) f, v; (3) /k, g, ŋ, m/; (4) /ʃ, ʒ, dʒ, tʃ/; and (5) all others. Those enclosed in diagonal lines showed significant but not reciprocal confusion. Two visemes of syllable length were found, one being words of two syllables or less, and the second, words of three syllables or more. Visemes of the stress phoneme were: (1) one-syllable words, and two-syllable words with stress on the second syllable; (2) two- or three-syllable words with stress on the first syllable; (3) three-syllable words with stress on the second or third syllable. No visemes for the juncture variable were found.

A study designed to investigate the ability of viewers to identify homophenous words correctly was done by Roback.70 Four speakers were filmed while speaking a list of 75 homophenous words. The subjects were college students with no formal training in lipreading.

70 Roback, loc. cit.
The results of this study revealed that correct selection of homophenous words as seen on a speaker's lips in a silent motion picture film occurs above that which is expected on the basis of chance alone. The author concluded that even though certain words may appear to be highly similar, it may not be accurate to say that homophenous words look exactly the same on the lips. The results of this study arouse curiosity with regard to the subtle differences that are perceived by viewers that allow them to make distinctions among these so-called homophenous words that are at least highly similar in facial movements.

Several studies on perceptive language in hearing defectives have been done in Japan. One of these, reported by Sato,71 was directed to homophenous word groups from which confusion matrices were developed to measure the likelihood of correct identification of similar-appearing words. The author then developed a system of written symbols representing visible lip-teeth movements, each corresponding to one of the homophenous monosyllabic groups. Passages from a primary reader were transcribed into this system and presented to literate deaf adults. These subjects were able to translate most passages slowly but correctly. The author was deeply impressed with human capacity and verbal redundancy.

71 S. Sato, "Some Experimental Studies on Perceptive Language in Hearing Defectives. Part I, Lipreading," I (Tohoku J. Ed. Psyl., 1963), cited in Deafness, Speech and Hearing Abstracts, IV (January, 1964), p. 58.

The problems involved in learning to lipread are more apparent when we realize that spoken language is made up of a rapid succession of overlapping syllables that in turn are composed of some 40-odd sounds of varying visibility. Keaster72 states that only about 30% of the sounds of English speech are visible, whereas all the other sounds are hidden in the mouth or look like one or two other cognate sounds.
Apparently, Keaster is grouping invisible sounds along with homophenous sounds in the 60% that she does not consider 'visible.' Samuelson73 demonstrated lipreading instruction using the audience as subjects as part of a presentation in which she reported that it takes 1/13 of a second to articulate a speech element and that about 50% of speech elements are either obscure or invisible. This figure leaves 50% of the sounds visible, as opposed to Keaster's estimate of 30% of the sounds being visible.

72 Jacqueline Keaster, "An Inquiry into Current Concepts of Visual Speech Reception (Lipreading)," Laryngoscope, LXV (January, 1955), pp. 80-84.
73 Samuelson, loc. cit.

A publication of the American Hearing Society in 1943 suggested that both synthesis and intuition are called into play to solve the problem of one movement representing more than one sound.74 This, too, indicates that it is impossible to distinguish homophenous words on a visual basis alone. "However, in a sentence the factors of time, context, place, topic of conversation, etc., indicate the only word acceptable." Again, recognition of homophenous words is said to be done on the basis of other clues. The same paper presented derived visibility values of English sounds. A chart was developed which rated the visibility of each sound by giving it a value of 1, .75, .5, or 0. A rating of 1 represented high visibility and the other numbers represented consecutive degrees of decreasing visibility.

The relative visual intelligibility of the basic elements of the speech code was studied by Brannon and Kodman.75 They were interested in the variables that contributed to the visual identification of monosyllabic words. Comparisons were made in terms of skilled and unskilled lipreaders. The rank order of intelligibility of sounds relative to phonetic class was found to be labio-dentals, labials, post-dentals, lingua-dentals, velars, and glottals.
Words composed of highly visible elements were identified correctly with greater frequency by both skilled and unskilled groups. The visibility of the total movement form afforded the best cue for visual identification of a word. The visual identification of words was directly related to place of articulation as well. Skilled lipreaders identified only 20% of individual words, and since this list was a representative sample of the speech sounds of conversation, it was inferred that only 20% of the words of conversation can be identified. Therefore, according to the authors, about 80% of the speech information in lipreading must be supplied by contextual, situational, and other cues. One might question such an inference, since the skilled lipreaders would be accustomed to viewing these words as spoken in conversational speech and not as isolated words. These words would undoubtedly present different facial movements when spoken as isolated words than they would when spoken as part of a sentence and thus under the influence of preceding and following movements associated with other sounds.

74 American Hearing Society, New Aids and Materials for Lip Reading (Washington, D. C.: American Hearing Society, 1943).
75 Brannon and Kodman, loc. cit.

Taafe and Wong investigated the ease or difficulty with which material could be lipread.76 The Iowa Film Test of Lip Reading was presented to a group of normal hearing college students. The material was examined in terms of sentence order, sentence length, number of words in a sentence, number of syllables in a sentence, and number of vowels and consonants. They also examined visibility of sounds and parts of speech and their relative influence on lipreading. Little difference in lipreadability was found between sentences of four, five, six, or seven words in length.

76 Taafe and Wong, loc. cit.
An increase in the number of syllables in a sentence, an increase in the number of vowels or consonants, or an increase in the vowel-consonant ratio all contributed to an increased difficulty of the stimulus item. Words composed of three letters were easiest to lipread, and difficulty increased as the number of letters in a word varied to either side of three.

Brannon presented three types of speech materials for visual identification to 65 high school and college students.77 The material was one form of the Utley Sentence Test of Lipreading, 50 phonetically balanced words selected on the basis of six categories of visibility related to the phonetic composition of the words, and ten spondee words selected in the same manner, five of which contained phonetic elements of low visibility and five containing elements of high visibility. The subjects identified about 50% of the words in the Utley Sentence Test, Form A; a mean percentage of about 35% of the PB words; and about 30% of the spondee words. Words containing consonantal elements of greater visibility were more easily identified; however, the addition of one or two visible consonants did not simplify the identification process.

77 John Brannon, "Speech Reading of Various Speech Materials," J. Speech and Hearing Disorders, XXVI (1961), pp. 348-353.

An attempt was made by O'Neill to assess the relative contribution of lipreading in oral communication.78 Thirty-two normal hearing subjects listened to each of three speakers under four noise conditions while viewing the speaker and under four noise conditions while not viewing the speaker. From this, the visibility of consonants, vowels, words and phrases was evaluated. He found that vision contributed 29.5% for vowels, 57% for consonants, 38.6% for words, and 17.4% for phrases.
The visual recognition scores for the vowels and consonants were: /o/ 76%, /e/ 68%, /i/ 7A%, /u/ 6A%, /I/ 58%, fu/ 63%,/£/ 58%, /f/ 8A%, /p/ 80%, /s/ 86%, /t/ 71%, Ar/ 83%, /k/ 77%, and A>/ 75%. Vision contributed most to the recognition of consonants and had decreasing contribution to the recognition of vowels, words, and phrases, respectively. Based on these results, O'Neill states: if words are more visible than phrases, it is suggested that context in the sense of natural order of words is of no great help in the visual recognition of materials by inexperienced lip- readers. In fact, the additional words may have led to less recognition of materials--ineXperienced lipreaders complain of losing their place. 9 78O'Neill, "Contributions. . .," loc° cit. 79lbid. 39 The latter statement raises the question whether this may be because the inexperienced lipreader has not yet learned to view the 'whole'--is overly analytical in trying to perceive the material visually word by word. The contribution of visual cues to Speech intelligibility was also investigated by Sumby and Pollack.80 One purpose of this study was to examine the contribution of visual factors to oral speech intelligibility as a function of signal-to-noise ratio. They found that even though none of the subjects had formal lipreading training, visual perception was an important factor under severe noise conditions and the visual contribution to intelligibility increased as the signal—to-noise ratio decreased. The findings demonstrated rather conclusively that auditory and visual cues combined are superior to auditory cues alone. Another study designed to assess the effect of visual factors on the intelligibility of speech was done by Neely.81 He reported that the addition of visual cues to the auditory cues raised the intelligibility of received Speech by about 20%. Heider and Heider82 in an early study on lipreading stimulus materials developed two tests for investigating 80W. H. Sumby and I. 
Pollack, "Visual Contribution to Speech Intelligibility in Noise," J. Acoust. Soc. Amp, XXVI (195A), pp. 212-215. 81Neely, loc. cit. 82 Heider and Heider, loc. cit. no the comparative visibility of English sounds, one for vowels and one for consonants. The vowel test was made up of 16 syllables, and the consonant test was made up of A0 nonsense syllables, twenty with a diphthong and twenty with a vowel. General lipreading ability was measured by a word- sentence-story test. Eighty—one subjects viewed each of these tests, and the sounds were ranked in terms of the per cent of cases in which a sound was correctly recognized. A high correlation was found between ability to understand vowels on the lips and general lipreading ability. There was a much lower correlation between consonant recognition and general lipreading ability. Recognition of vowels was superior to consonant recognition. Finally, no correlation was found between lipreading of nonsense syllables and general lipreading ability as measured by the word-sentence-story 83 test. In an earlier study, Heider reported that studies with nonsense syllables showed that consonants are less likely to be mistaken than vowels. This discrepancy may possibly be due to the lack of relationship between lip- reading scores on nonsense syllables and general lipreading ability as measured by the test used by.these authors. 8A Numbers reported results similar to those of Heider and Heider in that it was found that pupils who score high 83F. Heider, "Report of Studies of Lip Reading," Annual Report of The Clarke School for the Deaf, LXIX (New York: Clark School for the Deaf, 1936), pp. 23-23. 8LAM. E. Numbers, "An Experiment in Lip Reading," Volta Review, XLI (1939), pp. 261-26A. ' A1 in recognizing single vowels also have a high score in recognizing meaningful material. 
In this experiment a lipreading test was given to an experimental and a control group, each consisting of eight deaf children, after the experimental group had received 20 minutes per day of practice in vowel recognition for six months. Similar tests found no correlation between consonant recognition and general lipreading ability.

In an investigation of lipreading ability among normal hearing students, O'Neill reported that perception of the phoneme had the greatest effect on the identification of consonants and less on the recognition of vowels, words, and phrases.85 Simmons,86 in discussing the factors related to lipreading ability, also stated that the phoneme plays a role in comprehension of speech through lipreading. She draws attention, however, to the fact that, as seen in this review, investigators are not in agreement and some findings appear to be in direct conflict.

The effect of selected aspects of stimulus materials upon lipreading performance was studied by Morris.87 She examined sentence length, sentence position within a group, and the position of a group within a series of groups. A sample of deaf subjects viewed these stimulus materials in face-to-face testing. The results of this study indicated a definite decline in lipreading scores as sentence length increased. In addition, a word was more difficult to lipread when it occurred in a longer sentence than when it occurred in a shorter sentence.

85 John O'Neill, "An Exploratory Investigation of Lip-Reading Ability Among Normal Hearing Students," Speech Monographs, XVIII (1951), pp. 309-311.
86 Audrey Simmons, "Factors Related to Lipreading," J. Speech and Hearing Research, II (December, 1959), 340-352.
87 D. M. Morris, "A Study of Some of the Factors Involved in Lipreading," (unpublished Master's thesis, Smith College, 1944), cited in O'Neill and Oyer, op. cit., 44-45.
The position of a sentence within a group of sentences did not noticeably affect the lipreading score for a sentence, nor did the position of a group of sentences in a series of groups.

Lowell has reported that knowledge of the structure of the English language seems to influence lipreading scores on a filmed test of lipreading.88 He suggests that parts of speech progress from least to most difficult in this order: pronouns, verbs, nouns, prepositions, adjectives, adverbs, and conjunctions. Questions are reportedly easier to lipread than are declarative sentences. One- and two-letter words are about as difficult as four- and five-letter words, with longer words increasing in difficulty as their length increases. He also suggests that the best vowel-consonant ratio for successful lipreading is an equal number of vowels and consonants.

One of the more widely known tests of lipreading ability was constructed by Utley.89 In discussing the rationale for this test, Utley suggests that the skills of word, sentence, and story recognition by lipreading are interrelated. The combined skills do not, however, constitute a single unitary ability and therefore should be tested separately for diagnostic purposes. She found that the ability to lipread sentences is more reliably predicted from ability to lipread stories than from ability to lipread words, and that ability to lipread stories could be more reliably predicted from ability to lipread sentences than from ability to lipread words. Word lipreading ability was more reliably predicted from ability to lipread sentences than from ability to lipread stories. Since sentences were a more reliable predictor of both word and story lipreading ability, one would expect that sentences were the best stimuli to use.

88 Edgar Lowell, "New Insights into Lipreading," Rehabilitation Record, II (July-August, 1961), pp. 3-5.
89 Jean Utley, "A Test of Lipreading Ability," Journal of Speech Disorders, XI (1946), pp. 109-116.
However, in another article, Utley concludes that stories are a better index of performance than are words or sentences.90

Moser, Oyer, O'Neill, and Gardner used a highly objective means of selecting monosyllabic words in terms of item difficulty and frequency of occurrence in the language for use in testing skill in visual recognition of words.91 They report that the use of monosyllabic words, in which the words are mouthed using neither whisper nor vocalization, was a reliable measure and correlated highly with a filmed lipreading presentation in which normal speaking but no sound presentation to the lipreader is used.

In another study designed to investigate the variables that contributed to the visual identification of monosyllabic words, Brannon and Kodman found a small but nonsignificant relationship between visual intelligibility and frequency of occurrence of phonetically balanced words in the English language.92 In addition, the phonetic length of the one-syllable words did not play a significant role in the correct identification of words.

Neilson93 examined the effect of successive repetitions of a word on the visual recognition of the word. Forty-five words were selected from Voelker's one hundred most frequently spoken words and filmed while spoken by three male speakers. Each speaker said the list of words five different times in different randomizations so that each word was said once in list 1, twice in list 2, etc., and each word was said five times in list 5. One hundred fifty subjects unskilled in lipreading viewed the film.

90 Jean Utley, "Factors Involved in the Teaching and Testing of Lipreading Ability Through the Use of Motion Pictures," Volta Bureau, 11L (1946), 657-659.
91 H. Moser, et al., Selection of Items for Testing Skill in Visual Reception of One-Syllable Words, Dept. of Speech, Ohio State Univ. Devel. fund no. 5818 (Columbus: Ohio State University, 1958).
The results indicated that repetition of the stimulus item did not produce significant improvement in visual recognition of the word.

Another factor of interest within the code variable is that of the rate of the speech--the rate of transmission--how rapidly the code units are presented. Byers and Lieberman filmed a young female speaker, showing only head and shoulders, speaking selected portions of the Utley Sentence Test.94 Four groups of experienced lipreaders were divided into good and poor lipreader groups. Each of these groups viewed the film either at a normal rate of 120 words per minute, at two-thirds of that rate, at one-half that rate, or at one-third that rate. No significant differences were found among the four rates either in number of words correctly lipread or in quantity of words produced for either the good or poor lipreaders. Neither was there any interaction between lipreader ability level and rates. The authors conclude that the rate variable is not significant in lipreading performance for either good or poor lipreaders.

These results pertain, of course, only to those rates examined. However, all the speaking rates investigated by these authors would appear to be below normal speaking rates. Black and Moore cite an average rate of college students' speech of 159.06 words per minute with a standard deviation of 23.6.95 This is a good deal more rapid than the rates employed by Byers and Lieberman, whose most rapid rate was 120 words per minute, more than one standard deviation slower than the mean cited by Black and Moore. Thus, the Byers and Lieberman study did not evaluate slow rates compared to normal, or fast rates compared to normal, but only slower rates of transmission compared with one another.

92 Brannon and Kodman, loc. cit.
93 Karen Neilsen, "The Effect of Redundancy on the Visual Recognition of Frequently Employed Spoken Words," (unpublished Ph.D. thesis, Dept. of Speech, Michigan State University, 1966).
94 Byers and Lieberman, loc. cit.
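The comparison between the Byers and Lieberman rates and the Black and Moore norm can be made explicit as standardized distances from the cited mean. The sketch below is only an arithmetic check: the symbols z, x, mu, and sigma are introduced here for illustration and do not appear in the sources, and the three slower rates (80, 60, and 40 words per minute) are computed from the stated fractions of 120 rather than quoted from the study.

```latex
% Standardized distance of each presentation rate from the mean speaking
% rate cited by Black and Moore (mean 159.06 wpm, standard deviation 23.6).
% The notation is an assumption made for this illustration only.
\[
  z = \frac{x - \mu}{\sigma}, \qquad \mu = 159.06, \quad \sigma = 23.6
\]
\[
  z_{120} = \frac{120 - 159.06}{23.6} \approx -1.66, \qquad
  z_{80}  = \frac{80 - 159.06}{23.6}  \approx -3.35,
\]
\[
  z_{60}  = \frac{60 - 159.06}{23.6}  \approx -4.20, \qquad
  z_{40}  = \frac{40 - 159.06}{23.6}  \approx -5.04
\]
```

On these figures even the fastest rate studied lies about 1.66 standard deviations below the cited mean, which is consistent with the observation that only slower-than-normal rates were compared.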
Black, O'Reilly, and Peck,96 however, achieved the same results as Byers and Lieberman, using phrases composed of unrelated words rather than the sentences used in the latter study. Pre-training scores, post-training scores, and speaker differences at normal projection speeds did not differ from those at a projection speed reduced by 15%. Mulligan97 also reported that no significant differences in visual intelligibility were found between projection speeds of 16 and 24 frames per second.

This section of the review of the literature has presented an overview of the research and other publications dealing with the code variable--the stimulus material--in the lipreading process. There would appear to be some disagreement among the various writers as to what aspects of the code are easier to lipread than others; what exactly constitutes homophenous sounds; the influence of context on visual recognition of homophenous words and other words; and the influence of the phonetic environment on the facial movements associated with a given sound.

There is much that remains to be known regarding this aspect of visual communication. Many aspects of the code variable have yet to be explored through a well-controlled experimental approach. Using the list of factors pertinent to the code variable in the lipreading process as outlined by Oyer,98 it is clear that much remains that is unknown about the code in visual communication. Well-controlled research is only beginning to scratch the surface, as it were, on most of these variables. Little is really known about the effects of redundancy and contextual influences, despite many statements made to this effect. Some research has been done, and cited here, on stimuli groupings.

95 Black and Moore, op. cit., p. 41.
96 John Black, P. O'Reilly, and L. Peck, "Self-Administered Training in Lipreading," J. Speech and Hearing Disorders, XXVIII (May, 1963), 183-186.
97 Mulligan, op. cit.
Some research on speed of presentation seems to indicate that this has little effect on visual intelligibility. Almost nothing has been done from the point of view of the amount of information carried by the stimulus units in visual communication. This overview seems to bear out what O'Neill and Oyer have said of the code variable in the lipreading process, that "this area seems to offer the greatest possibility for future research."99

The Speaker

Those persons involved in the study of the process of visual communication and the instruction of lipreading have long been aware that there is a difference in the lipreadability of different speakers, and that different speakers present different sets of visual cues for a given set of stimulus units. As early as 1620 this was illustrated to some extent by Bonet, who believed that a student could learn to lipread his teacher but would be unable to transfer this training to be able to lipread other speakers.100

More recently, this problem has been discussed by many writers. Mason suggests that a possible reason for the lack of objective tests of lipreading ability could be found in the existence of individual differences in the visible manifestations exhibited by various speakers.101

Montague, a deaf person herself, summarizes the lipreader's dilemma regarding speaker characteristics.102 She states that the lipreader does not watch lips alone--he watches the whole face and body of the speaker. "Facial expression, gesture, movement, may all aid or hinder lip reading." She illustrates this when she states that it is more difficult for her to lipread a blind person than a seeing person because she could not see the expression in the speaker's eyes. Montague goes on to indicate that "persons with alive, mobile facial expressions can be understood visually much better than those who have cultivated, or were born with, poker faces."

Oyer has specified several areas of concern in studying the speaker variable in the lipreading process. He states that logically we should look at the movement of the articulators, the amount of movement that takes place during speaking, and the rate at which movements take place.103 He also suggests that an important aspect to consider is collateral body movement--gesture activity--as an aid to the transmission of the message. Some speakers speak slowly with much movement, others speak rapidly with little movement, and we have all possible variations between these. In another publication on research in lipreading, Oyer suggests that facial characteristics of the speaker may be important, such as size and movement of the lips, degree of exposure, and dimensionality--two- versus three-dimension viewing--and the associated shadow and movement effects.104

In summarizing some of the commonly accepted factors regarding the speaker, Lowell suggests that facial expressions affect lipreading in that an unsmiling face is easier to lipread than a smiling face.105 That is, the plain set face is easier to lipread than one with an excessive amount of movement that is irrelevant to the message and actually distracts from the message. That is not to say that a face devoid of expression is easier to lipread.

98 Oyer, "An Experimental. . .," loc. cit.
99 O'Neill and Oyer, op. cit., p. 47.
100 Juan Pablo Bonet, The Method of Teaching Deaf Mutes to Speak, cited in Fred Deland, "Ponce de Leon and Bonet," Volta Review, XXII (1920), 391-421.
101 Mason, "A Cinematographic. . .," loc. cit.
102 Harriet Montague, "Lipreading--A Continuing Necessity," J. Speech and Hearing Disorders, VIII (September, 1943), 257-268.
103 Oyer, "An Experimental. . .," loc. cit.
104 Herbert Oyer, "The Present Status of Lip Reading," Auditory Rehabilitation in Adults, proceedings of a seminar, Cleveland, June 8-12, 1964, Cleveland Hearing and Speech Center-Western Reserve Univ. (Cleveland: The Seminar, 1964), 72-84.
Fusfield bears this out when he states that a speaker with dynamic animation and personality provides an encouraging backdrop for speech-reading. In contrast, he goes on, "the cold, mechanical type of speaker, even though precise, is a handicapping factor for the lipreader."106 Exaggerated mouthing is said to be another feature making lipreading more difficult. Fusfield suggests that "the speaker's bearing, character of lip movement, pronunciation, facial features, sex, fullness of lips, size of mouth, and chin and jaw movements, all affect the lipreadability of the speaker."107 These latter variables are all quite in agreement with the general statement of research topics outlined by Oyer earlier in this section.

Silverman, Lane, and Doehring suggest that the lack of uniformity among speakers in lipreadability is due to such factors as variety in precision of articulation, flexibility of lip movement, and mobility of facial expression.108

As part of a study previously reviewed here, Brannon and Kodman investigated the relationship of visual intelligibility to vertical mouth opening.109 A small but nonsignificant correlation was shown in a comparison of these two factors. The authors concluded that size of vertical mouth opening did not play a significant role in the visual identification of monosyllabic words.

Different speakers vary in the amount of movement that is visible on the face when they speak, in the kind of expression on their face, in the amount of lip movement and mobility, and in other factors. As a result, speakers vary in the ease or difficulty with which they can be lipread. Black, O'Reilly, and Peck,110 in a study previously cited, found as part of their results that not all speakers are uniformly lipreadable.

105 Lowell, "New Insights. . .," loc. cit.
106 Fusfield, loc. cit.
107 Ibid.
The study dealing with selection of monosyllabic words for testing lipreading reported by Moser, Oyer, O'Neill, and Gardner found significant interspeaker differences beyond the .01 level of confidence.111 As part of his research dealing with the contribution of visual symbols to speech comprehension, O'Neill investigated differences among speakers in terms of their ability to convey information auditorily or visually.112 He, too, found wide interspeaker differences. He indicated that the speaker who conveyed the most information by visual means (lipreading) was also the most intelligible under non-visual conditions.

The Roback study also found significant differences in the degree to which different speakers could be lipread.113 A followup to this study was done by Joergenson, who did a frame-by-frame analysis of a silent motion picture film of four speakers each saying 12 groups of four homophenous words each.114 It was hoped that such an analysis of facial movements associated with the production of homophenous words would yield information as to the subtle differences that are available to viewers. It was found that there appeared to be visible differences in mouth opening during the utterance of homophenous words; there were minute differences in mouth widths; there were no significant differences in time required to say homophenous words or in the time that the teeth were visible. However, an interesting factor was that there was a variation in the temporal pattern of lip movement during production of homophenous words.

108 S. Silverman, H. Lane, and D. Doehring, "Deaf Children," Hearing and Deafness, ed. S. Silverman and H. Davis (New York: Holt, Rinehart and Winston, Inc., 1960).
109 Brannon and Kodman, loc. cit.
110 Black, O'Reilly and Peck, loc. cit.
111 Moser, et al., loc. cit.
112 O'Neill, "Contributions. . .," loc. cit.
113 Roback, loc. cit.
114 Joergenson, loc. cit.
In other words, maximum lip movement occurred at earlier or later temporal intervals for different homophenous words.

An extensive study of the effects of facial characteristics upon lipreading was done by Stone.115 He examined facial exposure, facial expression, and lip mobility. Colored motion picture films of a trained actor were viewed by normal hearing subjects. The results indicated that a normal lip movement produced better lipreading performance than did a tight lip movement. Secondly, a plainly set facial expression was easier to lipread than a smiling expression. The degree of facial exposure was significant to lipreading performance only when considered along with the other two variables; however, the author indicated that full torso exposure was usually preferable to limited mouth exposure. Lip mobility had the most pronounced and consistent effect on the success of lipreading of the variables tested. Facial expression was second in importance. These results are in general agreement with those cited earlier by Lowell and Fusfield regarding expression and mobility as important factors in the lipreadability of speakers and are aimed quite directly at certain of the variables outlined by Oyer.

In another project, lateral and frontal photographs and lateral X-rays were taken of five subjects while they produced 12 vowels. Stone casts of the lips were also made using dental impression material. This work was done by Fromkin in an attempt to measure lip positions for the vowels.116 Measurements were made of width of mouth opening, height of mouth opening, area of lip opening, distance between outermost parts of the lips, protrusion of upper and lower lip, and distance between upper and lower front teeth. It was found that lip positions serve to distinguish sets of vowels--front unrounded from back rounded vowels--but play little role in distinguishing vowels within a group.

115 Stone, loc. cit.
Lip protrusion was found to occur principally in the lower lip, with protrusion appearing up to 5 millimeters.

A study designed to assess more fully the temporal factor in visual recognition of phonemes was done by Oyer and Nelson.117 Movies were made of a speaker saying each of the vowels, diphthongs, and consonants in isolation. The frames showing onset to termination of each sound were shown to subjects one at a time. As the subjects viewed each frame, they attempted to recognize the phoneme being said. The results indicated that recognition occurred differentially among the sounds, recognition times for homophenous sounds were similar, and recognition of sounds occurred frequently after only 50% exposure of the sound. A question might be raised here regarding the similarity of recognition times for homophenous sounds in that these were all produced in isolation, and one is led to wonder what effect or changes might have occurred had this been done in syllables so that the sounds would have the natural influence of surrounding sounds. There is some evidence that the phonetic environment has an effect on the visual cues associated with a given sound. The homophenous sounds tested in the above study may have shown dissimilar reaction times under these conditions.

Still another factor of interest within the speaker variable in the study of the lipreading process has to do with the angle from which the receiver (lipreader) views the speaker. In many situations this would be an important factor to consider.

116 Victoria Fromkin, "Lip Positions in American English Vowels," Language and Speech, VII (Oct.-Dec., 1964), 215-225.
117 Herbert J. Oyer and Max Nelson, Assessment of the Temporal Factor in the Visual Recognition of Sounds, Paper presented at Am. Speech and Hearing Ass'n annual meeting, Chicago, November 6, 1961.
Woodward, Barber, and Lowell studied this aspect as part of the continuing research on the lipreading process being done at the John Tracy Clinic and cited previously in this paper.118 They found that a full-face view of the speaker and what they called a profile view (which was actually a forty-five degree angle) were equally good for lipreading purposes.

Somewhat different results were obtained by Neely in a study of the effect of visual factors on speech intelligibility.119 Using multiple choice intelligibility tests, he had each of 35 subjects listen to two lists with the speech masked by 100 dB of white noise. Each listener sat at eleven test positions: at three, six, and nine feet from the speaker at angles of ninety, forty-five, and zero degrees, and facing away from the speaker. The three distances did not result in significant differences in intelligibility scores. A significant difference in intelligibility scores was found relative to the angles at which the observer sat with respect to the speaker. Mean intelligibility scores across all distances were 58.7% at ninety degrees; 61.7% at forty-five degrees; and 64% at zero degrees.

Some research has been done on the use of television in lipreading instruction. Much of this research has involved, necessarily, aspects of the speaker variable. Larr120 discussed the use of closed circuit television for speech-reading training. The relative degree of difficulty imposed by different angles from which the speaker image was viewed was studied from a front view, a forty-five degree angle, and a profile (ninety degree angle) view. The results of this study indicated that the front view and the forty-five degree angle were somewhat easier than the profile view. The forty-five degree angle was slightly superior with a score of 61.5%; the front view yielded a score of 58.9%; and the profile view a score of 43.3% correct recognition.

118 Woodward, Barber and Lowell, loc. cit.
119 Neely, loc. cit.
120 Larr, loc. cit.
One notices discrepancies among these last three studies cited regarding viewing angle. There seems to be a tendency in favor of the forty-five degree angle view of the speaker, with little difference between that angle and a full front view. One can logically expect some advantage from a forty-five degree angle view in that this might allow the lipreader to notice such factors as lip protrusion more readily than would be possible with a front view and yet not totally lose the front view advantage of seeing the entire face for expression, lip rounding, etc., which would be partially lost from a profile view of the speaker.

The Larr study also examined image size as a variable in the use of television for lipreading purposes. The speaker was shown on the screen in four different size images: upper torso, head and neck, head only, lips only. The highest score was registered for the head and neck image, with the upper torso image score nearly identical (66.3% and 66.2%, respectively). The head only image was considerably more difficult, with a score of 47%. With the lips only image, understanding of speech was very difficult, yielding a score of 36% correct. When improvement in lipreading scores over five weekly meetings was used as a criterion, head and neck were again superior with 55%, the lips only image resulted in 43%, upper torso produced 29%, and the head only image yielded 19% improvement in lipreading scores. Apparently the training period improved viewers' ability to understand speech from the lips only image while some other image sizes did not improve so well. It is curious that the head and neck image maintained superiority both in improvement and in initial scores. The lips only image was second in improvement while head only was lowest in improvement.
The improvement in the lips only image is somewhat understandable because of its low starting point relative to the other images, so that any improvement appears to be greater, but this does not explain the lack of improvement for the head image.

In discussing image size to be used in television production for lipreading, Smith indicated that it was important to keep the lips and facial expression clearly visible.121 He indicates that extreme closeups were rejected because the lipreader must never look at the mouth only, but at the entire face while concentrating on the mouth. Very little variation from a head and shoulders shot was recommended because it was believed that the filling of the screen with face from forehead to chin was unnatural and a waist-to-head shot would make the face and lips too small to be perceived clearly on a normal television set. Here we find a logically derived approach which follows quite closely the experimental findings reported earlier.

Another study of relevance to the use of television in lipreading instruction was reported by Oyer,122 in which normal hearing students in a lipreading class served as subjects, meeting five days per week. An attempt was made to determine whether significant improvement in lipreading test scores would be obtained when lipreading lessons were presented by way of closed circuit television. After a ten-week period of such instruction, it was concluded that lipreading can be taught by means of television. The author cautioned, however, that the results in such a two-dimensional setting could not be generalized to a face-to-face three-dimensional situation.

This section has reviewed the majority of the published work pertaining to the speaker as a variable in the study of the lipreading process. It appears to be commonly accepted that there are wide variations in the lipreadability of different speakers.

121 Smith, loc. cit.
The work that has been done seems to indicate that much of this variability in lipreadability of speakers is due to differences in facial expression and mobility and flexibility of the speaker's articulators. This bears out the commonality of certain factors to both the visual and acoustic aspects of oral communication in that the most intelligible speaker by auditory means appears to be the most intelligible speaker through visual means.

Much of the research on this variable has again been of a rather subjective nature, using viewers' judgments as stimulus responses. More recently some work has been done that attempts to achieve greater objectivity through the use of motion picture films of speakers and measuring the actual facial movements of the speaker in a frame-by-frame analysis. A still more objective approach is needed in examining the facial movements of the speaker and the differences between speakers that produce the variations in lipreadability that are known to exist. Much remains to be known about the specific movements or lack of movements, facial characteristics, etc. that produce these differences in speakers.

Facial Movements

The majority of the reported research on the lipreading process has dealt with the lipreader, examining such factors as intelligence, perceptual skills, educational achievement, personal adjustment, and the relationship between these factors and lipreading performance. One such study was performed and reported by O'Neill and Davidson.123 Thirty normal hearing subjects viewed a filmed lipreading test, and the results of that test were examined for a relationship between those results and scores on four other tests. No significant relationship was found between lipreading performance and level of aspiration, intelligence, reading comprehension, or digit memory span.

122 Oyer, "Teaching. . .," loc. cit.
123 O'Neill and Davidson, loc. cit.
However, there was a significant relationship between lipreading performance and non-verbal concept formation. The authors conclude from this that "it may be well to include training in the recognition of simple forms or lip configurations along with training in a regular method of lipreading."124

With regard to the present study, the above statement has special pertinence. Two questions then need to be examined, however. First, are there minute differences in lip configurations among the various so-called homophenous words, and second, is the visual reception system capable of noticing and utilizing such simple forms or minute lip configurations as may be present among the various so-called homophenous words in order to distinguish among those words?

In answer to the second question, a descriptive article written by Jacoby125 suggests that the eye is capable of recognizing visible phenomena as discriminately as the ear can recognize auditory phenomena. She indicates that the eye can apprehend differences that are as minute as those that can be noticed by the ear. However, "the eye must be directed to the significant visible elements to teach fine discrimination."126

Harris also supports this view. He has stated that the ear compares favorably to the eye in ability to detect minute amplitudes and slight amounts of energy. In both organs, which are roughly similar in terms of energy at threshold where they are the most efficient, "sensitivity is almost at theoretical limits."127 This would appear to answer those who have suggested that the eye was not sufficiently sensitive to operate as efficiently as the ear as a receiver in a communication system. There seems to be ample evidence that the visual system is indeed capable of performing as well as does the auditory system as a receptive channel for speech.

124 Ibid.
125 Beatrice Jacoby, "Lipservice to Lipreading," Hearing News, XXVII (September, 1959), p. 18.
What is needed is to specify what the significant elements are in terms of distinctiveness of lip or facial configurations and movements.

In discussing the use of programmed training in lipreading, Brehman cites research to indicate that training in compound stimuli as done in normal lipreading training leads primarily to learning of the more easily discriminated cues--the more obvious movements.128 This suggests the need to isolate the hard-to-discriminate cues--isolated jaw, tongue, lip, and facial movements--for training purposes. The author believed that the identification of such less obvious cues should be followed by verbal labels for those cues so that they can be taught to lipreading students. It is suggested that sub-phoneme stimulus elements need to be identified and labeled as visual cues for purposes of teaching those elements to the lipreader.

With regard to the first question stated pertaining to the existence of such hard-to-discriminate cues among so-called homophenous words, the review of the literature on the subject of homophenous words previously given in this paper has shown that there is disagreement on this point. Most of the earlier work held fast to the idea that homophenous words could not be discriminated on the basis of visual cues alone. The more recent research tends to indicate, at the least, a need for a new classification of what words or sounds are really homophenous, with disagreement as to what the classification should be. Other research has gone a little farther and indicates that viewers are able to discriminate among homophenous words beyond chance expectation.

126 Ibid.
127 J. Donald Harris, Some Relations Between Vision and Audition (Springfield, Ill.: Charles C. Thomas, 1950), 45.
128 George E. Brehman, "Programmed Discrimination Training for Lipreaders," Am. Annals of the Deaf, CX (November, 1965), 553-562.
There would appear to be sufficient evidence to indicate that there are differences in visual cues among the so-called homophenous words that have not yet been discovered.

This problem can also be approached from a theoretical point of view. Each of the various sounds of the English language is produced by the articulatory organs, and the differences between these sounds are accomplished by differences in the relative position or movement of these organs with respect to each other. Logically, it would be expected that a different position or movement of the articulators would not only produce a different auditory cue, but also that the change should be expected to show up in a change in the facial configuration--a different visual cue. If more pressure is required for a voiceless sound than its voiced counterpart, as research has indicated, one would expect this pressure increase to be reflected in additional protrusion of the lips or cheeks, for example. If one sound differs from another sound in the position of the articulators that make these sounds, that difference should, in most cases, be apparent visually on the surface of the face as well as auditorily.

Black has examined the amount of air pressure present during the production of consonant sounds.129 He found that the voiceless continuants had greater amounts of air pressure than the other types of consonants. From this, he tentatively suggests that pressure differences may assist in the visual identification of some consonants. He also found that the consonant was accompanied by diminishing pressure as it receded in a word. Final consonants were spoken with less pressure than initial consonants.

129 John Black, "The Pressure Component in the Production of Consonants," J. Speech and Hearing Disorders, XV (1950), 207-210.
Such pressure differences could well be expected to assist the lipreader in distinguishing between certain consonants by producing less protrusion of the lips, possibly less bulging of the cheeks, when those consonants are spoken that have lesser amounts of air pressure. Such small differences may help distinguish between /s/ and /z/, /ʃ/ and /ʒ/--voiceless versus voiced consonants--and even more so as the sounds appear in the final position in words.

Other research from the area of speech science has pertinence to this discussion as well. A study reported by Isshiki and Ringel examined air flow rate during the production of certain consonants.130 Four male and four female speakers read a 40-item list of 20 CV and 20 VC syllables while air flow rate was recorded. It was found that the rate of air flow was greater for voiceless consonants than for voiced consonants. Different flow rate patterns were found to exist for the various consonant sound groups; specifically, the rate for stops, fricatives, and vowel-like sounds decreased in that order. Finally, there was more variability in flow rate for consonants in the final position than in the initial position.

Such differences in rate of air flow would be expected to be accompanied by physical differences on the speaker's face and/or neck regions which could be visually perceived. In addition, both the Black study and the Isshiki and Ringel research found evidence of differences in air pressure and flow rate when a consonant is in the final position as opposed to the initial position. Such differences could likely provide additional information to the lipreader as to the termination of one word and thus the onset of the next word, assisting in visual recognition of words in context.

130 Nobuhiko Isshiki and R. Ringel, "Air Flow During the Production of Selected Consonants," J. Speech and Hearing Research, VII (September, 1964), 233-244.
Some evidence to the contrary has been reported by O'Neill.131 As part of his larger study on the contribution of visual components of speech to intelligibility, O'Neill reported that the sound pressure levels of the particular vowels and consonants studied did not play an important role in their visual recognition. He did not, however, examine all sounds of the English language and, in fact, omitted some of those which were included by Black and by Isshiki and Ringel and found to produce differences in air pressure and flow rate.

131 O'Neill, "Contributions. . .," loc. cit.

The movements of the lips when bilabial stops and nasals are produced in various phonetic environments were photographed by means of a stroboscopic technique and reported by Fujimura.132 It was found that the effect of the environment of the consonant upon the initial speed of the lip opening is considerable. The movement was particularly rapid when a tense bilabial stop consonant was in the initial position of a word. There was a significant difference in the physical mechanism of the motion of the lips during the production of the nasal bilabial, compared to that of the stops. The opening at the midsagittal measure of the lips at the first five milliseconds was significantly larger for initial /p/ than for initial /b/ or /m/. With respect to area of mouth opening, an abrupt change in speed of opening took place and was very apparent in /p/ and /b/ but not in /m/. Here there is evidence of further information to help the lipreader perceive the difference between /p/ and /b/ and /m/, commonly thought of as homophenous sounds. The physical mechanism of lip movements for the /m/ was different from that of the /p/ and /b/; the mouth opening was larger during the first five milliseconds for /p/ than for the other two sounds; and a change in mouth opening area for /p/ and /b/ but not in /m/ provides another differentiation.
These may be very minute differences, but if they can be detected and identified, it is possible that they can be taught to the student in lipreading in order to help discriminate between these so-called homophenous sounds. Again, the differences may be minute, but so are the auditory differences between many of the sounds that we learn to discriminate normally.

132 Osamu Fujimura, "Bilabial Stop Consonants: A Motion Picture Study and Its Acoustical Implications," J. Speech and Hearing Research, IV (September, 1961), 233-247.

On a more subjective basis, Mason has also indicated her belief that certain sounds are influenced by the sounds that precede or follow them.133 In presenting the films that she constructed, she lists objectives designed to alert the lipreading student to the changes in a given sound that are produced by surrounding sounds.

Wong and Fillmore studied the effect various vowels have on consonant sounds in the immediate phonetic environment.134 They suggest that vowel duration is a primary cue for auditory differentiation of similar word pairs and for such pairs as 'his'-'hiss,' when the final consonant is unvoiced as opposed to voiced. Logically, if duration contributes to auditory recognition of words, that changed duration of the auditory signal should be accompanied by a changed duration of the visual signal as well, again contributing to visual intelligibility.

133 Marie Mason, "Visual Hearing Films," cited in O'Neill and Dyer, op. cit., pp. 147-153.
134 W. Wong and C. J. Fillmore, "Intrinsic Cues and Consonant Perception," J. Speech and Hearing Research, IV (June, 1961), 130-136.

Summers examined oral and nasal sound pressure levels of speakers at certain intensity levels.135 Speaker subjects, 16 male and 14 female speakers, produced eight vowel sounds at four intensity levels. He found that the oral sound pressure levels, across all sounds and intensities, were lower for females than for males.
The converse was true for nasal sound pressure levels. Differences in sound pressure levels were also found among the vowel sounds across both oral and nasal locations. Here again we see differences as a function of speaker sex on an acoustic basis that might be expected to produce some differences on a visual basis as well.

As part of a larger study, Guttman reported that he found differences in sound pressure levels between male and female speakers to be small but that duration was significantly longer for the female group and word rate was significantly slower for the female group than for the male group.136 Such differences in physical parameters could well be expected to influence the visual parameters. Under normal conditions a person who speaks more slowly is also easier to lipread. Here is one more factor of importance to consider in the construction of a lipreading test: this difference in duration and word rate between male and female speakers, which could have an influence on the validity and reliability of a lipreading test.

135 Raymond Summers, "The Nasal Sound Pressure Levels of Vowels Produced at Specified Intensities," (unpublished Ph.D. thesis, Dept. of Speech, Purdue University, 1955).
136 Newman Guttman, "Experimental Studies of the Speech Control System," (unpublished Ph.D. thesis, Dept. of Speech, University of Illinois, 1954).

This section has presented a discussion of facial movements as they occur during the production of speech. The logical expectation of differences in facial movements among the so-called homophenous sounds, based on well controlled research from the area of speech science, has been presented. This has been discussed relative to some of the research on lipreading that supports the notion of the lipreader's ability to differentiate among words composed of these homophenous sounds.
The need to be able to isolate, identify, and label the minute differences in facial movements among so-called homophenous words has been stressed, so that they might more easily be taught to the student of lipreading.

Summary

This chapter has reviewed the literature pertaining to the stimulus material in the lipreading process, and more specifically, to the phenomenon of so-called homophenous words within the code variable as a factor in lipreading research. The effects of redundancy, types of stimulus units, stimulus unit difficulty and similarity, stimulus groupings, rate of presentation of the units, and other related factors have been discussed and pertinent research from the literature presented.

Much of the published research and other material regarding the speaker as a variable in lipreading research has been surveyed and discussed. Many writers in the field have presented the notion of wide interspeaker differences in terms of lipreadability, and it has been quite commonly accepted. The research cited in this review tends to bear out this hypothesis, with some suggestions as to the cause of this variability. The attributes of speakers that contribute to this variability and the effects of viewing angle and distance upon lipreadability have been discussed.

The need for well-controlled experimental research on both code and speaker variables in the lipreading process has been stressed and supported by many writers. The difficulties involved in performing such research have been discussed, and some of the reasons for the lack of such objective approaches to the problem of lipreading have been presented. The need for adequate instrumentation or the adaptation of presently available instrumentation to the problem of the lipreading process has been emphasized.
Finally, the logical expectation of visually perceptible movements accompanying the physical and acoustic differences shown to be present among many of the so-called homophenous sounds has been presented as a possible source of minute differences in facial movements that could be utilized by the lipreader to differentiate between those homophenous sounds on the basis of visual cues alone.

CHAPTER III

SUBJECTS, EQUIPMENT AND PROCEDURES

Subjects.--A group of ten subjects participated in this project, five males and five females. These subjects were selected from the graduate student population of the Department of Speech at Michigan State University. Subjects in the male group ranged in age from 24 to 33 years with a median age of 28 years. The female group ranged from 24 to 34 years of age with a median age of 29 years. It was assumed that the graduate students of this department were sufficiently representative of an educated young adult population to be considered a random sample of such a population.

Equipment.--The following equipment and apparatus were used in this investigation.

Polygraph (Grass, Model 5-D)
Low-level D.C. Preamplifier (Grass, Model 5P1K)
D.C. Driver Amplifier (Grass, Model 5E)
Ink Writing Oscillograph (Grass, Model 5DWC)
Recording Chart Paper (Grass, type G25-A")
Plethysmograph (Parks Electronics Lab., Model 270)
Mercury-rubber Strain Gauges (Parks Electronics Lab., .015" x .040", 14 inch length)
Aerosol Adherent (Becton, Dickinson, Ace Adherent)
Surgical tape, plastic
Polar Planimeter (Ott, type 3A)

Stimulus Material.--Three clusters of homophenous consonants were chosen for study. These included /p/, /b/ and /m/; /t/, /d/ and /n/; /tʃ/, /dʒ/ and /ʃ/. Lists of six homophenous words were constructed for each cluster such that all sounds in each word within a group of a given cluster remained constant across the homophenous consonants of that cluster, with the exception of the change in the consonant.
Thus, the first group of the /p/, /b/, /m/ cluster included "pad, bad, mad," with the medial and final phonemes remaining constant and the only change being in the initial consonant. The list of words for each cluster contained three words in which the homophenous consonant was in the initial position and three words with that consonant in the final position. Thus, each homophenous cluster contained eighteen words--nine initial position shift and nine final position shift. This yielded a total of 54 monosyllabic words. This list of 54 words was randomized to prevent any undue effect of word order in the study. Ten individual randomizations were prepared so that each subject appearing in the project read a separate randomization of the list of stimulus items. The list of stimulus words used in this study is presented in Table 1 in their respective homophenous clusters.

Table 1.--Homophenous Word List.

/p/       /b/       /m/
rope      robe      roam
rip       rib       rim
cup       cub       come
pad       bad       mad
pet       bet       met
pie       buy       my

/t/       /d/       /n/
coat      code      cone
moot      mood      moon
but       bud       bun
tame      dame      name
tick      dick      nick
tot       dot       not

/tʃ/      /dʒ/      /ʃ/
match     madge     mash
leech     liege     leash
march     marge     marsh
chin      gin       shin
cheap     jeep      sheep
chew      jew       shoe

Pilot Study.--It was realized at the outset that much pilot work needed to be done in order to develop a reliable experimental method. With this in mind, several experimental sessions were conducted prior to the actual performing of the investigation.

The results of this pilot work soon indicated that two utterances of each word were insufficient in that one could not be sure that these results were representative of a normal utterance, and that there was little consistency between the two repetitions. The first utterance was often marred by preparatory actions such as the intake of breath, coughing, clearing of the throat, etc. on the part of the subject.
The use of many repetitions proved to induce fatigue and boredom in the subject, resulting in what appeared to be rather artificial and stereotyped responses. Finally, a sequence of five repetitions of each stimulus item was selected and found to produce satisfactory results. There appeared to be a motivating factor for the subjects in knowing that they would be uttering the word just five times, thus being somewhat more cooperative.

The duration of the interval between each utterance of a stimulus item was also examined. It was found that this interval had to be varied to discourage a temporal patterning of subject responses. A minimum interval of two seconds was found to be necessary for best results.

Initially, a relatively quiet electric buzzer and a signal light were attempted as a means to signal the subject to produce the next utterance. However, neither of these proved to be satisfactory. Each tended to cause extraneous movement to occur. A quiet verbal signal was attempted and found to produce the most satisfactory results. This also allowed greater flexibility in terms of varying the interval between repetitions and maintaining the minimum interval between repetitions.

A significant part of this pilot work entailed repeated applications of the strain gauge to the same subject as a check on the reliability of gauge application over repeated trials. The gauge was attached to two different subjects on three separate trials on one day and again on the following day. The subject spoke the same group of words on each trial. The resulting graphic tracings of movements occurring on each trial were found to be very similar in general configuration, amplitude, and on the six measures to be used in the actual investigation (these measures will be described in detail in a later section). This indicated that there was good reliability in terms of application of the strain gauge.
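The stimulus randomization and the varied interval between repetitions described in this chapter can be sketched in a few lines of Python. This is only an illustrative sketch: the study prepared its randomizations by hand, and the function names, the seed, the shortened word list, and the upper bound on the interval (the text gives only the two-second minimum) are assumptions made for the example.

```python
import random

# Hypothetical excerpt of the 54-word stimulus list (one group per cluster).
WORDS = ["pad", "bad", "mad", "tame", "dame", "name", "chin", "gin", "shin"]


def make_randomizations(words, n_subjects, seed=0):
    """Return one independently shuffled copy of the word list per subject."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    orders = []
    for _ in range(n_subjects):
        order = list(words)
        rng.shuffle(order)
        orders.append(order)
    return orders


def utterance_intervals(n_repetitions, minimum=2.0, max_extra=2.0, rng=None):
    """Return varied gaps (seconds) between successive repetitions of a word.

    Each gap is at least `minimum` seconds; `max_extra` is an assumed upper
    bound on the added variation, which the source does not specify.
    """
    if rng is None:
        rng = random.Random(0)
    # n repetitions are separated by n - 1 gaps
    return [minimum + rng.uniform(0.0, max_extra) for _ in range(n_repetitions - 1)]


orders = make_randomizations(WORDS, n_subjects=10)
gaps = utterance_intervals(n_repetitions=5)
```

Each subject receives the same words in a different shuffled order, and the four gaps between the five repetitions never fall below the two-second minimum.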
The results of the pilot work led to the experimental procedures as outlined in the following section and used in the actual study.

Experimental Procedures.--In the performance of this study, each subject appeared individually and was allowed to familiarize himself with the list of words prior to the actual investigation. Any questions regarding pronunciation of the words were clarified at that time. The subject was seated so that the recording apparatus was out of his field of vision. Visual and auditory distractions were kept to a minimum.

The aerosol adherent was gently sprayed onto the area surrounding the lips, on the nose, and on the chin. This material enables a more secure bond with the plastic surgical tape. The mercury-rubber strain gauge was then attached to the subject's face while the facial muscles were at rest and relaxed with his mouth closed.

The gauge was attached to the subject's face at eight points. It was attached to the left corner of the lower lip, followed the vermilion border of the lower lip to the middle of that lip, attached to the face, and on to the right corner of the lower lip and again secured to the skin. It was then secured to the right corner of the upper lip, drawn along the vermilion border to the juncture of the lip with the columella, and to the left corner of the upper lip and attached as before. From here the gauge was brought loosely to the nose, attached there, and drawn to the chin for the final attachment. Each attachment was made with plastic surgical tape approximately .25" x .50" in size. A small metal probe was used to apply pressure to the tape to insure contact of the tape at all points around the gauge and to the skin.

Between each point of attachment the gauge was stretched by ten percent of its unstretched length as recommended by the manufacturer. This was done by measuring the distance from each attachment to the next point of attachment on the skin with a metric measure.
This length was marked on the unstretched gauge. The gauge was then stretched so that this mark was beyond the point of attachment on the skin by a factor of ten percent of the unstretched length. At that point the gauge was attached to the face. See Figure 1 for an illustration of the placement of the strain gauge.

Figure 1.--Diagram of Strain Gauge Placement.

The mercury-rubber strain gauge is a length of highly elastic tubing (silastic) filled with mercury. Contact is made to the ends of the mercury column by means of wires inserted into the ends of the tubing. As the tubing is stretched, the enclosed mercury column is lengthened and narrowed, increasing its electrical resistance. This variable resistance is arranged to form one side of a Wheatstone bridge which is coupled to a circuit that allows amplification of the resistance changes. The resistance increases linearly with length when the length changes are small compared to the unstretched length, and the gauge is designed to be used under tension. The electrical resistance of the gauge is very low--approximately two ohms for the gauges used in this experiment.137 These gauges, along with the plethysmograph, have been used in clinical medicine as a diagnostic tool, in preoperative evaluation, in operative monitoring, and in other such uses, as reported by Gibbons, Strandness, and Bell.138

The leads from each end of the strain gauge were then connected to the 'long' and 'common' poles of the strain gauge input of the plethysmograph. The plethysmograph is often used to record small changes in the volume of digits, limbs, etc. It allows for two methods of detecting volume changes: (1) the impedance method using hypodermic or surface electrodes to detect the electrical impedance (resistance) of the object under study; (2) the circumference method used in the present investigation, which uses the mercury-rubber strain gauge as described above.139

137 Loren Parks, A Versatile Plethysmograph for Research--Model 270 (Beaverton, Oregon: Parks Electronics Lab., 1966).
138 G. Gibbons, D. Strandness, and J. Bell, "Improvements in Design of the Mercury Strain Gauge Plethysmograph," Surgery, Gynecology, and Obstetrics, CXVI (1963), pp. 679-682.
139 Parks, loc. cit.

The D.C. output of the plethysmograph was fed into the low-level D.C. pre-amplifier of the polygraph. This in turn was connected to the driver amplifier which amplified the signal to the oscillograph. The pre-amplifier was adjusted to an input impedance of 20K with a sensitivity setting of 20 millivolts per centimeter. The ink writing oscillograph was adjusted to a paper speed of 25 millimeters per second. The baseline of the tracing was adjusted to the same point for each subject.

After the strain gauge was in position, a period of five minutes was allowed for the subject to adapt to speaking with the gauge in place. The strain gauge is highly elastic and presents little physical resistance to movement. However, there was obviously some degree of unusual sensation present by having this material attached to the face, and a period of adaptation was found to be helpful. The following set of instructions was then read to the subject:

You are about to participate in a study to measure objectively differences in certain facial movements during the production of monosyllabic words. You have already had an opportunity to familiarize yourself with those words. You are to say each word in the list as a separate and individual word. You are to speak in a normal, relaxed conversational tone of voice. Begin each word from a closed and resting mouth position and return to that position at the end of the word. You are to say each word upon a verbal signal from the experimenter. There will be a short interval between each utterance. Do not attempt to anticipate the signal. You will repeat each word five times.
The experimenter will help you keep your place on the list by stating the number of the word as you begin a new word. Once again, remember to begin each word from a closed, resting mouth position and return to that position after each word. Are there any questions?

The subject then produced the words on the list of stimulus items. Each word was repeated five times, each utterance produced upon a verbal signal from the experimenter. A minimum of two seconds was maintained between each repetition, with this interval varied in order to prevent a patterned response from occurring. Timing was accomplished by means of a timed stylus on the oscillograph which marked one-second intervals. In addition, the signal for an utterance was not given until the subject had returned to a closed and resting mouth position as indicated by the writing stylus of the oscillograph returning to a stable position on the baseline.

Measurements.--Each time the subject uttered one of the stimulus words, the stylus of the oscillograph was deflected from the baseline of the recording paper. This left a tracing that reflected the relative intensity of the facial movement occurring on the area of the face covered by the strain gauge over the time it took to occur and also indicated changes in movement pattern as they took place. Changes in the movement pattern were indicated on the tracing by changes in the direction of the traced curve from positive to negative, from negative to positive, or by a period of zero change of direction on the tracing. These changes were termed inflection points and constituted points of orientation for certain of the measurements. The beginning of a tracing, or curve, was defined as the point at which the curve separated from the baseline, and the termination of the word as the point where the curve again joined the baseline. Six individual measures were made on these tracings.
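The definitions of curve onset, curve termination, and inflection points given above can be illustrated on a digitized tracing. The sketch below is hypothetical: the original tracings were ink records measured by hand, so the sampled list of amplitudes, the tolerance, and the rule that a flat stretch counts as a single inflection point are assumptions made for illustration.

```python
def find_onset_and_termination(trace, baseline=0.0, tol=1e-9):
    """Return (onset, termination) sample indices, or None if no deflection.

    Onset is the last baseline sample before the curve leaves the baseline;
    termination is the first sample at which it has rejoined the baseline.
    """
    off = [i for i, y in enumerate(trace) if abs(y - baseline) > tol]
    if not off:
        return None
    onset = max(off[0] - 1, 0)
    termination = min(off[-1] + 1, len(trace) - 1)
    return onset, termination


def inflection_points(trace, baseline=0.0, tol=1e-9):
    """Return sample indices where the direction of the curve changes.

    A change from rising to falling, from falling to rising, or the start of
    a flat stretch (zero change) is counted as one inflection point.
    """
    bounds = find_onset_and_termination(trace, baseline, tol)
    if bounds is None:
        return []
    onset, termination = bounds

    def sign(delta):
        if abs(delta) <= tol:
            return 0
        return 1 if delta > 0 else -1

    # Direction of each segment between consecutive samples of the curve.
    signs = [sign(trace[i + 1] - trace[i]) for i in range(onset, termination)]
    points = []
    for k in range(1, len(signs)):
        prev, cur = signs[k - 1], signs[k]
        if cur == prev:
            continue
        # Count reversals and the start of a plateau; leaving a plateau is
        # not counted again, so each flat stretch contributes one point.
        if cur == 0 or prev != 0:
            points.append(onset + k)
    return points


example = [0, 0, 1, 3, 2, 2, 4, 1, 0, 0]
bounds = find_onset_and_termination(example)   # (1, 8)
points = inflection_points(example)            # [3, 4, 6]
```

In the example tracing the curve leaves the baseline after sample 1, peaks at sample 3, flattens at sample 4, peaks again at sample 6, and rejoins the baseline at sample 8.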
For the purpose of obtaining representative measures of the production of each word, three of the five utterances of each word by each subject were selected for measurement on each of the six measures. Those utterances were selected for measurement that presented the greatest degree of similarity to each other, first in general configuration of the tracings, and secondly in amplitude of the tracings. A mean score value for each of the six measures of these three utterances was then obtained for each word. This value was taken as a representative score for each of the measures on each word for every subject.

Each of these six measures is an attempt to characterize the curve obtained for each word. One of these measures is simply a count of the number of inflection points (IP), as defined earlier, appearing on the resultant curve for a given utterance of a word. This measure gives an estimate of the number of changes in movement pattern occurring.

The second measure, termed the temporal summation to inflection points (TSI), is a summation of the time elapsed from the initiation of the word to each of the inflection points occurring on that curve. This measure was obtained by a summation of the distance from the onset of the tracing, along the abscissa of the curve, to a point on the abscissa directly perpendicular to each of the inflection points of the curve. This measure gives an estimate of the time elapsed to each change in movement pattern.

The third measure, called the summation of the amplitudes at inflection points (SAI), is a summation of the amplitude of the curve at each of the inflection points. This measure was obtained by a summation of the extent of the deflection of the curve from the baseline at each inflection point.

The fourth measure attempts to integrate the second and third measures into one estimate of the intensity of the movement as a function of the time elapsed across that movement, combining elapsed time and intensity.
This measure was termed an integrated amplitude-duration measure (IAD) and was obtained by a summation of the products of the time elapsed to each inflection point and the respective amplitude at those points.

The fifth measure is the total duration (D) of the curve, giving an estimate of the time taken to utter the word.

The sixth and final measure determines the surface area enclosed by the curve and the corresponding baseline. This measure gives an estimate of the intensity of movement over the total time elapsed to utter the word. This measure was obtained with a polar planimeter. The planimeter has the capacity to measure the area which is bounded by closed lines when these lines are entirely circumtraced with the tracer point of the instrument. The measuring unit reading gives the size of the area expressed in vernier units which can be converted to any standard unit of measure.140 As used for the present experimental measure, the pole arm was maintained at a constant setting of 341 and the tracer arm at a setting of 12. At these settings, a reading of 17 vernier units is equal to one square centimeter. The planimetric measures were obtained by setting the tracer point of the planimeter at the beginning point of the curve as defined earlier, tracing the curve in a clockwise direction to the termination of the curve, and back along the baseline to the starting point.

140 Ott Planimeter, Instruction Manual (Kosel, Kempten, Germany), Dist. by Frederick Post Co., Chicago.

The original stimulus word list contained three items with the homophenous consonant in the initial position of the word and three items with the homophenous consonant in the final position of the word. This pattern was maintained across the three subgroups of each homophenous cluster to give a more adequate representation of each consonant to reduce the possibility of spurious results because of the possible influence of a specific vowel on the consonant.
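The six measures described above reduce to simple sums once the inflection points of a tracing have been located. The following sketch assumes each inflection point is recorded as a pair of chart measurements (distance from curve onset along the abscissa, deflection from the baseline), both in millimeters; that data layout, the function names, and the example values are hypothetical. The paper speed of 25 millimeters per second and the conversion of 17 vernier units per square centimeter are taken from the text.

```python
PAPER_SPEED_MM_PER_S = 25.0     # oscillograph paper speed stated in the text
VERNIER_UNITS_PER_CM2 = 17.0    # planimeter conversion stated in the text


def curve_measures(inflections, duration_mm, planimeter_reading):
    """Compute the six measures for one utterance.

    inflections        -- list of (distance_mm, amplitude_mm) pairs, one per
                          inflection point (assumed layout)
    duration_mm        -- chart distance from curve onset to termination
    planimeter_reading -- enclosed area in vernier units
    """
    return {
        "IP": len(inflections),                       # number of inflection points
        "TSI": sum(t for t, _ in inflections),        # summed distance (time) to each point
        "SAI": sum(a for _, a in inflections),        # summed amplitude at each point
        "IAD": sum(t * a for t, a in inflections),    # summed time-amplitude products
        "D": duration_mm,                             # total duration of the curve
        "A": planimeter_reading / VERNIER_UNITS_PER_CM2,  # area in square centimeters
    }


def mm_to_seconds(distance_mm):
    """Convert a chart distance along the abscissa to elapsed time."""
    return distance_mm / PAPER_SPEED_MM_PER_S


scores = curve_measures(
    inflections=[(5.0, 4.0), (10.0, 2.0), (15.0, 6.0)],  # made-up values
    duration_mm=20.0,
    planimeter_reading=34.0,
)
```

The sketch leaves TSI, IAD, and D in chart units; the source does not state the units in which the scores were tabulated, so `mm_to_seconds` is offered only as one possible conversion.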
For the statistical analysis of the data, a mean score was tabulated across the three words appearing for each consonant in each of the two positions, initial and final, for each of the six measures employed. These mean scores for each consonant in the initial position and for each consonant in the final position served as the score value for a given subject in the statistical analysis of the resultant data.

CHAPTER IV

RESULTS AND DISCUSSION

The section on the six measurements in Chapter III discussed the procedures used for making six individual measurements of the resultant tracings obtained from each speaker's production of the list of stimulus words. The score value for each subject on each measure was a mean of three utterances of each of the words that contained the consonant under consideration in either initial or final position. The six measures described in Chapter III were as follows: number of inflection points (IP), temporal summation to inflection points (TSI), summation of amplitude at inflection points (SAI), integrated amplitude-duration (IAD), duration (D), and area (A). For each speaker, then, there were six score values for each of three consonant clusters obtained through six individual measures, for a total of 108 score values for each speaker. These score values are presented in a table of raw data in Appendix A.

The data were subjected to a factorial design 2 x 3 x 2 repeated measures analysis of variance (Case I) as outlined by Winer.141 This design was utilized a total of 18 times, once for each of the six measures on each of the three consonant clusters. The results of the analysis of the homophenous cluster /p, b, m/ on the TSI measure are presented in Table 2.

141 B. J. Winer, Statistical Principles in Experimental Design (New York: McGraw-Hill, 1962), p. 319.
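The bookkeeping behind the 108 score values per speaker and the 18 analyses can be checked with a short enumeration. The sketch below is illustrative only; the ASCII labels for the third cluster ("ch", "j", "sh") and the helper `subject_score` are assumptions introduced here, not names used in the study.

```python
from itertools import product
from statistics import mean

MEASURES = ["IP", "TSI", "SAI", "IAD", "D", "A"]
CLUSTERS = {
    "p-b-m": ["p", "b", "m"],
    "t-d-n": ["t", "d", "n"],
    "ch-j-sh": ["ch", "j", "sh"],   # ASCII stand-ins for the third cluster
}
POSITIONS = ["initial", "final"]

# One score per measure x consonant x position for each speaker.
cells = [
    (measure, cluster, consonant, position)
    for measure, (cluster, consonants) in product(MEASURES, CLUSTERS.items())
    for consonant, position in product(consonants, POSITIONS)
]

# One 2 x 3 x 2 analysis of variance per measure x cluster.
analyses = list(product(MEASURES, CLUSTERS))


def subject_score(word_utterances):
    """Mean over words of the mean over the selected utterances of each word.

    word_utterances -- one list per word (three words per consonant and
                       position), each holding the three selected utterances.
    """
    return mean(mean(utterances) for utterances in word_utterances)


n_scores = len(cells)        # 108
n_analyses = len(analyses)   # 18
```

Six measures times three clusters times three consonants times two positions gives the 108 scores, and six measures times three clusters gives the 18 analyses of variance.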
In presenting the results of this analysis and all further analyses and discussion, the following symbols will be used throughout:

FE--female speaker
M--male speaker
I--initial position
F--final position

In order to determine where the differences lie among treatment means following a significant overall F, the Newman-Keuls procedure outlined by Winer142 was followed. The results of these individual comparisons are summarized schematically using the above symbols in conjunction with the several respective consonant symbols to represent the treatment means. Treatments underlined by a common line do not differ from each other significantly; treatments not underlined by a common line do differ from each other. A significance level of .05 was used throughout in reporting the individual comparisons. An example of this method of reporting is as follows:

1 2 3 4 5

Here, treatment five differs from treatments one and two but not from three and four. All treatment means are ordered left to right from lowest to highest.

142 Winer, ibid., pp. 80, 390.

Table 2.--Summary of analysis of variance comparing differences in temporal summation to inflection points among consonants /p, b, m/ as a function of speaker sex and of word position.

Source of Variation         SS          df    MS          F
Sex (A)                     110.97      1     110.97      0.21 ns
S's Within Groups           4141.66     8     517.71
Consonant (B)               488.23      2     244.11      10.27 *
A x B                       227.29      2     113.65      4.78 #
B x S's Within Grps.        380.44      16    23.78
Position (C)                10001.64    1     10001.64    144.63 *
A x C                       67.71       1     67.71       0.98 ns
C x S's Within Grps.        553.21      8     69.15
B x C                       739.89      2     369.94      12.34 *
A x B x C                   186.87      2     93.43       3.11 ns
BC x S's Within Grps.       479.56      16    29.97
Total                       17377.48    59

ns--non-significant.
*--significant beyond .01 level.
#--significant beyond .05 level.

The analysis of variance summary table presented in Table 2 shows the analysis of the /p, b, m/ consonant cluster on the temporal summation to inflection points measure.
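The entries of Table 2 can be re-derived from the printed sums of squares and degrees of freedom: each mean square is SS/df, and each F ratio is the effect mean square divided by the mean square of its error term. The sketch below is a check under stated assumptions rather than the original computation: the pairing of each effect with the error line that follows it in the table is inferred from the layout, and the critical values are standard tabled F values for the .05 and .01 levels, not figures taken from the source.

```python
# (sum of squares, degrees of freedom, error term) as printed in Table 2.
EFFECTS = {
    "Sex (A)":       (110.97, 1, "Ss within groups"),
    "Consonant (B)": (488.23, 2, "B x Ss within groups"),
    "A x B":         (227.29, 2, "B x Ss within groups"),
    "Position (C)":  (10001.64, 1, "C x Ss within groups"),
    "A x C":         (67.71, 1, "C x Ss within groups"),
    "B x C":         (739.89, 2, "BC x Ss within groups"),
    "A x B x C":     (186.87, 2, "BC x Ss within groups"),
}
ERRORS = {
    "Ss within groups":      (4141.66, 8),
    "B x Ss within groups":  (380.44, 16),
    "C x Ss within groups":  (553.21, 8),
    "BC x Ss within groups": (479.56, 16),
}
# Standard tabled critical values of F at (.05, .01); assumed, not from the source.
CRITICAL_F = {(1, 8): (5.32, 11.26), (2, 16): (3.63, 6.23)}


def f_ratios(effects, errors):
    """Return {effect: (F, flag)} with flag '*', '#', or 'ns' as in Table 2."""
    results = {}
    for name, (ss, df, error_name) in effects.items():
        error_ss, error_df = errors[error_name]
        f = (ss / df) / (error_ss / error_df)   # MS effect / MS error
        crit_05, crit_01 = CRITICAL_F[(df, error_df)]
        if f > crit_01:
            flag = "*"
        elif f > crit_05:
            flag = "#"
        else:
            flag = "ns"
        results[name] = (f, flag)
    return results


table = f_ratios(EFFECTS, ERRORS)
for name, (f, flag) in table.items():
    print(f"{name:15s} F = {f:7.2f} {flag}")
```

Recomputed from unrounded mean squares, the ratios agree with the printed F values to within rounding, and the significance flags match those in the table.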
This measure gives an estimate of the time elapsed to each inflection point or change in the pattern of facial movement during the production of a word. The results of this analysis of the /p, b, m/ cluster indicate that the main effects showing statistical significance are the homophenous consonants and word position. The factor of word position is not meaningful to this study in and of itself because this effect compares words that have the consonants in the initial position to those having the consonants in the final position. Since the words have no other relation to each other (are not homophenous), no inferences can be drawn from this factor, other than that there are differences among different words in amount or rate of facial movement and that that difference can be measured. Of much more meaning is the consonant-by-position interaction effect, which the presence of the position factor allows to be studied and which thus yields a test for determining whether differences occur among the consonants as a function of initial or final position within a word.

The results of the individual comparisons of the consonant effect, performed as described earlier in this section, were as follows:

[schematic illegible in source]

It can be seen that significant differences exist between the /p/ and the /m/, and between /b/ and /m/ across speaker sex and position of the sound in a word. As determined by the TSI measure, the time elapsed to the changes in pattern of facial movement appears to be greater for the /b/ and /p/ than for the /m/, with non-significant differences between the first two sounds.

Individual comparisons of the sex-by-consonant interaction effect produced the following results:

Mm FEm FEb FEp Mp Mb

Here again, differences are significant between the /p/ and /m/, and /b/ and /m/ for the male speakers and /p/ and /m/ for female speakers.
However, for the female speakers /p/ is also different from the /b/, which in this case is not different from the /m/, as determined by the TSI measure across word position. In addition, male production of the /p/ and /b/ results in higher mean scores on this measure than does female production of all three consonants.

The consonant-by-position interaction effect yielded the following results from individual comparison of the treatment means contributing to that effect:

[schematic illegible in source]

Here it can be seen that significant differences occur between /p/ and /m/, and /b/ and /m/ when those sounds occur in the final position of the word. It would appear that the overall significant F was largely due to the effect of the sounds when in the final position, as determined by this measure.

The results of the analysis of the /p, b, m/ cluster on the SAI measure are presented in Table 3.

Table 3.--Summary of analysis of variance performed to determine whether summation of amplitudes at inflection points differed among the consonants /p, b, m/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                     1709.23    1     1709.23     4.37 ns
S's Within Groups           3126.30    8      390.79
Consonant (B)                188.78    2       94.39     7.68 *
A x B                         35.51    2       17.75     1.44 ns
B x S's Within Grps.         196.62   16       12.29
Position (C)                 129.89    1      129.89     2.81 ns
A x C                        249.53    1      249.53     5.40 #
C x S's Within Grps.         369.89    8       46.24
B x C                         96.91    2       48.46     4.25 #
A x B x C                    124.37    2       62.19     5.95 #
BC x S's Within Grps.        182.30   16       11.39
Total                       6409.33   59

ns--non-significant.
*--significant beyond .01 level.
#--significant beyond .05 level.

It can be seen that significant differences were obtained among the three homophenous consonants /p, b, m/. Significance was also obtained for the sex-by-position, consonant-by-position, and sex-by-consonant-by-position interaction effects. Individual comparisons of the treatment means of the consonant effect yielded the following results:

m b p

Recalling that those treatments underlined by a common line do not differ significantly from each other, it can be seen that again /p/ differs from /m/, as found in the previous test. Here, however, /b/ and /m/ do not differ on the SAI measure, a summation of intensity of movement at each of the changes in pattern of facial movement on each word, across speaker sex and word position. The trend of the treatments is the same as before, with /p/ > /b/ > /m/.

Comparisons of the sex-by-position treatment means were:

FEF FEI MI MF

When the consonants occur in the final position of the word, male speakers show significantly more intensity of movement as measured by the SAI measure than do female speakers across the three sounds.

The consonant-by-position interaction effect treatment means, when subjected to individual comparison, yielded the following results:

Fm Im Ip Ib Fb Fp

The /p/ in the final position is significantly greater in intensity of facial movements as measured by the SAI measure than the /b/ or /m/ in the final position. Likewise, the /b/ is significantly greater than the /m/ in that position, both differences across speaker sex.

Individual comparisons of the sex-by-consonant-by-position interaction were as follows:

FEFb FEFm FEIm FEIp FEIb FEFp MIp MIb MFm MIm MFb MFp

In male speakers, it can be seen that the final /p/ and /b/ show significantly greater intensity of movement as measured by the SAI measure. With the female speakers, the final /p/ was significantly different from both the final /m/ and /b/. When the three sounds occurred in the final position of a word, male speakers showed significantly more intensity of facial movement at the changes of pattern of movement than did female speakers, and likewise when the sounds were in the initial position.
This finding is interesting since the sex factor shown in Table 3 did not show statistical significance on the overall F test, but the differences do show up in important segments of the individual comparisons. It should be noted that the approximate significance probability of the obtained F for the sex factor was .07, explaining part of this discrepancy. Finally, it is noted once again that differences were not found for these sounds in the initial position. The final position of the consonants appears to contribute most to the overall significant F test.

The results of the analysis of the integrated amplitude-duration measure on the /p, b, m/ consonant cluster are presented in Table 4.

Table 4.--Summary of analysis of variance performed to test differences in the integrated amplitude-duration measure among the consonants /p, b, m/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                   334642.57    1   334642.57     5.18 ns
S's Within Groups         516659.64    8    64582.45
Consonant (B)              43551.87    2    21775.94     8.73 *
A x B                       5442.98    2     2721.49     1.09 ns
B x S's Within Grps.       39901.61   16     2493.85
Position (C)              141298.98    1   141298.98    11.61 *
A x C                      52972.34    1    52972.34     4.35 ns
C x S's Within Grps.       97384.35    8    12173.04
B x C                      33007.88    2    16503.94     7.69 *
A x B x C                  19804.98    2     9902.49     4.62 #
BC x S's Within Grps.      34316.98   16     2144.81
Total                    1318984.18   59

ns--non-significant.
*--significant beyond .01 level.
#--significant beyond .05 level.

Here again, significant differences were obtained among the three consonant sounds and between the two positions. The sex-by-position, position-by-consonant, and sex-by-consonant-by-position interaction effects were also statistically significant. Individual comparisons of the consonant means were as follows:

m b p

Consistent with past results, the /p/ is significantly higher on the IAD measure than is the /m/. Comparisons of the means associated with the consonant-by-position effect were:

Ib Im Ip Fm Fb Fp
Once again, the final position of the consonants seems to be contributing most to the overall F for consonants, with differences appearing between the /p/ and /m/, and /b/ and /m/ in the final position only. Individual comparisons of the sex-by-consonant-by-position interaction effect yielded the following results:

FEIm FEIb FEFm FEIp FEFb MIb FEFp MIm MIp MFm MFp MFb

For the male speakers, when the consonants are in the final position of the word, the /p/ and /b/ are again significantly greater than the /m/ as determined by this measure of intensity of facial movement over time elapsed to inflection points. For female speakers, the /p/ in the final position shows greater facial movement on this measure than the /b/ or /m/ in that position. When the consonants are in the final position, male speakers showed more facial movement than did female speakers on this measure, consistent with past results. Finally, when the sounds were in the initial position, males again showed significantly more facial movement as measured by the IAD measure than the female speakers. Here again, the overall F on the sex factor was not statistically significant. However, the approximate significance probability of the F as provided by the CDC 3600 computer was .05, despite the fact that table values of the F statistic show this factor to be non-significant. This would explain the differences obtained for the individual comparisons.

The results of the analysis of the duration (D) measure on the /p, b, m/ cluster are presented in Table 5. The sex factor is seen to be non-significant, with an approximate significance probability of the F statistic of .08. The position effect is significant but is not meaningful for interpretation, as explained in an earlier section. No other effect showed statistical significance on the duration measure of the /p, b, m/ cluster.
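The relation between an obtained F and its approximate significance probability can be illustrated with a short Python sketch. It is not the routine used by the CDC 3600 program, which is not described here; the function name `f_upper_tail` and the numerical method (Simpson's rule applied to the incomplete beta integral) are assumptions chosen so that the example needs only the standard library. The sex factor is tested on 1 and 8 degrees of freedom.

```python
import math


def f_upper_tail(f_value, df_num, df_den, steps=2000):
    """Approximate P(F >= f_value) for an F variate on (df_num, df_den) df.

    Uses the identity P = I_x(df_den / 2, df_num / 2) with
    x = df_den / (df_den + df_num * f_value), where I_x is the regularized
    incomplete beta function, evaluated here by Simpson's rule.
    Sketch only: assumes df_den >= 2, an even number of steps, and an
    f_value that is not close to zero.
    """
    a = df_den / 2.0
    b = df_num / 2.0
    x = df_den / (df_den + df_num * f_value)

    def integrand(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3.0

    complete_beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return integral / complete_beta


# Sex factor, /p, b, m/ cluster, tested on 1 and 8 degrees of freedom.
for label, f_obtained in [("SAI", 4.37), ("IAD", 5.18)]:
    print(label, round(f_upper_tail(f_obtained, 1, 8), 3))
```

Under these assumptions the IAD value of F = 5.18 gives a probability just above .05, and the SAI value of F = 4.37 gives a probability of about .07, in line with the figures quoted in the text. An F that falls slightly short of the tabled .05 critical value can thus still have a significance probability that rounds to .05.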
Table 5.--Summary of analysis of variance comparing differences in duration among the consonants /p, b, m/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                      282.36    1      282.36     4.24 ns
S's Within Groups            532.17    8       66.52
Consonant (B)                  9.09    2        4.54     0.88 ns
A x B                          9.67    2        4.83     0.94 ns
B x S's Within Grps.          82.35   16        5.15
Position (C)                  42.84    1       42.84    10.91 #
A x C                         15.02    1       15.02     3.83 ns
C x S's Within Grps.          31.41    8        3.93
B x C                         20.11    2       10.05     1.80 ns
A x B x C                     10.80    2        5.40     0.97 ns
BC x S's Within Grps.         89.31   16        5.58
Total                       1125.13   59

ns--non-significant.
#--significant beyond .05 level.

Table 6 shows the results of the analysis of the /p, b, m/ consonant cluster on the area (A) measure.

Table 6.--Summary of analysis of variance performed to determine whether area under the curve differed among the consonants /p, b, m/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                     3964.19    1     3964.19    12.81 *
S's Within Groups           2475.43    8      309.43
Consonant (B)                149.41    2       74.70     5.11 #
A x B                          8.74    2        4.37     0.30 ns
B x S's Within Grps.         233.89   16       14.62
Position (C)                1525.91    1     1525.91    19.03 *
A x C                          0.37    1        0.37     0.00 ns
C x S's Within Grps.         641.33    8       80.17
B x C                        105.88    2       52.94     5.64 *
A x B x C                     15.03    2        7.51     0.80 ns
BC x S's Within Grps.        150.27   16        9.39
Total                       9270.45   59

ns--non-significant.
*--significant beyond .01 level.
#--significant beyond .05 level.

It can be seen that statistical significance was obtained for the sex variable, with the respective means being 36.96 for the males and 20.70 for the females. Thus the male speakers appear to utilize more facial movement over the total duration of a word during the production of homophenous words than do female speakers. Statistical significance was also obtained among the three consonants /p, b, m/, between positions, and for the consonant-by-position interaction effect. The results of individual comparisons of the three consonant means were as follows:

m b p
Here the order of the consonants is the same as in previous results, but no significant differences were obtained. The consonant-by-position interaction effect produced the following results from individual comparisons:

[schematic illegible in source]

In the initial position of a word, the /p/ and /b/ both show significantly greater facial movement on this measure of area under the curve than does the /m/, and the /p/ is likewise greater than the /b/. In addition, the final /b/ shows greater facial movement on this measure than the final /m/. Here we see the initial position showing higher mean scores than on any of the previous measures in relation to the final position, but the order for the consonants remains the same as before, with /p/ > /b/ > /m/ in final position here.

The results of the analysis of the number of inflection points as a measure of the number of changes in facial movement patterns are presented for the /p, b, m/ cluster in Table 7.

Table 7.--Summary of analysis of variance performed to test differences in the number of inflection points among the consonants /p, b, m/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                        0.02    1        0.02     0.01 ns
S's Within Groups             19.91    8        2.49
Consonant (B)                  2.36    2        1.18     9.67 *
A x B                          1.12    2        0.56     4.59 #
B x S's Within Groups          1.95   16        0.12
Position (C)                  45.59    1       45.59    88.52 *
A x C                          0.01    1        0.01     0.02 ns
C x S's Within Groups          4.12    8        0.51
B x C                          4.79    2        2.40    14.05 *
A x B x C                      0.85    2        0.43     2.49 ns
BC x S's Within Groups         2.73   16        0.17
Total                         83.45   59

ns--non-significant.
*--significant beyond .01 level.
#--significant beyond .05 level.

Significant differences in number of inflection points were found among the three consonants, between initial and final position, and as a result of the sex-by-consonant and consonant-by-position interaction effects. Individual comparisons of the consonant means were as follows:

m b p

It is apparent that the /p/ shows significantly more changes in facial movement pattern as measured by the number of inflection points than does the /m/ across speaker sex and word position. Individual comparisons of the sex-by-consonant interaction means gave the following results:

Mm FEm FEb Mp Mb FEp

For the female speakers, the /p/ showed significantly more changes in movement pattern than the /m/ or /b/. In male speakers, the /b/ showed a similar relationship to /m/, but not to /p/. Male production of the /b/ showed significantly more inflection points as a measure of changes in movement pattern than did female production of /m/ or /b/ across word position, and female production of /p/ showed more inflection points than male production of /m/ across word position.

Comparison of the consonant-by-position interaction effect revealed the following results:

Ib Ip Im Fm Fb Fp

Across male and female speakers the /p/ in final position showed significantly more inflection points than the /b/ in final position, which in turn resulted in more inflection points than the /m/ in the final position.

The null hypothesis tested by the foregoing procedures was as follows:

There are no significant differences in certain facial movements among the three homophenous consonants /p, b, m/ as a function of speaker sex and of word position as determined by six individual measures.

With regard to that aspect of the null hypothesis concerning no significant differences between male and female speakers, the hypothesis is rejected for the area measure, but is not rejected for the other five measures: TSI, IAD, SAI, D, and IP.
That portion of the null hypothesis concerning no significant differences among the three consonants /p, b, m/ is rejected for the measures TSI, SAI, IAD, A, and IP but is not rejected for the D measure. That aspect of the hypothesis regarding no significant differences among the three consonants /p, b, m/ as a function of word position is rejected for all measures with the exception of the D measure, for which the hypothesis is not rejected.

The results of the analysis of the /t, d, n/ consonant cluster on the TSI measure are presented in Table 8.

Table 8.--Summary of analysis of variance comparing differences in temporal summation to inflection points among consonants /t, d, n/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                       25.09    1       25.09     0.04 ns
S's Within Groups           5227.32    8      653.42
Consonant (B)                 43.75    2       21.87     0.57 ns
A x B                         26.53    2       13.27     0.35 ns
B x S's Within Grps.         612.02   16       38.25
Position (C)                 813.87    1      813.87    32.43 *
A x C                        389.33    1      389.33    15.51 *
C x S's Within Grps.         200.78    8       25.10
B x C                        101.67    2       50.84     1.14 ns
A x B x C                     40.73    2       20.37     0.45 ns
BC x S's Within Grps.        716.25   16       44.77
Total                       8197.34   59

ns--non-significant.
*--significant beyond .01 level.

This measure, it will be recalled, gives an estimate of the time elapsed to each change in movement pattern during the production of a word. It can be seen from Table 8 that the position variable is again statistically significant, but it will not be discussed for the reasons presented previously. Also significant is the sex-by-position interaction effect. The Newman-Keuls procedure for comparison of individual differences on this effect on the TSI measure produced the following results:

MI FEI FEF MF

When the consonants of concern are in the final position, males show significantly more facial movement as determined by the TSI measure than do female speakers across the three consonants /t, d, n/.
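The F ratios in Table 8 can be reproduced from the sums of squares and degrees of freedom, which also makes explicit which error term each effect is tested against. The following Python sketch is illustrative only: the dictionary names and the helper functions `f_ratios` and `flag` are hypothetical, the pairing of effects with error terms is inferred from the layout of the summary table, and the critical values are standard tabled F values for (1, 8) and (2, 16) degrees of freedom supplied here as an assumption rather than taken from the text.

```python
# Table 8 entries as (SS, df); "S/A" stands for S's Within Groups.
TABLE_8 = {
    "A": (25.09, 1), "S/A": (5227.32, 8),
    "B": (43.75, 2), "AB": (26.53, 2), "B x S/A": (612.02, 16),
    "C": (813.87, 1), "AC": (389.33, 1), "C x S/A": (200.78, 8),
    "BC": (101.67, 2), "ABC": (40.73, 2), "BC x S/A": (716.25, 16),
}

# Each effect is tested against the error term listed beneath it in the table.
ERROR_TERM = {
    "A": "S/A",
    "B": "B x S/A", "AB": "B x S/A",
    "C": "C x S/A", "AC": "C x S/A",
    "BC": "BC x S/A", "ABC": "BC x S/A",
}

# Assumed tabled critical values: (df_effect, df_error) -> (F at .05, F at .01).
CRITICAL = {(1, 8): (5.32, 11.26), (2, 16): (3.63, 6.23)}


def f_ratios(table):
    """Return F = MS_effect / MS_error for every tested effect."""
    ratios = {}
    for effect, error in ERROR_TERM.items():
        ss_effect, df_effect = table[effect]
        ss_error, df_error = table[error]
        ratios[effect] = (ss_effect / df_effect) / (ss_error / df_error)
    return ratios


def flag(f_value, df_effect, df_error):
    """Mark an F ratio with the symbols used in the summary tables."""
    crit_05, crit_01 = CRITICAL[(df_effect, df_error)]
    if f_value >= crit_01:
        return "*"
    if f_value >= crit_05:
        return "#"
    return "ns"


for effect, f_value in f_ratios(TABLE_8).items():
    df_effect = TABLE_8[effect][1]
    df_error = TABLE_8[ERROR_TERM[effect]][1]
    print(f"{effect:4s} F = {f_value:6.2f} {flag(f_value, df_effect, df_error)}")
```

Run on the Table 8 entries, the sketch marks only position (C) and the sex-by-position interaction (A x C) as significant, which agrees with the table.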
The analysis of variance summary table for the SAI measure on the consonants /t, d, n/ is shown in Table 9.

Table 9.--Summary of analysis of variance performed to determine whether summation of amplitudes at inflection points differed among consonants /t, d, n/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                     1803.47    1     1803.47     2.89 ns
S's Within Groups           4991.46    8      623.93
Consonant (B)                  4.71    2        2.35     0.20 ns
A x B                         29.76    2       14.88     1.30 ns
B x S's Within Grps.         183.68   16       11.48
Position (C)                  38.00    1       38.00     2.21 ns
A x C                        106.48    1      106.48     6.18 #
C x S's Within Grps.         137.81    8       17.23
B x C                         13.71    2        6.86     0.84 ns
A x B x C                      3.09    2        1.54     0.19 ns
BC x S's Within Grps.        129.97   16        8.12
Total                       7442.15   59

ns--non-significant.
#--significant beyond .05 level.

The only effect that is statistically significant is the sex-by-position interaction effect. Individual comparisons of the treatment means involved in the effect yielded the following results:

FEF FEI MI MF

These results indicate that male speakers have more facial movements than female speakers as determined by the SAI measure when the consonants are in either initial or final position of the word across the three consonants tested. This would indicate that males utilize more intensity of facial movements evident at the changes in movement pattern than do females.

The results of the analysis of the IAD measure on the consonants /t, d, n/ are presented in Table 10. This measure integrates the intensity of facial movement with the time elapsed to each of the changes in the pattern of facial movement during the production of a word. The only factor of statistical significance here is the sex-by-position interaction effect.
Tests on individual means of this effect by the usual procedure yielded the following:

FEF FEI MI MF

Here again, when the consonants of concern are in the initial or final position of the word but examined separately, male speakers evidence more facial movements as determined by this measure than do female speakers across the three consonants /t, d, n/. These results, then, are consistent with the prior results in this study.

Table 10.--Summary of analysis of variance performed to test differences in the integrated amplitude-duration measure among the consonants /t, d, n/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                   228053.21    1   228053.21     2.17 ns
S's Within Groups         841201.74    8   105150.22
Consonant (B)               2004.39    2     1002.19     0.46 ns
A x B                       7489.61    2     3744.80     1.74 ns
B x S's Within Grps.       34490.98   16     2155.69
Position (C)                1433.36    1     1433.36     0.36 ns
A x C                      33781.08    1    33781.08     8.57 #
C x S's Within Grps.       31525.14    8     3940.64
B x C                       5370.53    2     2685.26     0.85 ns
A x B x C                    618.37    2      309.19     0.10 ns
BC x S's Within Grps.      50472.71   16     3154.54
Total                    1236441.12   59

ns--non-significant.
#--significant beyond .05 level.

The summary table for the analysis of the D measure on the consonants /t, d, n/ is shown in Table 11. Here it can be seen that the only factor that is statistically significant is that of word position. As indicated previously, no inferences can be legitimately drawn from this result.

Table 11.--Summary of analysis of variance performed to test differences in total duration among the consonants /t, d, n/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                      157.82    1      157.82     1.57 ns
S's Within Groups            805.56    8      100.70
Consonant (B)                  9.46    2        4.73     1.09 ns
A x B                         21.56    2       10.78     2.49 ns
B x S's Within Grps.          69.37   16        4.34
Position (C)                  37.62    1       37.62     6.92 #
A x C                          4.39    1        4.39     0.81 ns
C x S's Within Grps.          43.50    8        5.44
B x C                          1.80    2        0.90     0.14 ns
A x B x C                      1.22    2        0.61     0.09 ns
BC x S's Within Grps.        105.77   16        6.61
Total                       1258.08   59

ns--non-significant.
#--significant beyond .05 level.

The results of the analysis of the area measure of the consonants /t, d, n/ are shown in Table 12. It is found that statistical significance is obtained for the sex factor and for the position factor. The male speaker group showed a mean of 36.32 and the female group a mean of 20.48 across the consonants and word position. No other significant differences were shown.

Table 12.--Summary of analysis of variance comparing differences in area under the curve among the consonants /t, d, n/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                     3766.59    1     3766.59     7.56 #
S's Within Groups           3984.81    8      498.10
Consonant (B)                 21.03    2       10.51     1.21 ns
A x B                         27.62    2       13.81     1.59 ns
B x S's Within Grps.         138.77   16        8.67
Position (C)                1838.95    1     1838.95    31.48 *
A x C                          0.71    1        0.71     0.01 ns
C x S's Within Grps.         467.40    8       58.43
B x C                         32.81    2       16.40     1.79 ns
A x B x C                     25.47    2       12.74     1.39 ns
BC x S's Within Grps.        147.02   16        9.19
Total                      10451.19   59

ns--non-significant.
*--significant beyond .01 level.
#--significant beyond .05 level.

The results of the analysis of the number of inflection points measure of the /t, d, n/ consonant cluster are presented in Table 13. Here statistical significance is obtained for the position factor and for the sex-by-position interaction effect. Comparison of the individual means of the latter effect by the Newman-Keuls procedure produced the following results:

MI FEI FEF MF

The difference between means that is meaningful to this study is that the male speakers showed more changes in pattern of facial movement as determined by the IP measure than did female speakers when producing words with the consonants of concern in the final position.

Table 13.--Summary of analysis of variance performed to determine whether number of inflection points differed among consonants /t, d, n/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                        0.03    1        0.03     0.01 ns
S's Within Groups             22.46    8        2.81
Consonant (B)                  0.34    2        0.18     1.01 ns
A x B                          0.27    2        0.14     0.83 ns
B x S's Within Grps.           2.66   16        0.17
Position (C)                  10.95    1       10.95   185.21 *
A x C                          1.10    1        1.10    18.54 *
C x S's Within Grps.           0.47    8        0.06
B x C                          0.56    2        0.28     1.30 ns
A x B x C                      0.07    2        0.03     0.16 ns
BC x S's Within Grps.          3.42   16        0.21
Total                         42.33   59

ns--non-significant.
*--significant beyond .01 level.

The null hypothesis tested by the foregoing procedures was as follows:

There are no significant differences in certain facial movements among the three homophenous consonants /t, d, n/ as a function of speaker sex and of word position as determined by six individual measures.

With regard to that portion of the hypothesis concerning no significant differences as a function of speaker sex, the hypothesis is rejected for the area measurement but is not rejected for the other five measures: TSI, SAI, IAD, D, and IP. That aspect of the hypothesis regarding no significant differences among the three consonants /t, d, n/ fails to be rejected for all six of the measures. Finally, the section of the hypothesis concerning no significant differences among the consonants as a function of word position fails to be rejected for all six measures.

The analysis of variance summary table for the TSI measure on the consonants /tʃ, dʒ, ʃ/ is shown in Table 14.

Table 14.--Summary of the analysis of variance comparing differences in temporal summation to inflection points among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                        0.20    1        0.20     0.00 ns
S's Within Groups           3801.14    8      475.14
Consonant (B)                 63.38    2       31.69     1.19 ns
A x B                        119.06    2       59.53     2.24 ns
B x S's Within Grps.         425.47   16       26.59
Position (C)                2842.82    1     2842.82    13.12 *
A x C                          0.28    1        0.28     0.00 ns
C x S's Within Grps.        1734.07    8      216.76
B x C                        328.69    2      164.35     4.24 #
A x B x C                     69.69    2       34.85     0.90 ns
BC x S's Within Grps.        620.39   16       38.77
Total                      10005.20   59

ns--non-significant.
*--significant beyond .01 level.
#--significant beyond .05 level.

This measure gives an estimate of the time elapsed to each of the changes in the pattern of facial movement produced during the production of the words. Statistically significant differences were found for the position effect, which is not meaningful to this discussion, as stated earlier. Also, significant differences were found as a result of the consonant-by-position interaction effect. Tests of the individual means contributing to that effect were performed as usual and yielded the following results:

[schematic illegible in source]

It can be seen that when the consonants appear in the final position of the word, both /tʃ/ and /dʒ/ differ significantly from /ʃ/ in amount of facial movement as measured by the TSI measure across speaker sex.

The analysis of the SAI measure on the consonants /tʃ, dʒ, ʃ/ is presented in Table 15.

Table 15.--Summary of analysis of variance performed to test differences in summation of amplitudes at inflection points among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                     1799.41    1     1799.41     4.21 ns
S's Within Groups           3421.04    8      427.63
Consonant (B)                  4.59    2        2.30     0.11 ns
A x B                         48.00    2       24.00     1.16 ns
B x S's Within Grps.         329.95   16       20.62
Position (C)                2672.54    1     2672.54    24.90 *
A x C                          0.14    1        0.14     0.00 ns
C x S's Within Grps.         858.79    8      107.35
B x C                         90.52    2       45.26     2.51 ns
A x B x C                     22.66    2       11.33     0.63 ns
BC x S's Within Grps.        289.08   16       18.07
Total                       9536.72   59

ns--non-significant.
*--significant beyond .01 level.

The only factor of significance in this analysis is that obtained from the effect of word position, which is not meaningful to this discussion.
It should be noted that the sex factor approached significance, with an approximate significance probability of .08.

The summary table for the analysis of the IAD measure on the /tʃ, dʒ, ʃ/ cluster is presented in Table 16. The sex variable again approaches significance, with an approximate significance probability of the F statistic of .09. The only factor that does reach statistical significance is that of position, which is again not meaningful to the results.

Table 16.--Summary of analysis of variance comparing differences in the integrated amplitude-duration measure among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                   225499.41    1   225499.41     3.74 ns
S's Within Groups         482982.23    8    60372.78
Consonant (B)                646.30    2      323.15     0.06 ns
A x B                       7338.00    2     3669.00     0.69 ns
B x S's Within Grps.       84842.92   16     5302.68
Position (C)              466675.02    1   466675.02    18.01 *
A x C                       6455.33    1     6455.33     0.25 ns
C x S's Within Grps.      207247.06    8    25905.88
B x C                      27792.78    2    13896.39     3.49 ns
A x B x C                   7515.94    2     3757.97     0.94 ns
BC x S's Within Grps.      63736.18   16     3983.51
Total                    1580731.17   59

ns--non-significant.
*--significant beyond .01 level.

The analysis of variance summary table for the D measure of the consonants /tʃ, dʒ, ʃ/ is presented in Table 17. Here again the only factor of statistical significance is that of position, again not meaningful to this study by itself.

Table 17.--Summary of analysis of variance comparing differences in total duration among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                       92.21    1       92.21     1.27 ns
S's Within Groups            581.72    8       72.71
Consonant (B)                  5.68    2        2.84     0.78 ns
A x B                          2.32    2        1.16     0.32 ns
B x S's Within Grps.          58.47   16        3.65
Position (C)                 116.09    1      116.09    16.28 *
A x C                          1.23    1        1.23     0.17 ns
C x S's Within Grps.          57.05    8        7.13
B x C                          8.71    2        4.36     3.26 ns
A x B x C                      8.51    2        4.26     3.19 ns
BC x S's Within Grps.         21.35   16        1.33
Total                        953.34   59

ns--non-significant.
*--significant beyond .01 level.

The summary table for the analysis of the area (A) measure on the /tʃ, dʒ, ʃ/ consonant cluster is given in Table 18. Here statistically significant differences are obtained between male and female speakers, with the mean scores being 37.83 and 22.99 respectively, males showing more facial movements as determined by this measure than females. This measure gives an estimate of the intensity of facial movement over the total duration of the word. Also significant was the position effect of this analysis.

Table 18.--Summary of analysis of variance performed to test differences in area under the curve among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                     3303.09    1     3303.09    11.77 *
S's Within Groups           2245.95    8      280.74
Consonant (B)                 25.43    2       12.71     0.78 ns
A x B                         66.02    2       33.01     2.02 ns
B x S's Within Grps.         261.75   16       16.36
Position (C)                1566.93    1     1566.93    19.16 *
A x C                         15.75    1       15.75     0.19 ns
C x S's Within Grps.         654.15    8       81.77
B x C                          3.98    2        1.99     0.20 ns
A x B x C                     32.95    2       16.48     1.65 ns
BC x S's Within Grps.        159.92   16        9.99
Total                       8335.93   59

ns--non-significant.
*--significant beyond .01 level.

The analysis of the IP measure of the consonants /tʃ, dʒ, ʃ/ is presented in Table 19.

Table 19.--Summary of analysis of variance comparing differences in number of inflection points among the consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position.

Source of Variation              SS   df          MS        F
Sex (A)                        0.02    1        0.02     0.01 ns
S's Within Groups             21.41    8        2.68
Consonant (B)                  0.07    2        0.03     0.58 ns
A x B                          0.28    2        0.14     2.41 ns
B x S's Within Groups          0.92   16        0.06
Position (C)                  12.41    1       12.41    13.55 *
A x C                          0.00    1        0.00     0.00 ns
C x S's Within Grps.           7.33    8        0.92
B x C                          1.11    2        0.55     6.11 *
A x B x C                      0.04    2        0.02     0.22 ns
BC x S's Within Grps.          1.45   16        0.09
Total                         45.04   59

ns--non-significant.
*--significant beyond .01 level.
Statistical significance is again obtained for the position effect in this analysis. The consonant-by-position interaction effect also shows statistical significance. The results of the comparison of individual means contributing to this effect are as follows:

tʃI dʒI ʃI ʃF dʒF tʃF

None of these differences is meaningful to this discussion since the design of the experiment does not permit generalization from differences obtained between initial and final position of the consonants.

The null hypothesis tested by the preceding procedures was given as follows:

There are no significant differences in certain facial movements among the three homophenous consonants /tʃ, dʒ, ʃ/ as a function of speaker sex and of word position as determined by six individual measures.

That aspect of this hypothesis concerning no significant differences as a function of speaker sex is rejected for the area measure. The results lead to failure to reject that aspect of the hypothesis for the other five measures: TSI, SAI, IAD, D, and IP. The portion of the hypothesis concerning no significant differences among the three consonants /tʃ, dʒ, ʃ/ fails to be rejected for all six measures. That portion of the hypothesis concerning no significant differences as a function of word position is rejected for the TSI measure and the IP measure, but is not rejected for the SAI, IAD, D, and A measures.

Discussion

Speaker sex.--Several writers have suggested that speaker sex is an important variable in the lipreadability of speakers. The character of lip movement, chin and jaw movements, and other factors were given by Fusfield¹⁴³ as variables affecting the lipreadability of the speaker. Others have listed flexibility of lip movement and mobility of facial expression as contributing to the lack of uniformity among speakers. Many researchers have reported that all speakers are not uniformly lipreadable. In fact, several studies have shown highly significant interspeaker differences.
In the present study, significant differences were found between male and female speakers consistently on one of the six measures, that of the area under the resultant curve, across each of the three consonant clusters. It will be recalled from the discussion in Chapter III that this measure is believed to give an estimate of the intensity of the facial movement occurring on the measured areas of the face over the total time required to say each word. On each of the three analyses where significant differences were found, male speakers scored higher than females, giving evidence of a greater amount of facial movement. In addition, under all conditions, male speakers consistently scored higher than female speakers on the other five measures, although statistical significance was not obtained on them. In several of the interaction effects that included sex as a variable, males showed significantly more facial movement than females when the consonant appeared in the final position of the word.

¹⁴³Fusfield, loc. cit.

Within the /p, b, m/ cluster, three additional tests of the sex factor are worthy of comment since they closely approached statistical significance. The SAI measure, which gives an estimate of the intensity of movement at each change in the pattern of facial movement, yielded an approximate significance probability for the F statistic of .07. The IAD measure, which integrates time elapsed with amplitude, or intensity, at each inflection point, was nearly significant, with the CDC 3600 computer reporting an approximate significance probability of .05. These same two measures obtained over the /tʃ, dʒ, ʃ/ cluster yielded significance probability values of .08 and .09 respectively.
These lend additional evidence that male speakers do indeed present more facial movement in at least certain areas of the face than do female speakers, and that the time elapsed to certain changes in the pattern of facial movement during the production of monosyllabic words appears to be longer for males. The duration (D) measure for the /p, b, m/ cluster was also greater for males than for females at an approximate significance probability of .08. Thus, males appear to take a little more time to produce these words, to speak a little more slowly, than do females. These findings regarding duration are in disagreement with Guttman,¹⁴⁴ who reported that duration was longer and word rate was slower for the female group than for the male group.

This study has demonstrated differences between male and female speakers in terms of the facial movement that occurs on certain areas of the face during the production of the homophenous monosyllabic words utilized in this study. The differences are most pronounced in terms of the intensity of the facial movement over the total duration of the word, but are also present to some degree consistently on all other measures.

/p, b, m/ cluster.--Black¹⁴⁵ suggested that since voiceless continuant sounds had greater amounts of air pressure than other consonants, such differences may assist in visual identification of some consonants. Isshiki and Ringel¹⁴⁶ reported that air flow rate was greater for voiceless consonants than for voiced consonants. It was suggested in Chapter II of this report that such differences may also be accompanied by differences in facial movements that could assist in distinguishing between sounds.

¹⁴⁴Guttman, loc. cit.
¹⁴⁵Black, loc. cit.
¹⁴⁶Isshiki and Ringel, loc. cit.
¹⁴⁷Fujimura, loc. cit.

Fujimura¹⁴⁷ reported that lip opening during the first five milliseconds of a word was larger for the /p/ than for the /b/ or /m/ when the sounds occurred at the initial position of a word.
He also reported that, with respect to the area of mouth opening, an abrupt change in speed of opening took place and was very apparent in /p/ and /b/ but not in /m/. The present study tends to support these results. Differences were found among the three consonants /p, b, m/ on five of the six measures. There were differences in time elapsed to changes in movement pattern, in intensity of movement at those changes, in the integrated measure of the time elapsed and intensity, and in number of changes of movement pattern. Individual comparisons showed the /p/ and /b/ to present a greater elapse of time from onset of the word to each change in movement pattern than the /m/ across speaker sex. The /p/ showed greater intensity of movement at those changes than the /m/ across all speakers. No differences between the /p/ and /b/ or the /b/ and /m/ were found here. The /p/ likewise had a greater mean number of changes in pattern of facial movement, as indicated by the IP measure, than the /m/.

Inspection of the interaction effect between consonant and word position revealed that most of this difference between consonants appears to be a result of the occurrence of the sounds in the final position of a word. The /p/ and /b/ in the final position, across all measures except overall duration, showed consistently greater facial movement in terms of intensity and time to inflection points than the /m/ in final position across speaker sex. With respect to duration of the word, no significant difference was found, but the ordering of the final consonant means was consistent, with the /p/ and /b/ greater than the /m/. In the initial position, one measure produced significant results, that of the area measure, which estimates intensity of movement over the total duration of the word.
In this case, the order of the means was the same, with /p/ greater than /b/, which in turn was greater than /m/; but here the initial position consonants gave greater scores than the final position consonants. Here as well, the final /b/ showed greater movement over the total duration of the word across speaker sex than did the final /m/.

The overall test for the three consonants across word position and speaker sex found the voiceless plosive showing consistently greater facial movement than the nasal continuant at significant levels and consistently greater movement than the voiced bilabial plosive at non-significant levels. These results are consistent with those of Black and of Isshiki and Ringel if we can assume that the measures of physical phenomena in those studies have some relationship to physiological measures such as those used in the present study. At the least, both increased air flow rate and increased air pressure appear to occur along with increased facial movement for the same types of sounds.

/t, d, n/ cluster.--The results of the analysis of the /t, d, n/ cluster showed no significant differences among these three sounds across speaker sex and word position. The same results were found for the interaction effects of the consonants with word position. Thus, this study was unable to demonstrate any significant differences among these three sounds in terms of facial movements occurring during the production of words using these sounds in either initial or final position of a word.

The interaction of sex by word position in four of the measures (excluding the duration and area measures) showed a consistent pattern: on all four, consonants in the final position yielded higher mean values for male speakers than for female speakers, reinforcing the differences as a function of speaker sex discussed earlier.
Since these three sounds are not labial sounds, it is not too surprising that no differences among the sounds were shown by a measure of surface facial movement. Any differences among them that may occur are likely to be so minute on the surface of the face that the present measures are too crude to isolate them. Production of any one of these sounds does not appear to produce any differences from the other sounds on those parts of the face that were measured by this study.

/tʃ, dʒ, ʃ/ cluster.--No significant differences were found among the sounds in this cluster as a result of the overall F tests on any of the six measures. Individual comparison of one significant interaction effect revealed that both the /tʃ/ and /dʒ/ in the final position showed a greater amount of time elapsed to changes in facial movement pattern than did the /ʃ/ in the final position. Here we have a voiced and an unvoiced affricate, both being different from a voiceless continuant. The two affricates by definition have a plosive element in their production. This is similar to the findings for the /p, b, m/ cluster, in which the voiceless and voiced plosives differed from the continuant, which in that case is voiced. These findings would lead to the notion that one possible cause of the differences in facial movements is the plosive nature of the first two consonants (/p, b/ and /tʃ, dʒ/) in each cluster.

Summary

Several authorities have discussed the reasons for the lack of research being done in aural rehabilitation. Oyer¹⁴⁸ gave five reasons, among which were the lack of adequate test instruments and the difficulty in isolating and controlling variables. Lowell¹⁴⁹ supported this contention in stating that the development of measuring instruments, the yardsticks, was needed more than anything else. The present study has demonstrated a step toward the instrumentation needed for research in aural rehabilitation.

¹⁴⁸Oyer, "Research Needs . . .," loc. cit.
¹⁴⁹Lowell, "Research: . . .," loc. cit.

It has not developed new instruments but has adapted presently available equipment to the use of research in this area. By so doing, this investigation has shown that it is possible to isolate and to control some of the variables that previously hindered such research. This instrumentation has allowed an examination of facial movements taking place during speech production and a study of those movements as they accompanied certain aspects of the stimulus material, homophenous words. This study can be viewed as a step in the direction of better instrumentation for use along this line by demonstrating that such research is within reach.

This investigation has shown that differences in facial movements do exist among certain of the so-called homophenous sounds. Many writers have said that these phonemes, when spoken in words, look exactly alike on the speaker's face. Others disagreed with this concept. The Roback study¹⁵⁰ found that viewers of motion picture films could discriminate homophenous words out of context beyond the level expected by chance alone. The present study has shown that there are differences in rate and intensity of facial movements among some of these words that would contribute to the ability of viewers to discriminate among homophenous words.

¹⁵⁰Roback, loc. cit.

Stone,¹⁵¹ Fusfield,¹⁵² and Lowell¹⁵³ have all listed lip mobility and facial expression as the two variables having the most pronounced and consistent effect on lipreading performance, in that order. It is suggested that the present study was measuring lip mobility. The more mobile lips would be expected to show more movement in terms of intensity and in the changes in movement patterns found in this study. It is possible that this is really one of the differences between male and female speakers found in the present investigation.
Here then is a possible means of objectively determining part of the differences in lipreadability of speakers for purposes of lipreading instruction and measurement of lipreading performance.

O'Neill and Davidson¹⁵⁴ reported results of a study in which they concluded that training in the recognition of simple forms of lip configurations might well be included in a regular method of lipreading training. This conclusion came from a demonstrated relationship between lipreading performance and non-verbal concept formation. This suggests that the better lipreaders were able to utilize minute differences in lip configuration in lipreading and that these should then be taught to lipreaders. The present study bears out the fact that these differences in facial movement are present on the speaker's face, even among so-called homophenous words, shedding some light on a possible reason for the conclusion rendered by O'Neill and Davidson based on their results.

¹⁵¹Stone, loc. cit.
¹⁵²Fusfield, loc. cit.
¹⁵³Lowell, "New Insights . . .," loc. cit.
¹⁵⁴O'Neill and Davidson, loc. cit.

Jacoby¹⁵⁵ and Harris¹⁵⁶ have both suggested that the eye is capable of perceiving and utilizing minute differences in amplitudes and slight amounts of energy. Thus, there is every reason to believe that the eye is capable of detecting the differences in facial movement among homophenous words demonstrated by the present study, and that those cues can be used to enable the viewer to become a more proficient lipreader, as suggested by O'Neill and Davidson.

The results of this study are in agreement with those reported by Joergenson¹⁵⁷ in that both found no significant differences in the time required to say homophenous words. The present study did find a consistent but non-significant trend in duration among certain homophenous clusters.
Joergenson found a variation in the temporal pattern of lip movement during production of homophenous words, indicating that maximum lip movement occurred at earlier or later intervals for different homophenous words. This is consistent with the present study, which showed differences in time elapsed to changes in movement patterns among the /p, b, m/ consonant cluster. Thus, the lipreader has temporal cues as well as intensity of movement cues to aid in making discriminations among so-called homophenous words.

¹⁵⁵Jacoby, loc. cit.
¹⁵⁶Harris, op. cit., p. 45.
¹⁵⁷Joergenson, loc. cit.

Wong and Fillmore¹⁵⁸ suggested that vowel duration is a primary cue for auditory differentiation of similar word pairs such as 'his-hiss' when the final consonant is unvoiced as opposed to voiced. Logically, if duration contributes to auditory recognition of words, that duration should also be reflected in a change in the visual signal. It may well be that this vowel duration effect on voiced-unvoiced sounds is one cause of the differences noted among certain consonant clusters in the present study. Again, it provides an additional cue to the viewer for distinguishing among these words.

It has been suggested by Woodward and Barber¹⁵⁹ that visual discrimination between such words as 'bill' and 'pill' is not due to the voicing aspect but is done on the basis of the context in which the word appears. The results of the present study suggest that such discrimination by viewers not only can be done on the basis of context, but also is aided by differences in the movements that appear on the speaker's face in terms of duration, temporal pattern of the facial movement, and intensity of that movement that accompany the production of those words.

¹⁵⁸Wong and Fillmore, loc. cit.
¹⁵⁹Woodward and Barber, loc. cit.

Previous research has indicated that viewers can discriminate among homophenous words without the aid of contextual cues.
The present study has demonstrated that, with at least some homophenous consonants, there are differences in several parameters of facial movements that could be utilized to assist in such discriminations. Undoubtedly contextual cues are an important aid in lipreading, but the evidence would seem to indicate that there are also other cues available to the lipreader and that this is possibly one of the differences between better and poorer lipreaders--that the better lipreaders have unconsciously learned these differences and utilize them in discriminating among homophenous or similar-appearing word pairs.

CHAPTER V

SUMMARY AND CONCLUSIONS

As early as the 17th century, interest was demonstrated in the use of visual communication, or lipreading, as an aid to, or substitute for, normal auditory communication in hard-of-hearing and deaf persons. In recent years there has been increasing interest in this mode of communication. The majority of published research in this area has dealt with the lipreader, or receiver, with relatively little being reported regarding the stimulus material or the speaker as variables in the communication process.

Bell¹⁶⁰ introduced the concept of homophenous words in 1874, as words that look alike on the lips when spoken. In following years, several writers published lists of homophenous words to be used for instruction in lipreading in the belief that students of lipreading must be aware of the possible confusion and misunderstanding that could result from the fact that many words look alike on the lips. Most writers in the field have supported the notion that the correct word of a homophenous group could be distinguished only on the basis of the context in which it appeared.

¹⁶⁰Deland, op. cit., p. 120.

There has been controversy about homophenous words since they were first introduced. In 1902 Davidson¹⁶¹ expressed the view that these words were not exactly alike.
Other writers since then have occasionally made tentative suggestions that lipreaders might be able to discriminate some of the words. In the late 1950's some research began to indicate a need for reclassification of homophenous words. Some investigators found that lipreaders could distinguish among groups of such words beyond chance levels. However, no studies had been done on a truly objective basis at the source of the message--the speaker--where the words originate. Evaluation of motion picture films of speakers moved in this direction of objectivity but still remained somewhat crude and cumbersome. A new approach was needed that would remove more of the subjectivity from the research and examine the actual facial movements that occur during the production of so-called homophenous words.

¹⁶¹Davidson, loc. cit.

With regard to the speaker variable, there has been general agreement that speakers differ widely in their respective lipreadability. Factors such as lip mobility, size and movement of the lips and jaw, gesture activity, facial expression, precision of articulation, and others have been suggested as causes of this speaker variability. Here too, however, most research has approached this problem from the lipreader's position. Some work with motion picture films demonstrated this variability. Still, a more objective approach has been needed that would examine speaker differences in lipreadability at the source of those differences--the speaker himself.

It was the purpose of this study to investigate objectively the effect of speaker sex upon the amount and pattern of facial movement that occurs on certain areas of the face while producing homophenous monosyllabic words. Secondly, this study searched for differences in these facial movements that may exist among selected groups of so-called homophenous sounds as a function of the position of those sounds in a word, whether in initial or final position.
The three homophenous consonant clusters /p, b, m/, /t, d, n/, and /tʃ, dʒ, ʃ/ were chosen for study. A list of eighteen homophenous words was constructed for each cluster such that there were six words for each consonant, three with the consonant in the initial position and three with the consonant in the final position. The total list of 54 words was separately randomized for each of ten subjects. A strain gauge was attached to certain areas of the face of the subjects so that facial movements in these areas increased the length of the gauge, changing its electrical resistance. This change in resistance was amplified and recorded by an oscillograph as a graphic tracing on recording paper for each of the words.

Five male and five female subjects spoke each word in the list five times. The three most similar tracings of each word were analyzed in terms of six measures. These were: (1) number of changes in direction of the tracing from positive to negative; (2) summation of time elapsed to each such change in direction of the tracing; (3) summation of amplitude of the tracing at those points; (4) an integration of time and amplitude at points of changes in the tracing; (5) total surface area under the tracing; and (6) total duration of the tracing.

Analysis of the data indicated that there are differences between male and female speakers in intensity and pattern of facial movement during the production of monosyllabic words. Male speakers showed greater intensity of movement, increased time elapsed to changes in pattern of movement, and a greater number of these changes in the pattern of movement across the three clusters when the consonants under study occurred in the final position.

Statistically significant differences were found among the consonants /p, b, m/ in all measures except total duration of the words.
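The six tracing measures listed above can be sketched in code for a digitized tracing. This is a minimal illustration under stated assumptions, not the hand-measurement procedure used in the study: the function name `tracing_measures` and the sampled representation are hypothetical, the integrated measure is assumed here to be the sum of time multiplied by amplitude at each change point, area is approximated by the trapezoid rule, and the pairing of the abbreviations IP, TSI, SAI, IAD, A, and D with measures (1) through (6) is inferred from how they are used in the results chapter.

```python
def tracing_measures(times, amplitudes):
    """Compute six summary measures for one sampled tracing.

    times      -- sample times, strictly increasing, starting at word onset
    amplitudes -- deflection of the tracing from baseline at each time
    """
    if len(times) != len(amplitudes) or len(times) < 2:
        raise ValueError("need two equal-length sequences with >= 2 samples")

    onset = times[0]

    # Points where the tracing turns from rising (positive slope) to falling
    # (negative slope). Flat segments are not counted in this sketch.
    turns = [
        i
        for i in range(1, len(times) - 1)
        if amplitudes[i] > amplitudes[i - 1] and amplitudes[i + 1] < amplitudes[i]
    ]

    # Trapezoid-rule approximation of the area under the tracing.
    area = sum(
        0.5 * (amplitudes[i] + amplitudes[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )

    return {
        "IP": len(turns),                                   # (1) number of changes
        "TSI": sum(times[i] - onset for i in turns),        # (2) summed time to changes
        "SAI": sum(amplitudes[i] for i in turns),           # (3) summed amplitude at changes
        "IAD": sum((times[i] - onset) * amplitudes[i] for i in turns),  # (4) assumed form
        "A": area,                                          # (5) area under the tracing
        "D": times[-1] - onset,                             # (6) total duration
    }


if __name__ == "__main__":
    # Made-up tracing in arbitrary units, for illustration only.
    example_times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    example_amplitudes = [0.0, 2.0, 1.0, 3.0, 1.0, 0.0, 0.0]
    print(tracing_measures(example_times, example_amplitudes))
```

Summing over change points, as in (2) through (4), means that a tracing with more or later changes in direction scores higher on those measures even when its peak amplitude is unchanged.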
The unvoiced plosive consistently showed greater intensity of movement, more changes in movement pattern, and longer time to those changes than the nasal continuant when these sounds occurred in the final position of a word. The voiced plosive also differed from the nasal continuant in the final position in terms of intensity of facial movement and in changes in movement pattern. The two plosives differed from each other in the final position in terms of the number of changes in facial movement patterns and intensity of facial movement at those points. All three of these sounds differed from each other in the initial position on one measure, that of intensity over total duration of the word. Here again, the order of the consonants was consistent with all other measures, with the /p/ showing more facial movement than /b/, which in turn presented more movement than /m/.

The analysis of the /t, d, n/ consonant cluster revealed no significant differences among these sounds across word position or in either initial or final position as determined by the six measures used in this study.

The voiced and unvoiced affricates in the /tʃ, dʒ, ʃ/ consonant cluster were found to present greater time elapsed to changes in movement pattern than did the unvoiced continuant when these consonants occurred in the final position of a word. These findings were consistent with those of the /p, b, m/ cluster. No other differences were found in this group of sounds in terms of facial movement as determined by the measures used in this study.

Conclusions

Within the limitations of the present study, the following conclusions appear to be warranted:

1. Male and female speakers differ in the movement that appears on the surface of the face during the production of monosyllabic homophenous language units. These differences in facial movements would be expected to provide different sets of visual cues to the lipreader.
Any test or evaluation of lipreading performance should take this into account and include speakers of both sexes for an accurate and adequate measurement of that performance. In addition, this finding has implications for the instruction of lipreading. Adequate lipreading instruction should include training with speakers of both sexes, and attempt to teach the student of lipreading what differences might be expected between male and female speakers.

2. It appears that the so-called homophenous consonants /p, b, m/ are accompanied by different intensities and patterns of facial movement for each of the respective consonants, especially in the final position of a word. Such differences in movements on the surface of the face during the production of words using those sounds can be expected to present different sets of facial cues to the lipreader. As a result, lipreaders are not forced to rely on contextual and situational cues alone to discriminate among homophenous words using these sounds. They can also be expected to receive some cues from the face of the speaker to assist in such discrimination. Consequently, words using these sounds should be incorporated into lipreading training with the goal of helping the lipreader identify differences among words using these sounds, as well as making him aware of the possible confusions arising from them. In addition, it may be wise to utilize such words in tests of lipreading performance, as such items may well discriminate sharply between the good lipreader, who has incorporated the differences among these sounds into his lipreading performance, and the poor lipreader, who has not.

3. It would appear that the consonants /t, d, n/ are truly homophenous sounds, at least insofar as movements on those areas of the surface of the face measured by this study are concerned.

4. The voiced and unvoiced affricates, /tʃ,