A PHONETIC ANALYSIS OF TIME-COMPRESSED CNC MONOSYLLABLES

Thesis for the Degree of M.A.
MICHIGAN STATE UNIVERSITY
D. CREIG DUNCKEL
1972

Accepted by the faculty of the Department of Audiology and Speech Sciences, College of Communication Arts, Michigan State University, in partial fulfillment of the requirements for the degree of Master of Arts.

Thesis Committee: Daniel S. Beasley, Ph.D. (Director); William F. Rintelmann, Ph.D.; May Chin

ABSTRACT

A PHONETIC ANALYSIS OF TIME-COMPRESSED CNC MONOSYLLABLES

By D. Creig Dunckel

The purpose of this study was to investigate the effects of the time-compression procedure on the intelligibility of CNC monosyllables. The experimental stimuli were the words of List I, Form B, of the Northwestern University Auditory Test Number 6. The words in this list were time-compressed by 30% through 70%, in 10% steps, in addition to a 0% control condition. Compression was accomplished with the Zemlin modification of the Fairbanks Time-Compressor. The data for this study were gathered by Beasley et al. (1972).

Confusion matrices were made for the errors associated with each degree of time-compression (0%, 30%, 40%, 50%, 60%, and 70%) and each sensation level of presentation (8 dB, 16 dB, 24 dB, and 32 dB). Thus a total of 24 confusion matrices were recorded. These matrices were then condensed over all levels of time-compression.
They were further classified by sensation level, with 8 and 16 dB combined and labelled as low sensation levels, and 24 and 32 dB combined and labelled as high sensation levels. In each matrix the following phonemes were considered: /p, b, t, d, k, g, f, v, θ, s, z, ʃ, m, n, tʃ, dʒ, r, j, h, l, hw, ŋ, w/. The consonants were further classified according to the linguistic features of voicing, nasality, duration, and place of articulation.

Results indicated that a consistency in phonemic errors does exist and may be due to a change in the speech signal by various degrees of time-compression. Further, these phonemic confusions differ depending upon the placement of the phoneme in the word, i.e., initial vs. final position. Different substitution patterns were exhibited for the same phoneme when in the initial position of the word as opposed to the final position. The sensation level of presentation also affected the type of phonemic errors. This was demonstrated by a /b/ for /v/ substitution at the low sensation levels and an /m/ for /v/ substitution at the high sensation levels.

This study further revealed that the distinctive features of the phonemes were also affected by the time-compression procedure. Those phonemes with the greatest duration tended to be least affected by the various time-compression ratios.

A PHONETIC ANALYSIS OF TIME-COMPRESSED CNC MONOSYLLABLES

By D. Creig Dunckel

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF ARTS

Department of Audiology and Speech Sciences

1972

ACKNOWLEDGMENTS

I wish to extend my gratitude to Dr. Daniel S. Beasley, my thesis advisor, and to Dr. William F. Rintelmann and Professor May Chin, the members of my committee, for their assistance in the preparation of this thesis.

I would also like to extend my appreciation to Miss Shelley Schwimmer for the collection of the data used in this thesis.
Last, but not least, I wish to acknowledge my wife, Priscilla, for her faith and endurance in helping me achieve my educational goals.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES

Chapter

I. INTRODUCTION
    Controlled Time-Compression Procedures
    Phonetic Considerations in Perceptual Processing
        Duration of the Speech Signal
        Intensity of the Speech Signal
        Linguistic Features of Consonants
    Summary and Statement of the Problem

II. EXPERIMENTAL PROCEDURES
    Subjects
    Stimulus Generation
    Presentation Procedures
    Analysis
    Procedures for the Present Investigations

III. RESULTS
    Low Sensation Level
        Initial Position
        Final Position
        Combined Data of Initial and Final Phonemes
    High Sensation Level
        Initial Position
        Final Position
        Initial and Final Positions

IV. DISCUSSION
    Consistency of Phonemic Errors
    Initial and Final Positions
    Effects of Time-Compression Levels
    Distinctive Features
    Conclusions
    Implications for Further Research

LIST OF REFERENCES

APPENDICES

Appendix

A. Confusion Matrices for Phonemes in the Initial and Final Positions at Each Level of Time-Compression for Low (8 and 16 dB) and High (24 and 32 dB) Sensation Levels
B. List I, Form B, Northwestern University Auditory Test, Number 6
C. Distinctive Feature Classifications

LIST OF TABLES

Table

1. Phonemic Confusions for Phonemes in the Initial Position Over All Conditions of Time-Compression at Low Sensation Levels
2. Phonemic Confusions for Phonemes in the Final Position Over All Conditions of Time-Compression at Low Sensation Levels
3. Phonemic Confusions for Phonemes in the Initial and Final Positions Combined Over All Conditions of Time-Compression at Low Sensation Levels
4. Percentage of Correct Perception of Distinctive Features at the Low Sensation Levels
5. Phonemic Confusions for Phonemes in the Initial Position Over All Conditions of Time-Compression at High Sensation Levels
6. Phonemic Confusions for Phonemes in the Final Position Over All Conditions of Time-Compression at High Sensation Levels
7. Phonemic Confusions for Phonemes in the Initial and Final Positions Combined Over All Conditions of Time-Compression at High Sensation Levels
8. Percentage of Correct Perception of Distinctive Features at the High Sensation Levels
9. Phonemes Most Frequently Missed at the 70% Time-Compression Level in the Initial Position
A-1. Phonemic Confusions for Phonemes in the Initial Position at 0% Time-Compression at Low Sensation Levels
A-2. Phonemic Confusions for Phonemes in the Final Position at 0% Time-Compression at Low Sensation Levels
A-3. Phonemic Confusions for Phonemes in the Initial Position at 30% Time-Compression at Low Sensation Levels
A-4. Phonemic Confusions for Phonemes in the Final Position at 30% Time-Compression at Low Sensation Levels
A-5. Phonemic Confusions for Phonemes in the Initial Position at 40% Time-Compression at Low Sensation Levels
A-6. Phonemic Confusions for Phonemes in the Final Position at 40% Time-Compression at Low Sensation Levels
A-7. Phonemic Confusions for Phonemes in the Initial Position at 50% Time-Compression at Low Sensation Levels
A-8. Phonemic Confusions for Phonemes in the Final Position at 50% Time-Compression at Low Sensation Levels
A-9. Phonemic Confusions for Phonemes in the Initial Position at 60% Time-Compression at Low Sensation Levels
A-10. Phonemic Confusions for Phonemes in the Final Position at 60% Time-Compression at Low Sensation Levels
A-11. Phonemic Confusions for Phonemes in the Initial Position at 70% Time-Compression at Low Sensation Levels
A-12. Phonemic Confusions for Phonemes in the Final Position at 70% Time-Compression at Low Sensation Levels
A-13. Phonemic Confusions for Phonemes in the Initial Position at 0% Time-Compression at High Sensation Levels
A-14. Phonemic Confusions for Phonemes in the Final Position at 0% Time-Compression at High Sensation Levels
A-15. Phonemic Confusions for Phonemes in the Initial Position at 30% Time-Compression at High Sensation Levels
A-16. Phonemic Confusions for Phonemes in the Final Position at 30% Time-Compression at High Sensation Levels
A-17. Phonemic Confusions for Phonemes in the Initial Position at 40% Time-Compression at High Sensation Levels
A-18. Phonemic Confusions for Phonemes in the Final Position at 40% Time-Compression at High Sensation Levels
A-19. Phonemic Confusions for Phonemes in the Initial Position at 50% Time-Compression at High Sensation Levels
A-20. Phonemic Confusions for Phonemes in the Final Position at 50% Time-Compression at High Sensation Levels
A-21. Phonemic Confusions for Phonemes in the Initial Position at 60% Time-Compression at High Sensation Levels
A-22. Phonemic Confusions for Phonemes in the Final Position at 60% Time-Compression at High Sensation Levels
A-23. Phonemic Confusions for Phonemes in the Initial Position at 70% Time-Compression at High Sensation Levels
A-24. Phonemic Confusions for Phonemes in the Final Position at 70% Time-Compression at High Sensation Levels

CHAPTER I

INTRODUCTION

Audiological evaluation techniques have concerned themselves primarily with detection of lesions in the peripheral auditory system and in the eighth cranial nerve. Recently, however, there has been an increasing interest in detection of lesions in the brain-stem and auditory cortex. The structure and sensitivity required of a test to measure lesions in the higher auditory centers, however, are lacking in present conventional tests. Individuals with disorders of the higher auditory pathways usually exhibit response behavior which appears essentially normal (Willeford, 1969; Katz, 1969). The function of the higher auditory centers is to organize simultaneous or successive elements of the acoustic speech signal into a definite pattern (Bocca and Calearo, 1963). It is recognized that the central auditory pathways provide a sufficient degree of intrinsic redundancy, even when damaged, to allow simple psycho-acoustic elements to reach the cortical integrative and interpretive centers satisfactorily (Bocca and Calearo, 1963). Attention has therefore become centered on tests involving verbal stimuli which have been altered to render comprehension more difficult, adequately taxing this intrinsic redundancy by decreasing extrinsic redundancy.

In an effort to produce verbal stimuli which render comprehension more difficult, stimulus distortion, such as that described by Fairbanks, Everitt, and Jaeger (1954), has been used. A study by Beasley, Schwimmer, and Rintelmann (1972), using ninety-six right-handed, normal hearing young adults, sought to isolate the effects of varying degrees of time-compression on monosyllabic word intelligibility and to examine them with respect to intensity increases in the speech stimuli.
Their study provided the groundwork for further studies, including investigations of auditory pathologies. The results of this study showed that increases in the time-compression of CNC monosyllables up to 60% resulted in gradual decreases in intelligibility. At 70% time-compression a sharp decline in intelligibility occurred. Also, intelligibility increased as sensation level increased from 8 dB SL to 32 dB SL. The results of this study tend to show that time-compressed speech does render comprehension more difficult. Beasley et al. suggested that such stimuli may assist in diagnosing higher auditory pathway lesions.

Fournier (1954) stressed the importance of the time factor in speech discrimination and related the increase in time required by the cortex for identification of a message to the difficulties in speech perception experienced by the elderly. Bordley and Haskins (1955) demonstrated that an increase in rate of presentation of the word "stimulus" resulted in more difficulty in comprehension. Calearo and Lazzaroni (1957) studied the discrimination scores of normal and presbycusic subjects, varying the factors of intensity and syllabic rate and using "short, significant sentences" as stimulus material. They demonstrated that in normal subjects an increase in syllabic rate is almost completely neutralized by a simultaneous increase in intensity. When the accelerated sentence material was presented to the group of presbycusic subjects, however, threshold shift was increased as much as 30 dB for the intermediate speed, while a threshold of speech perception could not be obtained at any intensity level for the highest speed. Furthermore, it was reported that when the accelerated sentences were presented to a group of subjects with temporal lobe lesions, these subjects yielded poor articulation curves when the speeded message was sent to the ear contralateral to the lesion.
de Quiros (1964) used accelerated speech material to test 20 normal subjects, 15 subjects with peripheral hearing losses, seven presbycusics, and several groups of adults and children with central disorders. The speech materials used in his study were "abstract" sentences of approximately ten words for adult subjects, and "concrete" sentences of similar length for use with children. The subjects' articulation scores were considered in relation to the shape of the articulation curve, speech detection threshold, speech reception threshold, and maximum articulation score. Results obtained with the groups of subjects with CNS disorders indicated that in differential diagnosis, accelerated speech testing provided additional information which, when correlated with other findings, may aid in pinpointing sites of brain lesions, especially within the temporal lobe. However, because of the small number of subjects with CNS disorders, the consistency of responses from subject to subject within each category and the differential responses of subjects in each of the categories cannot be considered definitive on the basis of this study.

Controlled Time-Compression Procedures

Fairbanks, Everitt, and Jaeger (1954) produced a controlled electromechanical procedure for compressing speech stimuli. The procedure involved passing a tape over the curved surface of a cylinder and wrapping it around the cylinder enough to make contact with one-quarter of its circumference. This was accomplished by four tape reproducing heads equally spaced around the circumference of the cylinder. When this cylinder is stationary, and the tape is moving at the same speed at which it moved during recording, it makes contact with one of the reproducing heads and the signal is reproduced as recorded. When an adjustment is made for a certain amount of compression, the speed of the tape increases and the cylinder begins to rotate in the direction of the tape motion.
Under conditions of time-compression, each of the four heads makes, and then loses, contact with the tape loop. Each head reproduces, as recorded, the material on that portion of the tape with which it makes contact. When the cylinder is so positioned that one head is just losing contact with the tape while the preceding head is just making contact with the tape, the segment of the tape that is wrapped around the cylinder between these two heads does not make contact with a reproducing head, and is therefore not reproduced. The information not reproduced is referred to as the interval of discard, and was found to be maximally effective when it was 15 to 20 msec in length (Fairbanks and Kodman, 1957). The amount of speech compression is dependent upon the number of discard intervals (non-reproduced segments) per unit of time. The sampling interval ($I_a$) is composed of the discard interval ($I_d$) and the recorded interval ($I_r$), such that

$$I_a = I_d + I_r$$

The sampling frequency ($F_s$), i.e., the rate at which the input signal is sampled, is

$$F_s = 1 / I_a$$

The compression ratio ($R_c$) is then defined as

$$R_c = I_d / I_a = I_d F_s$$

Both $R_c$ and $I_d$ can be manipulated independently since, on the Fairbanks apparatus, the tape speed and the cylinder speed are independently variable. This allows for variation of the temporal value of the discarded portions of the message.

Using this electromechanical apparatus, Fairbanks, Guttman, and Miron (1957) presented a pair of independent message-test units to normal hearing subjects. Each consisted of an extended exposition of technical information and a corresponding test of factual comprehension. The messages were read and recorded at approximately 141 wpm and time-compressed electromechanically by various amounts. Independent groups of subjects were assigned to five experimental conditions which represented a series of compressions from 0% to 70%, and to a sixth test condition in which no message was presented.
Listener aptitude was controlled by forming subgroups of approximately "equal" aptitude for each condition at four different levels. The effect of message-test difficulty was assessed by sub-scoring results according to five message effectiveness levels, based upon differences in responses to test items in the 0% compression condition and the test-only condition. The curve of comprehension as a function of message time was found to be sigmoid. Response scores were approximately 50% of the maximum when the message was compressed by 60%, and slightly less than 90% of the maximum when the message was compressed by 50%. It was concluded that the interaction of time-compression and message effectiveness significantly affected comprehension of factual material.

To isolate the effect of reduced stimulus duration on intelligibility, Fairbanks and Kodman (1957) presented Egan's phonetically balanced (PB-50) word lists to highly trained listeners. The words were presented at a constant intensity at varying ratios of time-compression. The results of their study demonstrated that time-compression, up to 20% of the original signal duration, had no significant influence on intelligibility when the stimulus words were presented at a comfortable listening level.

It has been demonstrated that the PB-50 word lists reflect low reliability and a wide range of word difficulty among the test items (Eldert and Davis, 1951). Although the W-22 word lists were devised (Hirsh et al., 1952) in an attempt to overcome the reliability problems in discrimination testing, it was not until the development of the Northwestern University Auditory Test No. 6 (NU-6) that the latter difficulty was adequately solved (Tillman and Carhart, 1966). The development of this test was based upon studies performed with an earlier test, the Northwestern University Auditory Test No. 4 (NU-4), developed by Tillman, Carhart, and Wilbur (1963).
This test was used extensively in the Auditory Research Laboratories at Northwestern University for a two-year period. It proved to be a valuable tool in the measurement of speech discrimination. The NU-4 consisted of six randomizations of two 50-word lists composed of phonemically balanced monosyllables of the consonant-nucleus-consonant (CNC) variety, selected from a pool of such words compiled by Lehiste and Peterson (1959). In addition, the 50 words in each list contained a distribution of phonemes proportional to that in the Thorndike and Lorge (1952) word list. It was found that even with six equivalent forms of each list, the investigation of a large number of listening conditions could not be accomplished without several repetitions of the various forms and lists.

Because of this limitation the NU-4 was revised and expanded into the NU-6, which consists of four randomizations of four phonemically balanced word lists, each composed of 50 monosyllables. From the studies of Tillman and Carhart (1966), it was found that the NU-6 compares favorably with the NU-4 in interlist equivalence and test-retest reliability. Further, as with the NU-4, subjects with conductive hearing losses yielded articulation functions closely duplicating those of normal subjects, whereas subjects with sensori-neural impairment yielded more gradually rising articulation functions and lower mean maximum articulation scores. The NU-6, therefore, satisfies the two basic requirements of a diagnostically useful test of auditory discrimination: (a) a substantial segment of the articulation function, depicted graphically as an articulation curve, is linear, so that the value of its slope may be precisely measured, and (b) the slope of the articulation function has the potential to differentiate diagnostically among various auditory pathologies.
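As a worked illustration of the sampling relations stated earlier in this section (sampling interval Ia = Id + Ir, sampling frequency Fs = 1/Ia, and compression ratio Rc = Id/Ia), the following sketch computes the remaining quantities from a chosen discard interval and compression ratio, and then applies a periodic discard to a sequence of samples. It is a simplified digital analogue under stated assumptions, not a model of the Fairbanks apparatus: the function names are hypothetical, each period is assumed to begin with the retained segment, and no smoothing is applied at the splices.

```python
def sampling_parameters(discard_ms, compression_ratio):
    """Return (Ia in msec, Ir in msec, Fs in Hz) for a given Id and Rc.

    Uses Rc = Id / Ia, Ia = Id + Ir, and Fs = 1 / Ia.
    """
    if not 0.0 < compression_ratio < 1.0:
        raise ValueError("compression ratio must lie strictly between 0 and 1")
    sampling_ms = discard_ms / compression_ratio   # Ia = Id / Rc
    retained_ms = sampling_ms - discard_ms         # Ir = Ia - Id
    sampling_hz = 1000.0 / sampling_ms             # Fs = 1 / Ia, Ia converted to seconds
    return sampling_ms, retained_ms, sampling_hz


def periodic_discard(samples, retained_n, discard_n):
    """Keep retained_n samples, drop the next discard_n, and repeat.

    Assumption: every period starts with the retained segment.
    """
    period = retained_n + discard_n
    kept = []
    for start in range(0, len(samples), period):
        kept.extend(samples[start:start + retained_n])
    return kept


# A 20 msec discard interval at 50% compression implies Ia = 40 msec,
# Ir = 20 msec, and Fs = 25 samplings per second.
ia, ir, fs = sampling_parameters(20.0, 0.5)

# Dropping 7 of every 10 samples corresponds to Rc = 0.7.
compressed = periodic_discard(list(range(100)), retained_n=3, discard_n=7)
```

Under these assumptions, 70% compression leaves 30 of every 100 samples, which is the sense in which the compression ratio expresses the proportion of the message that is discarded.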
Phonetic Considerations in Perceptual Processing

The acoustic message is correlated to sounds which can be characterized by their amplitude and frequency in the case of a pure tone, or by the amplitude and spectrum in the case of a complex sound. The physical quality of frequency corresponds to the sensation of pitch, the amplitude to the sensation of loudness, and the spectrum to the sensation of timbre or quality. While a sound must be perceived to be identified, it is by no means certain that once a sound and/or word is perceived, it will be identified. Thus the object of vocal audiometry is to study the intelligibility of speech.

As Malmberg (1970) points out, the clarity of the speech message is very important for the identification of the message. The intensity has a very definite effect on the clarity. Experiments carried out with a message spoken at normal intensity and then played back with various attenuations show that discrimination is optimum between 60 and 70 dB SPL, and becomes more difficult above 80 dB SPL (Licklider and Miller, 1938). Licklider and Miller (1938) also pointed out that information pertaining to the duration of the message and the distortion involved in the production of the message is essential in determining the intelligibility of the message. "Speeded-up" speech thus reduces the possibilities of identification even if the frequencies involved are not essentially modified.

One way of studying distorted speech is through filtering. The first studies of clarity as a function of distortion in the form of filtering were carried out by Fletcher (1929). He showed that in order to retain 50% of the phonetic clarity the cut-off frequency of a filter should not exceed 1200 Hz for a high pass filter, or be less than 1700 Hz for a low pass filter.
Lafon (1961) showed that the replacement of one speech sound by another in a discrimination task is not a random process, but depends on the significant acoustic feature in question. Contoids are more often incorrectly heard than vocoids. As regards the plosives, /k/ is usually mistaken for a /t/, or sometimes for a /p/. The /p/, whose explosion is less accentuated, may be mistaken for the corresponding voiced contoid, /b/. The /r/ may be confused with any of the voiced contoids, without particular preference except perhaps for /l/, but it practically never replaces another speech sound. The /l/ and the /m/ are most often replaced by /n/, and /j/ is mistaken for /l/ and /n/. Further, Lafon (1961) found that the labio-dental /f/ is most often heard incorrectly and that the sibilants are most often mistaken for one another, typically in the form of voiced for voiced and unvoiced for unvoiced confusions.

Duration of the Speech Signal. The duration of the speech signal helps to distinguish certain phonemes from other phonemes. For example, the plosive consonants /p, t, k, b, d, g/ are characterized by a burst. Further, the voiced plosives /b, d, g/ consume more time than their voiceless counterparts /p, t, k/. Fricatives and affricates /f, θ, s, ʃ, tʃ, v, ð, z, ʒ, and dʒ/ are distinguished by friction and by relatively greater duration than plosives. All durational phenomena can be demonstrated spectrographically (Jakobson et al., 1952).

The durational aspect of the consonants contributes to distinguishing plosives from fricatives. Voiceless plosives require a minimum of temporal clues for recognition, followed by the voiced plosives, slit fricatives, and the grooved fricatives, respectively.
Black and Singh (1970) reported that the errors made by a panel of listeners, when increasing amounts of the initial portions of consonants were removed, consisted of /f/ and /v/ being confused as /b/, /s/ as /t/, /z/ as /d/, /ʃ/ as /t/, and /ʒ/ as /d/.

In contrast, Tiffany (1953) compared the intelligibility of sections of vowels of 0.08, 0.2, 0.5 and 8.0 sec, and concluded that added duration, beyond a "natural duration" in speech, does not contribute to recognition. This result was in agreement with Siegenthaler's (1950) findings that ten "long-sustained" vowels were only 50% recognizable.

Miller and Licklider (1940), using a method of deleting alternate sections of recorded speech without affecting the over-all duration of the passage, demonstrated that much of the continuous time consumed by words can be deleted without seriously impairing the perception of the words. They varied both the frequency and the duration of the interruption of the signal and used recognition of monosyllables as a criterion measure. With one interruption per second and with the interruption lasting 0.5 sec, the words were 40% intelligible, and when the interruption was 0.25 sec they were 80% intelligible. With ten interruptions per second and only 25% of the word remaining, the reception score was 60%, thus showing that as more of the speech signal was removed, intelligibility decreased. Further, Garvey and Henneman (1952) found that more than 60% of the speech pattern had to be removed before intelligibility decreased below 80%. It was suggested by Daniloff, Shriner, and Zemlin (1968) that at high compression ratios a normal listener may not have enough time or information to process incoming verbal stimuli perceptually in a correct manner, thus lowering the intelligibility of the signal. Neither of these studies attempted to make an analysis of the errors made by the listeners.
In essence, the duration of the signal, whether it is lengthened or shortened, plays a role in the listener's ability to integrate the signal in the higher auditory pathways. Many authors (Fletcher and Steinberg, 1929; Black, 1952; Stevens, 1946; and Miller, Heise, and Lichten, 1951) have found that accurate perception of a spoken word also depends to a considerable extent upon the frame of expectations within which a word occurs and the length of the signal. Thus, nonsense words may be less intelligible than meaningful units, and monosyllables may be less intelligible than polysyllabic units and/or sentences, up to a point (Beasley and Shriner, 1972).

Intensity of the Speech Signal. The intensity or the level of presentation of the signal plays an important role in the listener's ability to perceive and interpret the signal. Beasley, Schwimmer, and Rintelmann (1972) presented time-compressed stimuli at sensation levels of 8, 16, 24, and 32 dB and demonstrated that as the sensation level of presentation increased, the intelligibility of the stimuli increased. Further, the largest increase in intelligibility due to an increase in the intensity of the signal occurred between the 8 and 16 dB sensation levels.

Miller and Nicely (1955) used six different signal-to-noise ratios ranging from -12 to +12 dB in 6 dB steps. They demonstrated that as the S/N ratio increased, the intelligibility of the stimuli increased.

Stevens (1938) investigated psychophysical data for pure tones as related to intensity, but unfortunately there are practically no psychophysical data relating the loudness of speech sounds to their intensity except for some general observations made by Fletcher (1929). He found that sounds with a large number of frequency components increase more rapidly in loudness with a rise in intensity than do sounds with fewer components.
Further, he noted that sounds with most of their energy in the low frequency region increase more rapidly in loudness than those with high frequency energy concentrations. Thus the loudness of any complex sound will depend upon its frequency components, with the vowels being perceived as louder than consonants because of the point of their energy concentration. That is, words that have sounds with most of their energy in the low frequency region may be perceived as louder, and thus as more intelligible, than words whose sounds have energy in the high frequencies.

Linguistic Features of Consonants. There are many ways of classifying phonemes according to features and the articulation process used to generate these sounds. These features help the listener to distinguish one phoneme from another and thus enhance the intelligibility of the phoneme. According to one theory, in order to decode the message, the receiver extracts the distinctive features from the acoustical data (Jakobson and Halle, 1970) in order to make perceptual decisions.

Distinctive features can be broken into two classes: the prosodic features, including force, quality, and tone, and the inherent features, which are divided into three subclasses: sonority, protensity, and tonality. Sonority, protensity, and tonality correspond to the prosodic features of force, quality, and tone (Jakobson and Halle, 1970). The sonority features are classified in the following manner: (1) vocalic/non-vocalic, (2) consonantal/non-consonantal, (3) nasal/oral, (4) compact/diffuse, (5) abrupt/continuant, (6) strident/non-strident (mellow), (7) checked/unchecked, (8) voiced/voiceless. The protensity feature is tense/lax and the tonality features are grave/acute, flat/non-flat, and sharp/non-sharp (Jakobson and Halle, 1970).
Miller and Nicely (1955), in an experiment using English consonants transmitted with frequency distortion and random masking, confirmed that the perception of each of these features is relatively independent of the perception of the others. This study involved 16 consonants followed by the vowel /a/ as in "father." These phonemes were recorded under six conditions of signal-to-noise ratio ranging from -12 to +12 dB, in 6 dB steps. Different frequency responses were measured at the +12 dB S/N condition.

In this study the 16 consonants were classified according to the following linguistic features: (1) voicing, including /b, d, g, v, ð, z, ʒ, m, n/ as voiced phonemes and /p, t, k, f, θ, s, ʃ/ as unvoiced phonemes; (2) nasality, including /m/ and /n/; (3) affrication, classified according to articulatory production, including /f, θ, s, ʃ, v, ð, z, ʒ/; (4) duration, whereby /s, ʃ, z, ʒ/ were separated from the other 12 consonants because of the former's added duration; and (5) place of articulation, according to tongue placement in the front, middle, or back of the oral cavity, with /p, b, f, v, m/ considered as front, /t, d, θ, s, ð, z, n/ as middle, and /k, g, ʃ, ʒ/ as back consonants.

The Miller and Nicely (1955) study showed that these five linguistic features could be effectively used for grouping consonants, and that their distinctive features do aid in the identification of one phoneme from another under varying signal-to-noise conditions. The results indicated that voicing and nasality are much less affected by a random masking noise than the other features. The results for affrication and duration were very similar and superior to place of articulation for auditory discrimination, but inferior to nasality and voicing. Place of articulation was difficult to discriminate at ratios less than +6 dB, whereas nasality and voicing could be discriminated at signal-to-noise ratios of -12 dB.
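The five-feature grouping described above lends itself to a simple tabulation. The sketch below encodes the 16 consonants with the feature values given in the text and scores, for a set of stimulus-response counts, the proportion of responses on which each feature was preserved. It is offered only as an illustration of the bookkeeping: the function name and the response counts are hypothetical and are not data from Miller and Nicely (1955) or from the present study.

```python
FEATURE_NAMES = ("voicing", "nasality", "affrication", "duration", "place")

# Feature values follow the grouping in the text.
# place: 0 = front, 1 = middle, 2 = back.
FEATURES = {
    "p": (0, 0, 0, 0, 0), "t": (0, 0, 0, 0, 1), "k": (0, 0, 0, 0, 2),
    "f": (0, 0, 1, 0, 0), "θ": (0, 0, 1, 0, 1), "s": (0, 0, 1, 1, 1),
    "ʃ": (0, 0, 1, 1, 2),
    "b": (1, 0, 0, 0, 0), "d": (1, 0, 0, 0, 1), "g": (1, 0, 0, 0, 2),
    "v": (1, 0, 1, 0, 0), "ð": (1, 0, 1, 0, 1), "z": (1, 0, 1, 1, 1),
    "ʒ": (1, 0, 1, 1, 2),
    "m": (1, 1, 0, 0, 0), "n": (1, 1, 0, 0, 1),
}


def feature_scores(confusions):
    """Proportion of responses preserving each feature.

    confusions maps (spoken, heard) pairs to response counts.
    """
    total = sum(confusions.values())
    preserved = dict.fromkeys(FEATURE_NAMES, 0)
    for (spoken, heard), count in confusions.items():
        pairs = zip(FEATURE_NAMES, FEATURES[spoken], FEATURES[heard])
        for name, spoken_value, heard_value in pairs:
            if spoken_value == heard_value:
                preserved[name] += count
    return {name: preserved[name] / total for name in FEATURE_NAMES}


# Hypothetical counts for the target /v/: six correct responses,
# three /b/ substitutions, and one /m/ substitution.
example = {("v", "v"): 6, ("v", "b"): 3, ("v", "m"): 1}
scores = feature_scores(example)
```

With these illustrative counts, a /b/ for /v/ substitution costs only the affrication feature, whereas an /m/ for /v/ substitution costs both affrication and nasality; voicing, duration, and place are preserved in every response.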
Summary and Statement of the Problem

In summary, a review of the literature suggests that time-compressed speech stimuli may have value as a diagnostic tool for identifying lesions of the higher auditory pathways (Bocca and Calearo, 1963; Calearo and Lazzaroni, 1957; deQuiros, 1964). However, several studies reveal that when the speech signal is distorted by altering its duration and/or its intensity, certain phonemes may be confused with other phonemes (Lafon, 1961; Jakobson et al., 1952; Black and Singh, 1970).

The purpose of this study was to analyze those errors in the form of perceptual phonemic confusions in response to stimuli time-compressed 0%, 30%, 40%, 50%, 60%, and 70%. Further, the effect of the linguistic features of voicing, nasality, duration, and place of articulation in determining the intelligibility of the speech signal was investigated. In addition, perceptual confusions associated with low and high sensation levels of presentation and their effect upon phonemic confusions and signal intelligibility were studied.

Specifically, the following questions were investigated relative to the responses of normal hearing, young adult subjects to a standardized, monosyllabic word list:

1. Is there a consistency in phonemic errors due to a change in the speech signal by various degrees of time-compression (30% to 70% in 10% steps)?
2. Do these errors differ when the phoneme is in the initial or final position of the word?
3. Does the type of phonemic error differ as a function of sensation level?
4. How are the various distinctive features affected as a result of time-compression?

CHAPTER II

EXPERIMENTAL PROCEDURES

The data for this study were obtained from a study by Beasley, Schwimmer, and Rintelmann (1972).

Subjects

The subjects were 96 normal hearing right-handed young adults selected from a university population. These subjects were randomly assigned to six groups of sixteen each.
Each subject was required to pass a sweep frequency screening test presented at a Hearing Level of 22 dB (re: ISO, 1964 Standard) at octave intervals ranging from 125 Hz to 8,000 Hz to ensure normal hearing bilaterally. Also, a live-voice presentation of the CID W-1 Word List was administered unilaterally to obtain the Speech Reception Threshold (SRT) for the designated test ear.

Stimulus Generation

The experimental stimuli used in this study were the four lists of Form B of the NU-6 (Tillman and Carhart, 1966). The four word lists were recorded at normal conversational speech and effort level by a trained white male talker who spoke General American English under controlled recording procedures (Rintelmann and Jetty, 1968). An Ampex Model 601 tape deck (frequency response 50-12,000 Hz, ±2 dB) and an Ampex Model 600-2 tape deck (frequency response 50-13,000 Hz, ±2 dB) were used to make copies of each of the four recorded lists. The copies were then temporally processed using the Fairbanks electromechanical time-compression apparatus (Fairbanks, Everitt, and Jaeger, 1954), as modified by Zemlin (1971). Each list was time-compressed by 30%, 40%, 50%, 60%, and 70%, and was also passed through the time-compression mechanism at 0% compression in order to control for possible fidelity distortion when using the tapes. In all, there were six time-compressed recordings for each of the four lists, resulting in 24 experimental tape recordings. The experimental tapes were copied using an Ampex Model 601 tape recorder and an Ampex AG 500-2 (frequency response 50-13,000 Hz, ±2 dB) monitored by an Ampex AA 620 power amplifier.

Presentation Procedures

The 96 subjects were divided into six groups, corresponding to the six different percentages of time-compression under study. Each subject within a single group was presented with the four lists of Form B of the NU-6. Each list was presented at one of four sensation levels: 8 dB, 16 dB, 24 dB, and 32 dB.
The order of presentation of the four sensation levels was rotated within each group. In this manner, each test list was presented a total of four times for each time-compression condition at each sensation level, and the sensation levels were counterbalanced to avoid possible order effects of sensation level presentation.

A prefabricated double-walled test chamber (IAC 1200 series) was used in testing each subject individually. Ambient noise in the test room was sufficiently low (45 dB on the C-scale of a Brüel and Kjær sound level meter) that it did not interfere at even the lowest sensation level.

Analysis

Each subject's response was recorded on an answer sheet which the experimenter hand-scored. The data were then converted to percentage correct scores and were plotted as articulation curves. The slopes of these articulation functions were calculated for each condition of time-compression. In addition, graphic data were computed for sensation level and the respective interactions were studied.

Procedures for the Present Investigation

Phonemic confusion matrices were made using list 1, Form B, of the N.U. Auditory Test No. 6. This list was used because of its difficulty when compared with the other lists (2, 3, and 4) of Form B (Jetty and Rintelmann, 1968). Further, it was felt that list 1, because of its difficulty, would be more representative of the errors made by the subjects. Confusion matrices were made for the errors associated with each degree of time-compression (0%, 30%, 40%, 50%, 60%, and 70%) and each sensation level of presentation (8 dB, 16 dB, 24 dB, and 32 dB). Thus a total of 24 confusion matrices were recorded. These matrices were then condensed over all conditions of time-compression levels. They were further classified by sensation levels, with 8 and 16 dB combined and labelled as low sensation levels, and 24 and 32 dB combined and labelled as high sensation levels.
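The condensing step described above (24 matrices pooled over time-compression and then grouped into low and high sensation levels) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used to tabulate the data: the dictionary layout, the function name condense, and the toy counts are all hypothetical.

```python
from collections import defaultdict

LOW_LEVELS = (8, 16)    # dB SL, labelled "low" in the text
HIGH_LEVELS = (24, 32)  # dB SL, labelled "high" in the text


def condense(matrices):
    """Pool confusion counts over time-compression within each sensation-level band.

    matrices maps (compression_percent, sensation_level_db) to a dict of
    {(stimulus_phoneme, response_phoneme): count}.
    """
    pooled = {"low": defaultdict(int), "high": defaultdict(int)}
    for (_compression, level), cells in matrices.items():
        if level in LOW_LEVELS:
            band = "low"
        elif level in HIGH_LEVELS:
            band = "high"
        else:
            raise ValueError(f"unexpected sensation level: {level}")
        for pair, count in cells.items():
            pooled[band][pair] += count
    return {band: dict(cells) for band, cells in pooled.items()}


# Toy counts for illustration only (not data from the study).
toy = {
    (0, 8): {("v", "v"): 3, ("v", "b"): 1},
    (70, 16): {("v", "b"): 2},
    (70, 32): {("v", "m"): 4},
}
print(condense(toy))
```

Keeping the stimulus and response together as the key means the same function works whether the matrix covers initial phonemes, final phonemes, or both.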
These matrices were further subdivided according to 20 initial and 19 final positions of the phonemes under study. In each matrix the following phonemes were considered: / p, b, t, d, k, g, f, v, θ, s, z, ʃ, m, n, tʃ, dʒ, r, j, h, l, hw, ŋ /. If the subject made no response, this was recorded in the data.

The consonants were classified according to the linguistic features of voicing, nasality, duration, and place of articulation, in the manner described by Miller and Nicely (1955). / b, d, g, v, z, dʒ, m, n, ŋ, r, l, hw, j / were classified as voiced, and / p, t, k, f, θ, s, ʃ, tʃ, h / as voiceless. / m, n, and ŋ / were classified under nasality. Consonants classified as longer in duration were / s, ʃ, z, dʒ /. Consonants were also classified as either front, middle, or back under place of articulation. The front consonants were / p, b, f, v, m /; the middle consonants were / t, d, θ, s, z, n, ŋ, j, tʃ /; and the back consonants were / k, g, ʃ, dʒ, r, l, h, hw /. The phonemic confusion matrices were computed according to the above classification system.

CHAPTER III

RESULTS

The results of this study show certain consistencies associated with confusions at varying time-compression levels. They further reveal that these confusions may differ depending upon the position of the phoneme in the word, i.e., initial vs. final. Further, the results show that the lower sensation levels resulted in more errors. The results also reveal that phonemes with certain distinctive features were more intelligible than others.

Tables 1 through 8 show the errors made at the various time-compression levels (0% through 70%). They further depict these errors by phoneme placement in the word (initial placement, final placement, and initial plus final placement). Also recorded in the tables are the percentage of time the phoneme was perceived correctly and the percentage of time it was perceived as another phoneme where applicable.
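The two percentages reported in the tables (percent correct, and percent of the most common substitution) can both be read off one row of a confusion matrix. The sketch below assumes a row is stored as a dictionary of response counts with "NR" marking no response; the function name summarize_row and the counts are hypothetical and are not data from this study.

```python
def summarize_row(stimulus, responses):
    """Summarize one confusion-matrix row.

    responses maps each response phoneme (or "NR" for no response)
    to the number of times it was given for this stimulus phoneme.
    """
    total = sum(responses.values())
    if total == 0:
        raise ValueError("row has no responses")
    correct = responses.get(stimulus, 0)
    # Substitutions exclude correct responses and non-responses.
    substitutions = {r: n for r, n in responses.items() if r not in (stimulus, "NR")}
    top = max(substitutions, key=substitutions.get) if substitutions else None
    return {
        "percent_correct": round(100 * correct / total),
        "top_substitution": top,
        "percent_top_substitution": round(100 * substitutions[top] / total) if top else 0,
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
print(summarize_row("v", {"v": 8, "b": 18, "m": 4, "NR": 2}))
```

Here both percentages are taken over all presentations of the stimulus, including non-responses; whether the tabulated values used that denominator is an assumption of this sketch.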
Low Sensation Level

Tables 1 through 3 reveal errors in the form of phonemic confusion matrices for the stimulus items presented at the low sensation levels (8 and 16 dB) averaged over all conditions of time-compression.

TABLE 1.--Phonemic confusions for phonemes in the initial position over all conditions of time-compression at low sensation levels. NR = No response; S = Substitution; C = Correct. [Confusion matrix not legible in the source.]

Comparison of Table 1 (initial errors) and Table 2 (final errors) shows that overall, there were more final position phonemic confusions than initial position confusions. Further, reference to Tables 1A through 12A in Appendix A reveals that the major increase for initial position errors did not occur until 70% time-compression, whereas for final position errors, the major increase was at 60% time-compression.

Initial Position. Reference to Table 1 shows that the / ʃ / and / tʃ / phonemes were perceived correctly most often (97% and 90%, respectively), whereas the voiced labio-dental / v / was perceived correctly least often (25%). The most common substitution for / v / was the voiced bilabial / b /. The remaining phonemes have percentage correct scores from 54% to 88%.
The phonemes / p, t, g, v, h, and r / revealed the highest degrees of consistent errors and suggest that most of the substitutions were ones associated with voicing, whereby a voiced phoneme was replaced by a voiceless phoneme and vice versa, e.g., b/f and g/hw. This was followed by errors associated with place of articulation (t/k and g/d). The b/v substitution, while not specifically fitting the analytic paradigm utilized, should be considered a place error, i.e., from a labio-dental to bilabial substitution.

TABLE 2.--Phonemic confusions for phonemes in the final position over all conditions of time-compression at low sensation levels. NR = No response; S = Substitution; C = Correct. [Confusion matrix not legible in the source.]

Final Position. All phonemes appear to be less stable in the final position. Whereas the phoneme / ʃ / was perceived correctly 97% of the time in the initial position, it was perceived correctly 83% of the time in the final position. Further, no phonemes were perceived correctly 90% or more of the time in the final position.

In reference to Table 2, the nasal / ŋ / was perceived correctly the largest percentage of time (88%). In contrast, the voiced lingual-dental plosive / d / was perceived correctly least often (32%) and was replaced by its voiceless counterpart / t / 30% of the time. Further, the phoneme / g / was perceived correctly 35% of the time and was most often replaced by the bilabial nasal / m / (27%).
The voiceless labio-dental fricative / f / was replaced by the voiceless plosive / p / 37% of the time and was perceived correctly 47% of the time. Again, voicing and place of articulation appear to play a major role in determining substitution patterns. It should be noted that in the initial position, the phoneme / v / had the lowest percentage correct score (25%), whereas in the final position it was correctly perceived 81% of the time.

The phonemes / t / and / p / tended to be substituted most often for other phonemes in the final position. This was not the case in the initial position, where the phoneme / b / was substituted most often.

Combined Data of Initial and Final Phonemes. Table 3 shows the errors made when the initial and final positions are combined. The scores for this condition fell between those obtained when the initial and final positions were looked at separately. This becomes important when considering the total intelligibility of the / v / and / f / phonemes. The sibilant phonemes appeared to be the most stable in the combined initial and final position. They were perceived correctly over 80% of the time under all conditions. In reference to Table 4, it is again the sibilants, which are characterized by their durational aspects, that were perceived correctly most often (85%). In contrast, the voiceless labio-dental / f / was least stable (51%). When viewing Table 4, the voiceless phonemes when considered together at low sensation levels were perceived correctly 73% of the time. The voiced and voiceless bilabial plosives / b / and / p / were most often substituted for the / f / phoneme (21% and 18%, respectively). The phoneme / v / was correctly perceived 53% of the time and was most often replaced by the / b / phoneme. This is due to the small percentage obtained for the / v / phoneme in the initial position (25%). Although the voiceless phoneme / θ / was perceived correctly only 53% of the time, no clear substitution pattern was exhibited.

TABLE 3.--Phonemic confusions for phonemes in the initial and final positions combined over all conditions of time-compression at low sensation levels. NR = No response; S = Substitution; C = Correct. [Confusion matrix not legible in the source.]

TABLE 4.--Percentage of correct perception of distinctive features at the low sensation levels (+8 and 16 dB SL), all time-compression levels.

Feature                   Number of Times        Total     %
                          Correctly Perceived
Nasal                     459                    672       68%
Voiced                    1924                   2640      73%
Unvoiced                  1545                   2112      73%
Duration                  571                    672       85%
Place of Articulation
  Front                   663                    1056      63%
  Middle                  1295                   1824      68%
  Back                    1511                   1872      81%

Further, Table 4 shows the phonemes classified into distinctive features and the percentage of correct perception associated with each classification. This table represents phonemes in both the initial and final position collapsed over all degrees of time-compression. Those phonemes classified under duration were perceived correctly most often (85%).
At the low sensation levels, both the voiced and voiceless phonemes were perceived correctly 73% of the time. Table 4 reveals that those phonemes classified under nasality were perceived correctly 68% of the time. Those phonemes whose point of articulation is the back of the oral cavity were perceived correctly most often (81%) when compared with those phonemes with middle and frontal points of articulation (middle, 68%, and front, 63%).

High Sensation Level

Tables 5 through 8 depict the results obtained when the sensation levels of 24 dB and 32 dB were considered across the several levels of time-compression (0%, 30%, 40%, 50%, 60%, and 70%). It should be noted that most of the errors can be attributed directly to the time-compression procedure, as the intensity of the signal is great enough for accurate intelligibility.

TABLE 5.--Phonemic confusions for phonemes in the initial position over all conditions of time-compression at high sensation levels. NR = No response; S = Substitution; C = Correct. [Confusion matrix not legible in the source.]

Tables 5 through 7 reveal those errors made when the phoneme was in the initial position, final position, and initial and final position combined. Also recorded are the percentage of times the phoneme was perceived correctly and the percentage of time it was perceived as another phoneme where applicable.

Comparison of Table 5 (initial errors) and Table 6 (final errors) revealed that overall there were more final phonemic confusions than initial position confusions.
Further, reference to Tables 13 to 24 in Appendix A revealed that the major increase for initial position errors did not occur until the 70% time-compression level, whereas the majority of the final position errors did not occur until the 60% time-compression level.

The results further indicated that as the percentage of time-compression increases, intelligibility decreases (see Appendix A, Tables 13 to 24). In addition, it should be noted that the decrease in intelligibility is relatively gradual over the several conditions of time-compression until 70%, at which point there was a dramatic breakdown in overall intelligibility.

Initial Position. Table 5 reveals that all phonemes except / g / and / v / in the initial position were perceived correctly 90% (or better) of the time through all conditions of time-compression. The voiced guttural / g / was correctly perceived 87% of the time, with the voiced lingual-dental / d / being substituted 10% of the time. The voiced labio-dental / v / was correctly perceived least often (65%), with the nasal bilabial / m / being substituted for it 20% of the time. This suggests that the substitutions were ones associated with place of articulation. It should be noted that at the lower sensation levels (8 and 16 dB) the phoneme / b / was substituted for / v /, whereas at the 24 dB and 32 dB sensation levels, the phoneme / m / was substituted.

At the 70% time-compression level, phonemes in the initial position were dramatically reduced in intelligibility when compared with the other levels of time-compression (see Appendix A, Tables 13 through 24). The percentages of correct perception of the phonemes were all depressed when compared with Table 5. The phonemes / j / and / ʃ / tended to be the most stable at the 70% time-compression level (100% and 94%, respectively). Further, definite substitution patterns occurred. The voiced guttural phoneme / g / was substituted for the voiceless back phoneme / h / 38% of the time.
It is interesting to note that the / k / phoneme was most often substituted for the voiceless lingual-dental plosive / t /, and vice versa (see Table 24 in Appendix A).

Final Position. Table 6 shows the results in the final position. Whereas Table 5 shows that 18 out of 20 phonemes were perceived correctly 90% (or better) of the time in the initial position, Table 6 reveals that only 12 of 19 phonemes were correctly perceived 90% of the time or better. Dramatic decreases are seen in the / θ / phoneme (92% in the initial position and 58% in the final position), the / d / phoneme (96% vs. 79%), the / b / phoneme (14% vs. 76%), and the / n / phoneme (95% vs. 79%). Table 6 reveals that the labio-dental / v / was correctly perceived 100% of the time in the final position, whereas Table 5 shows that the / v / phoneme was correctly perceived only 65% of the time in the initial position over all levels of time-compression.

Initial and Final Positions. Table 7 shows the errors made when the initial and final positions are combined over all levels of time-compression. As can be seen, all of the phonemes fell within the eighty and ninety percent correct categories, except the / θ / phoneme. The / θ / phoneme was perceived correctly least often (75%) and was most often replaced by the voiceless labio-dental / f / phoneme (15%). Table 8 reveals that this substitution is apparently related to a place of articulation type error. It should also be noted that Table 8 reveals that the phonemes classified under frontal place of articulation, including the / f / phoneme, obtained an intelligibility score of 91%. In contrast, those phonemes classified as having middle oral cavity place of articulation, including the / θ / phoneme, obtained an intelligibility score of 89%.

TABLE 6.--Phonemic confusions for phonemes in the final position over all conditions of time-compression at high sensation levels. NR = No response; S = Substitution; C = Correct. [Confusion matrix not legible in the source.]

TABLE 7.--Phonemic confusions for phonemes in the initial and final positions combined over all conditions of time-compression at high sensation levels. [Confusion matrix not legible in the source.]

Further, Table 8 shows that phonemes classified under duration were perceived correctly most often (98%). In contrast, those phonemes classified under nasality and those under middle place of articulation were perceived correctly least often (89%). These scores were obtained for phonemes in both the initial and final positions under all levels of time-compression.

TABLE 8.--Percentage of correct perception of distinctive features at the high sensation levels (+24 and 32 dB SL), all time-compression levels.

Feature                   Number of Times        Total     %
                          Correctly Perceived
Nasal                     597                    672       89%
Voiced                    2393                   2640      91%
Unvoiced                  2140                   2304      93%
Duration                  656                    672       98%
Place of Articulation
  Front                   959                    1056      91%
  Middle                  1618                   1824      89%
  Back                    1770                   1872      95%

CHAPTER IV

DISCUSSION

Many investigators have considered distorted speech tests as a means of distinguishing lesions of the higher auditory pathways from those involving the cochlea (Bocca and Calearo, 1963; Bocca, 1967). One form of distorted speech is electro-mechanical time-compression of the speech signal as developed by Fairbanks, Everitt, and Jaeger in 1954.
The potential use of time-compressed speech for detecting lesions of the higher auditory pathway has been demonstrated by Calearo and Lazzaroni (1957), deQuiros (1964), and Beasley, Schwimmer, and Rintelmann (1972). In the latter study, normative data were gathered using a standardized discrimination test (the Northwestern University Auditory Test No. 6). To date no analysis has been undertaken to determine what effects the time-compression procedure has on phonemic perceptual confusions. A discussion of these effects is relevant to further research involving the time-compression procedure.

Consistency of Phonemic Errors

This study has shown that the replacement of one phoneme by another is not a random process, a finding supported by Lafon (1961) and Miller and Nicely (1955). This study further revealed that the intensity level of presentation above speech threshold played a role in the intelligibility of the stimuli. This occurred for all conditions of time-compression. The effects of intensity increments lessen, as would be expected, as an optimal listening intensity is approached. Calearo and Lazzaroni (1957) noted that there is a tendency for intensity to neutralize the effects of speech acceleration for normal subjects. This was further supported by Beasley et al. up to the 70% time-compression ratio, at which a dramatic decrease in intelligibility occurred regardless of sensation level.

The consistency of the errors made is also altered by the level of presentation. At low sensation levels the phonemes most affected in the initial position were / p, t, k, f, θ, g, and v /. At the high sensation levels, only the initial position phonemes / g / and / v / fell below a 90% correct intelligibility score through the first five levels of time-compression. At the 70% compression level not only did the / g / and / v / remain unintelligible, but in addition the initial position phonemes / p, t, k, f, θ, l, and h / became
unintelligible. This was also true at 70% for low sensation levels of presentation. It should be noted that many of the errors made at the low sensation levels may be due to the low intensity of presentation as well as to the varying time-compression ratios.

Those errors made in the final position do not necessarily correspond to those made in the initial position. An example of this is the / v / phoneme, whose percentage of intelligibility is low in the initial position but which becomes much more intelligible in the final position. These results were displayed at both low and high sensation levels of presentation. Also, in the initial position the / v / phoneme was perceived as / b / at low sensation levels. This is in support of Lafon (1961), who also found that the / b / phoneme was most often substituted for / v /. In contrast, at the higher sensation levels, the phoneme / v / was most often replaced by the / m / phoneme. It must be noted that the / m / for / v / substitution occurred at the 70% time-compression ratio (see Appendix A, Table 24). This phenomenon might be explained through the durational aspects of the / m / phoneme. The / m / phoneme is longer in duration than either the / v / or / b / phonemes (Miller and Nicely, 1955). Consequently, at the 70% time-compression level, neither the / v / nor the / b / phoneme is long enough in duration to be temporally processed, so the / m / phoneme, which is longer in duration and has similar voicing and place characteristics to / v /, is perceived in place of the / v / phoneme. Further, the / m / phoneme is also substituted for / v / at the lower sensation levels, but not to the extent that the / b / phoneme is substituted (see Appendix A, Tables 1 through 24).

Initial and Final Positions

This study revealed that more errors occurred in the final position than in the initial position. This held true regardless of the level of presentation.
Whereas the 70% time-compression level was the point at which phonemes in the initial position became most affected, in the final position the intelligibility of the phoneme was affected at both the 60% and 70% time-compression levels. Lehiste (1964) and Hoard (1966) suggested that consonants near the beginning of a syllable tend to be longer than they are near the end. In contrast, Barnwell (1970) found this not to be the case for stop consonants. He found no measurable differences for the stop consonants in the initial and final positions. It may be speculated that the final phoneme is shorter in duration than the initial phoneme in conversational speech. Then, when the signal is temporally processed, the minimal time needed for temporal processing is reached at a lower level of compression than it is for phonemes in the initial position.

Effects of Time-Compression Levels

Although the effects of time-compression ratios and levels of presentation cannot be separated when discussing their effects upon intelligibility of phonemes, it is felt that the role that time-compression plays in determining intelligibility can be discussed for the +24 and 32 dB sensation levels. The results of this study indicate that a gradual decrease occurred in phonemic intelligibility as the duration of the signal was reduced. This decrease was gradual until the 70% time-compression level, where a dramatic reduction in intelligibility occurred. It is felt that these errors are due directly to the time-compression procedure and deserve some discussion.

Table 9 shows those phonemes most affected and those phonemes most often substituted for them. Many of these substitutions can be explained according to their durational characteristics. For example, the voiceless phoneme / p / consumes less time in production (Jakobson et al., 1952) than voiced plosives such as the / g /, which has been substituted for / p /.
The longer duration of the voiced plosive / g / allows more time for temporal processing. Thus, the intelligibility of the sound is increased, which enhances its substitution for other phonemes with shorter durational characteristics, such as / p /. Further, the / g / phoneme's place of articulation plays a role in its substitution for / p /. This study has shown that those phonemes whose place of articulation is the back of the oral cavity (such as / g /) are more intelligible than those phonemes whose place of articulation is either the middle or front of the oral cavity. Thus, the / g / phoneme is more intelligible because of its place of articulation and longer durational characteristics and is more likely to be substituted for other phonemes. This is further demonstrated by the / g / for / b / and / g / for / h / substitution patterns.

TABLE 9.--Phonemes most frequently missed at the 70% time-compression level in the initial position. Also shown are those phonemes most often substituted for them and the percentage of substitution.

Phoneme     % Correct     Substituted Phoneme     % Substitution
p           54%           g                       17%
t           63%           k                       20%
k           87%           t                       13%
f           69%           b                       25%
b           75%           g                       17%
g           50%           d                       31%
v           13%           m                       50%
l           50%           h                       19%
h           54%           g                       38%

Lafon (1961) reported that the phoneme / k / is most often substituted for / t /, a finding further supported by this study. This substitution may be explained by discussing the characteristic qualities of the two phonemes. Both phonemes are voiceless plosives and differ only in their place of articulation and timbre, thus making the distinction between the sounds much more difficult for the listener and enhancing the substitution pattern exhibited.

In contrast, in the final position, the / t / phoneme is replaced by the / p / phoneme and the / b / phoneme is replaced by the / m / and / n / phonemes. These changes in substitutions may be explained by the role which the vowel plays upon the phoneme following it.
This is supported by Lehiste and Shockey (1971), who found that the vowel carries coarticulatory information as to the identity of the following consonant.

Distinctive Features

The distinctive features of a phoneme are important in aiding the listener to distinguish one phoneme from another. In order to decode the message, the receiver extracts the distinctive features from the perceptual data (Jakobson, Fant, and Halle, 1963). Miller and Nicely (1955) confirmed that the perception of each of these features is relatively independent of the perception of the others. In this study the phonemes were classified into five groups of distinctive features:

(1) voicing, which included the phonemes / b, d, g, v, z, dʒ, m, n, ŋ, l, r, hw, and j /
(2) unvoicing, / p, t, k, f, θ, s, ʃ, tʃ, and h /
(3) nasality, / m, n, and ŋ /
(4) duration, / s, ʃ, dʒ, and z /
(5) place of articulation, which was divided into three subgroups:
    (a) front, which included the phonemes / p, b, f, v, and m /
    (b) middle, / t, d, θ, s, z, n, ŋ, j, and tʃ /
    (c) back, / k, g, ʃ, dʒ, r, l, h, and hw /.

Miller and Nicely (1955) examined 16 initial consonants followed by the vowel / a / using varying signal-to-noise ratios and band-pass filters. The results revealed that voicing and nasality were much less affected by a random masking noise. In contrast, this study examined consonants in both the initial and final positions with various vowels. The results of this study revealed that the phonemes characterized by the distinctive features of nasality and voicing were also affected by time-compression, along with those phonemes characterized by other features. This occurred at both high and low sensation levels. Further, it showed that those phonemes with the greatest duration were least affected by time-compression. This finding is expected and is supported by Jakobson (1952) in his discussion that phonemes with the greatest duration are most intelligible.
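The duration argument above rests on simple arithmetic: compressing a signal by c% leaves (100 - c)% of its original duration. The sketch below tabulates this for the six compression conditions. It assumes, for illustration only, that compression acts uniformly on every segment of the word, and the two starting durations (100 ms and 200 ms) are hypothetical round numbers rather than measured phoneme durations.

```python
def compressed_duration_ms(original_ms, compression_percent):
    """Duration remaining after time-compression, assuming uniform compression."""
    if not 0 <= compression_percent < 100:
        raise ValueError("compression must be in the range [0, 100)")
    return original_ms * (100 - compression_percent) / 100


# Hypothetical "short" and "long" segments, in milliseconds.
SHORT_MS, LONG_MS = 100, 200

for percent in (0, 30, 40, 50, 60, 70):
    short = compressed_duration_ms(SHORT_MS, percent)
    long_ = compressed_duration_ms(LONG_MS, percent)
    print(f"{percent:>2}% compression: short = {short:.0f} ms, long = {long_:.0f} ms")
```

On these assumed values, the longer segment at 70% compression still retains as much signal as the shorter segment does at 40%, which is one way to read the finding that the longest phonemes were the least affected.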
Conclusions

In summary, this study has shown that a consistency in phonemic errors does exist and may be due to changes in the speech signal produced by various degrees of time-compression. Further, these phonemic confusions differ depending upon the placement of the phoneme in the word. Different substitution patterns were exhibited for the same phoneme in the initial position of the word as opposed to the final position.

The sensation level of presentation also affected the type of phonemic errors. An example of this was the / b / for / v / substitution at low sensation levels, whereas the / m / phoneme was substituted for the / v / phoneme at the high sensation levels.

Finally, this study has shown that the distinctive features of the phonemes were also affected by the time-compression procedure. Those phonemes with the greatest duration tended to be least affected by the various time-compression ratios.

Implications for Further Research

A spectrographic study of the phonemes would help to determine what effect the time-compression procedure has on their physical characteristics. A study of this nature might also help to explain many of the substitution patterns that were demonstrated in this study. With a spectrographic study, it might be determined whether the energy and/or formant structure of the phonemes is altered by time-compression, causing one phoneme to take on physical characteristics of another phoneme and thus enhancing the substitution patterns exhibited in this study.

Perhaps the most obvious area for further research is to analyze lists II, III, and IV of Form B of the NU-6 in the same manner. A comparison could then be made between those results and the results obtained in this study. Beasley et al. (1972) found list I to be the most difficult of these lists and list IV to be the least difficult. Thus, a comparison of list I and list IV may be beneficial in confirming their degree of difficulty.
Another advantage of comparing all four lists would be in averaging the scores obtained in order to further substantiate the findings of this study.

Another area of research is the effect time-compression has on vowels. This should also be done for all four lists of Form B of the NU-6. With this type of study it could be determined whether vowels, like consonants, display a definite substitution pattern. Further, it would determine which vowels retain their intelligibility during the time-compression procedure.

LIST OF REFERENCES

Aaronson, D., Temporal factors in perception and short term memory. Psychological Bulletin, 130-144 (1967).
Barnwell III, T. P., Some syllabic junctural effects in English. MIT Quarterly Progress Report 99, 149-159 (1970).
Beasley, D. S., Schwimmer, S., and Rintelmann, W. F., Intelligibility of time-compressed CNC monosyllables. Journal of Speech and Hearing Research, in press (1972).
Beasley, D. S., and Shriner, T. H., Auditory analysis of temporally distorted sentential approximations. Audiology: Journal of Auditory Communication, in press (1972).
Black, J. W., Journal of Speech and Hearing Disorders, 17, 409 (1952).
Black, J. W., and Singh, S., The psychological bases of phonetics. Manual of Phonetics, B. Malmberg, ed. Netherlands, 105-125 (1970).
Bocca, E., and Calearo, C., Central hearing processes. Modern Developments in Audiology. J. Jerger, ed., New York: Academic Press (196 ).
Bordley, J. E., and Haskins, H. L., The role of the cerebrum in hearing. Annals of Otology, Rhinology, and Laryngology, LXIV, 370-382 (1955).
Calearo, C., and Lazzaroni, A., Speech intelligibility in relation to the speed of the message. Laryngoscope, 67, 410-419 (1957).
Daniloff, R. G., Shriner, T. H., and Zemlin, W. R., Intelligibility of vowels altered in duration and frequency. Journal of the Acoustical Society of America.
deQuiros, J., Accelerated speech audiometry, and examination of test results (Trans. by J. Tonndorf).
Translations Beltone Institute of Hearing Research, No. 17, Beltone Institute of Hearing Research: Chicago (1964).
Eldert, E., and Davis, H., The articulation function of patients with conductive deafness. Laryngoscope, 61, 891-909 (1951).
Fairbanks, G., Everitt, W., and Jaeger, R., Methods for time or frequency compression-expansion of speech. Transactions I.R.E.-P.G.A., AU-7, 7-12 (1954).
Fletcher, H., Speech and Hearing in Communication. Van Nostrand Co., New York (1958).
Fournier, J. E., L'analyse et l'identification du message sonore. Journal Franc. Oto-Rhino-Laryngol., 67 (1956).
Hirsh, I. J., Davis, H., Silverman, S. R., Reynolds, E. G., Eldert, E., and Benson, A. W., Development of material for speech audiometry. Journal of Speech and Hearing Disorders, XVI, 321-328 (1952).
Hoard, J. E., Juncture and syllable structure in English,
House, A. S., and Fairbanks, G., The influence of consonant environment upon the secondary acoustical characteristics of vowels. Journal of the Acoustical Society of America, 25, 105-113 (1953).
Jakobson, R., Fant, C. G. M., and Halle, M., Preliminaries to speech analysis. Mass. Inst. of Technology, Acoustics Laboratory, Technical Report, No. 13 (1952).
Jakobson, R., and Halle, M., Phonology in relation to phonetics. Manual of Phonetics, B. Malmberg, ed. Netherlands (1970).
Katz, J., Differential diagnosis of auditory impairments. Audiometry for the Retarded. Robert T. Fulton and Lyle L. Lloyd, eds. Baltimore: Williams and Wilkins Company (1969).
Lafon, J. C., Message et phonetique. Presses Univ. de France, Paris (1961).
Lehiste, I., Juncture. Proc. of the Fifth International Congress of Phonetic Sciences (1964).
Lehiste, I., and Peterson, G., Linguistic considerations in the study of speech intelligibility. Journal of the Acoustical Society of America, 31, 285-286 (1959).
Lehiste, I., and Shockey, L., Coarticulation effects in the identification of final plosives. Paper presented at the 82nd meeting of the Acoustical Society of America, Denver, Colorado, October 19-22 (1971).
Malmberg, B., Manual of Phonetics. Bertil Malmberg, ed. Netherlands (1970).
Miller, G. A., Heise, G. A., and Lichten, W., The intelligibility of speech as a function of the context of test material. Journal of Experimental Psychology, 41, 229-335 (1951).
Miller, G. A., and Nicely, P. E., An analysis of perceptual confusions among some English consonants. Journal of the Acoustical Society of America, 27, 338-352 (1955).
Peterson, G. E., and Barney, H. L., Control methods used in a study of the vowels. The Journal of the Acoustical Society of America, 24, 175-184 (1952).
Rintelmann, W. F., and Jetty, A. J., Reliability of speech discrimination testing using CNC monosyllabic words. Unpublished study, Michigan State University (1968).
Shriner, T. H., Beasley, D. S., and Zemlin, W. R., The effects of frequency division on speech identification in children. Journal of Speech and Hearing Research, 12, No. 2, 413-422 (1969).
Stevens, S. S., Handbook of Experimental Psychology. New York: John Wiley and Sons, Inc. (1948).
Thorndike, E. L., and Lorge, I., Teacher's Word Book of 30,000 Words. New York: Bureau of Publications, Teachers College, Columbia University (1950).
Tillman, T. W., and Carhart, R., An expanded test for speech discrimination using CNC monosyllabic words: Northwestern University Auditory Test No. 6 (1966).
Tillman, T. W., Carhart, R., and Wilbur, L. A., United States Air Force School of Aerospace Medicine, SAM-TDR-62-135, AD No. 403275, Brooks Air Force Base, January, 1963.
Williford, J. A., Audiological evaluation of central auditory disorders. Maico Audiological Library Series, VI, 1-7 (1969).
Zemlin, W. R., Daniloff, R. G., and Shriner, T. H., The difficulty of listening to time-compressed speech.
Journal of Speech and Hearing Research, 11, 875-881 (1968).

APPENDICES

APPENDIX A

CONFUSION MATRICES FOR PHONEMES IN THE INITIAL AND FINAL POSITIONS AT EACH LEVEL OF TIME-COMPRESSION FOR LOW (8 and 16 dB) AND HIGH (24 and 32 dB) SENSATION LEVELS

[TABLES A-1 through A-24.--Phonemic confusions for phonemes in the initial and final positions at each level of time-compression at low and high sensation levels. Key: C = correct; S = substitution; NR = no response. The matrix entries are not legible in this copy.]

APPENDIX B

LIST 1, FORM B, NORTHWESTERN UNIVERSITY AUDITORY TEST NUMBER 6

 1. burn                      26. size
 2. lot                       27. pool
 3. sub                       28. vine
 4. home                      29. chalk
 5. dime                      30. laud
 6. which (or witch)          31. goose
 7. keen                      32. shout
 8. yes                       33. fat
 9. boat                      34. puff
10. sure                      35. jar
11. hurl                      36. reach
12. door                      37. rag
13. kite                      38. mode
14. sell                      39. trip
15. nag                       40. page
16. take                      41. raid
17. fall                      42. raise
18. week                      43. bean
19. death                     44. hash
20. love                      45. limb
21. tough                     46. third
22. gap                       47. jail
23. moon                      48. knock
24. choice                    49. whip
25. king                      50. met

APPENDIX C

DISTINCTIVE FEATURE CLASSIFICATIONS

SONORITY FEATURES

I. vocalic/non-vocalic
acoustically - presence (vs. absence) of a sharply defined structure;
genetically - primary or only excitation at the glottis together with a free passage through the buccal tract.

II. consonantal/non-consonantal
acoustically - presence (vs. absence) of a characteristic lowering in frequency of the first formant, a lowering which results in a reduction of the overall intensity of the sound and/or of only certain frequency regions;
genetically - presence (vs. absence) of an obstruction in the buccal tract.

Vowels are vocalic and non-consonantal. Consonants are consonantal and non-vocalic.
Liquids are vocalic and consonantal (with both free passage and obstruction in the buccal cavity and with the corresponding acoustic effect). Glides are non-vocalic and non-consonantal; they never participate in the oppositions grave/acute and compact/diffuse, and the basic or only glide of a given language is a one-feature phoneme in opposition to a phonemic zero.

III. nasal/oral (properly speaking, nasalized/non-nasalized)
acoustically - presence (vs. absence) of the characteristic stationary nasal formant with a concomitant reduction in the intensity of the sound and an increased damping of certain oral formants;
genetically - mouth resonator supplemented by the nose cavity (vs. the exclusion of the nasal resonator).

IV. compact/diffuse
acoustically - concentration of energy in a relatively narrow, central region of the auditory spectrum (vs. a concentration of energy in a non-central region), with a concomitant increase (vs. decrease) of the total amount of energy and its spread in time;
genetically - forward-flanged vs. backward-flanged. The difference lies in the relation between the shape and volume of the resonance chamber in front of the narrowest stricture and behind this stricture. The resonator of the forward-flanged phonemes (wide vowels, and velar or palatal, including post-alveolar, consonants) is horn-shaped, whereas the backward-flanged phonemes (narrow vowels, and labial or dental, including alveolar, consonants) have a cavity that approximates a Helmholtz resonator.

In vowel systems this feature often appears to be split into two autonomous features: compact/non-compact (higher vs. lower concentration of energy in the central region), and diffuse/non-diffuse (higher vs. lower concentration of energy in a non-central region).

V. abrupt/continuant
acoustically - silence (at least in the frequency range above the vocal cord vibration) followed and/or preceded by a spread of energy over a wide frequency region, either as a burst or as a rapid transition of vowel formants (vs. absence of abrupt transition between sound and "silence");
genetically - rapid turning on or off of source either through that swift closure and/or opening of the buccal tract which distinguishes plosives from constrictives, or through one or more taps which differentiate the abrupt liquids like a flap or trill / r / from continuant liquids like the lateral / l /.

VI. strident/non-strident (mellow)
acoustically - presence (vs. absence) of a higher intensity noise accompanied by a characteristic amplification of the higher frequencies and weakening of the lower formants;
genetically - rough-edged vs. smooth-edged: supplementary obstruction creating edge effects (Schneidenton) at the point of articulation distinguishes the production of the rough-edged phonemes from the less complex impediment in their smooth-edged counterparts.

VII. checked/unchecked
acoustically - higher rate of discharge of energy within a reduced interval of time (vs. lower rate of discharge within a longer interval), with a lower (vs. higher) damping;
genetically - reduced (vs. non-reduced) portion of air due to the stoppage of egressive as well as ingressive pulmonic participation. Checked phonemes are implemented in three different ways: as ejectives (glottalized consonants), as implosives, or as clicks.

VIII. voiced/voiceless
acoustically - presence (vs. absence) of periodic low frequency excitation;
genetically - periodic vibrations of the vocal cords (vs. lack of such vibrations).

PROTENSITY FEATURES

IX. tense/lax
acoustically - longer (vs. reduced) duration of the steady state portion of the sound, and its sharper defined resonance region in the spectrum;
genetically - a deliberate (vs. rapid) execution of the required gesture resulting in a lastingly stationary articulation; greater deformation of the buccal tract from its neutral, central position; heightened air pressure.

The role of muscular strain, affecting the tongue, the walls of the buccal tract, and the glottis, requires further investigation. The difference between tense and lax phonemes parallels that between notes played legato and staccato, respectively.

TONALITY FEATURES

X. grave/acute
acoustically - predominance of the low (vs. high) part of the spectrum;
genetically - peripheral vs. medial: peripheral phonemes (velar and labial) have an ampler and less compartmented resonator than the corresponding medial phonemes (palatal and dental).

In the nasal consonants this feature is sometimes split into two autonomous features, grave/non-grave and acute/non-acute, based on the interplay of the nasal murmur and oral release. The pitch of the resonator murmur effected in the nasal cavity plus the adjacent portion of the buccal cavity from the velic to the oral stricture is lower when the occlusion is made in the anterior part of the mouth cavity as compared to the stricture in its posterior part. In / m / the two-fold low pitch is grave, in / / acute, whereas in dental and velar nasals this opposition may be neutralized by the discrepancy between the gravity and acuteness of the two pitches (murmur and release or vice versa).

XI. flat/non-flat
acoustically - flat phonemes are opposed to their non-flat counterparts by a downward shift and/or weakening of some of their upper frequency components;
genetically - the former (narrowed-slit) phonemes, in contradistinction to the latter (wider-slit) phonemes, are produced with a decreased back or front orifice of the mouth resonator and a concomitant velarization which expands the mouth resonator.

XII. sharp/non-sharp
acoustically - sharp phonemes are opposed to their non-sharp counterparts by an upward shift and/or strengthening of their upper frequency components;
genetically - the former (widened-slit) phonemes are produced with a dilated back orifice (pharyngeal pass) of the mouth resonator and a concomitant palatalization which restricts and compartments the mouth cavity.
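The four major sound classes that this appendix defines from sonority features I and II (vowels, consonants, liquids, and glides) can be restated as a small lookup. The Python sketch below is an illustration only. The function name `major_class` and the boolean encoding of the two features are assumptions and are not part of the original classification.

```python
def major_class(vocalic: bool, consonantal: bool) -> str:
    """Return the major sound class for a vocalic/consonantal feature pair."""
    if vocalic and consonantal:
        return "liquid"      # both free passage and obstruction
    if vocalic:
        return "vowel"       # vocalic and non-consonantal
    if consonantal:
        return "consonant"   # consonantal and non-vocalic
    return "glide"           # non-vocalic and non-consonantal


if __name__ == "__main__":
    # Enumerate all four combinations of the two binary features.
    for vocalic in (True, False):
        for consonantal in (True, False):
            label = major_class(vocalic, consonantal)
            print(f"vocalic={vocalic}, consonantal={consonantal} -> {label}")
```

Encoding each feature as a boolean reflects the binary oppositions used throughout the appendix; the remaining features could be added to such a scheme in the same way.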