THE USE OF FREQUENCY SHIFTED AND TIME RESTORED SPEECH WITH HEARING IMPAIRED CHILDREN

Thesis for the Degree of M. A.
MICHIGAN STATE UNIVERSITY
NANCY LOW MOSHER
1973

ABSTRACT

THE USE OF FREQUENCY SHIFTED AND TIME RESTORED SPEECH WITH HEARING IMPAIRED CHILDREN

By Nancy Low Mosher

Energy cannot be created or destroyed, but it can be modified. Acoustic energy can be modified by manipulating certain characteristics of the acoustic signal. In the past, investigators have manipulated the frequency characteristics alone, or the frequency and temporal characteristics together, of acoustic signals.

A number of different manipulations have been performed on the frequency characteristics of the acoustic signal. Vocoders have filtered out part of the acoustic signal and resynthesized the speech signal in a form different in acoustic characteristics from that of the input. Frequency shifting systems have shifted the acoustic energy up or down by a certain number of cycles. Frequency transposing systems have taken out some of the high frequency energy and re-entered it into the acoustic spectrum at a lower frequency region. All of these methods of frequency manipulation have distorted the relative relationship between the overtones and the fundamental frequency.

Other investigators have manipulated the temporal characteristics of the speech signal. Several methods have been used to vary the temporal characteristics of the signal. The "chop-splice" method and the sampling interval-discard interval method are based, in principle, on the redundancy of the acoustic signal; that is, intelligibility can remain high while certain portions of the acoustic signal are discarded. Changing the playback speed of a tape recorder-player is another method for altering the temporal characteristics of the acoustic signal.
However, this method simultaneously changes the frequency characteristics of the signal.

Frequency and temporal manipulations by fast and slow playback procedures allow the relative relationship between formant energy concentrations to be retained. Recently, frequency shifting by slow playback has been studied. This method lowers or raises the acoustic energy while retaining the relationships between the energy concentrations. Different percentages of frequency shifting have been related to intelligibility. The temporal characteristics which are distorted by the slow playback method are restored to normal by the sampling method of time compression, thus resulting in a frequency shifted-time restored speech signal.

It has been suggested that the use of frequency shifted-time restored speech signals may provide more acoustic information for hearing impaired persons, in that the signal is shifted into the residual range of hearing of individuals having sensorineural hearing losses with low frequency residual hearing. Persons with residual hearing in the low frequencies retain the potential for auditory discrimination, thus permitting maximal use of auditory training procedures in educational settings.

The purpose of this study was to determine if hearing impaired children could respond better auditorily under conditions of 35% frequency shifted-time restored or 0% frequency shifted-time restored speech stimuli, and further, whether training would enhance the scores on these tasks.

Subjects were eighteen hearing impaired children, aged 6-9 years, whose hearing losses were characterized by low frequency residual hearing with no measurable hearing above 2000 Hz. The subjects were matched on relevant variables and divided into two groups for training purposes. Monosyllabic words from the WIPI, preceded by a carrier phrase, were presented to the subjects via tape recordings.
One set of tapes was frequency shifted and time restored by 35%, while the second set was frequency shifted and time restored by 0%. All subjects were pre-tested and post-tested on the WIPI stimuli under both the 0% frequency shifted-time restored and the 35% frequency shifted-time restored conditions. Nine children received auditory training with the 35% frequency shifted-time restored signals and the other nine received auditory training with the 0% frequency shifted-time restored signals. The training was given for twenty minutes per day for fifteen days. The differences in percentage correct scores from the pre-test to the post-test were compared with respect to the type of training the child received.

Results showed an increase in mean percentage correct scores of 14.3% from the 35% frequency shifted-time restored pre-test to the 35% frequency shifted-time restored post-test for the group who received training under the 35% frequency shifted-time restored condition. For the 0% frequency shifted-time restored training group, there was an increase of only 4.6%. The increase from the 0% frequency shifted-time restored pre-test to the post-test for the 35% frequency shifted-time restored trained group was 12.5%, and for the 0% frequency shifted-time restored trained group it was only 1.3%.

The implication which can be drawn from these results is that, for these children, the use of frequency shifted-time restored auditory signals may enhance the learning of verbal stimuli in auditory training. Further investigations using this type of manipulated acoustic signal with hearing impaired children were indicated.

THE USE OF FREQUENCY SHIFTED AND TIME RESTORED SPEECH WITH HEARING IMPAIRED CHILDREN

By Nancy Low Mosher

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF ARTS

Department of Audiology and Speech Sciences

1973

Accepted by the faculty of the Department of Audiology and Speech Sciences, College of Communication Arts, Michigan State University, in partial fulfillment of the requirements for the degree of Master of Arts.

ACKNOWLEDGMENTS

I would like to express my sincere appreciation to my committee chairman, Dr. Daniel S. Beasley, for his invaluable assistance in the preparation of this thesis. I would also like to extend my gratitude to Dr. H.J. Oyer, Dr. W.F. Rintelmann and Dr. D. Orchik for their many contributions.

In addition, I extend my thanks to Mr. R. Weir, Daun G. Beasley, Mr. D. Riggs, Mr. Brian Low and all my subjects for their cooperation and patience. I also wish to acknowledge my relatives and friends who spent hours preparing my stimulus plates.

I especially wish to thank my husband, Bob, for his countless sacrifices and constant encouragement in helping me achieve my educational goal.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES

Chapter
I. INTRODUCTION
    Frequency Manipulation
        Filter Systems
        Vocoders
        Frequency Transposition
        Frequency Shifting Systems
    Temporal Manipulation
    Frequency and Time Manipulation
    Auditory Training
    Summary and Statement of the Problem
II. EXPERIMENTAL PROCEDURES
    Subjects
    Stimulus Generation
    Calibration Procedures
    Presentation Procedures
III. RESULTS
IV. DISCUSSION
    Implications for Future Research

LIST OF REFERENCES

APPENDICES

Appendix
A. Information Form
B. Information Used to Match Subjects--Control Group and Experimental Group
C. Directions Given to the Subjects Prior to the Testing
D. Relation of Average Pure Tone Thresholds for Subjects who Gained and Lost in Percentage Scores with Respect to Type of Training

LIST OF TABLES

Table
1. Mean pure tone thresholds and calculated speech reception thresholds for all eighteen subjects in dB HTL
2. Mean presentation level in dB SPL for subjects trained on 0% frequency shifted-time restored stimuli, subjects trained on 35% frequency shifted-time restored stimuli and the mean presentation level for all subjects
3. Mean percent correct scores and ranges for the pre-tests and post-tests under 0% frequency shifted-time restored and 35% frequency shifted-time restored testing conditions for the group trained on the 35% frequency shifted-time restored signal and for the group trained on the 0% frequency shifted-time restored signal

LIST OF FIGURES

Figure
1. Schematic representation of the listening situation and apparatus during stimulus presentation
2. Mean item scores for pre-tests, post-tests, and each day of training under the 35% frequency shifted-time restored training condition
3. Mean item scores for pre-tests, post-tests and each day of training under the 0% frequency shifted-time restored training condition
4. Individual function of item correct scores per day with 35% frequency shifted-time restored training
5. Individual function of item correct scores per day with 35% frequency shifted-time restored training
6. Individual function of item correct scores per day with 0% frequency shifted-time restored training
7. Individual function of item correct scores per day with 0% frequency shifted-time restored training

Chapter I

INTRODUCTION

For several decades, man has experimented with the altered acoustic characteristics of the speech chain. Specifically, the temporal and frequency characteristics of the acoustic signal have been manipulated. These manipulated signals have been studied with college students (Beasley and Shriner, 1971), elementary age children (Beasley, Shriner, and Zemlin, 1969), the blind (Foulke, 1966), hearing impaired children (Ling, 1967; Guttman and Nelson, 1968; Zemlin, 1966) and adults (Bennett and Byers, 1967), and deep sea divers (Hollywell and Harvey, 1964).

Several forms of acoustically modified signals have been studied relative to intelligibility. That is, investigators have manipulated the frequency or temporal characteristics of the speech signal by different degrees, while attempting to retain intelligibility using measures such as the Rhyme test (Voiers, 1968), Harvard Test Sentences (Williams and Hecker, 1968), Fairbanks Rhyme test (Williams and Hecker, 1968), Modified Rhyme Test (Williams and Hecker, 1968), Spondees, Rhyming Words and the Picture Identification for Children--A Standardized Index (Shriner, Beasley and Zemlin, 1969).

Manipulations of the speech signal have been shown to affect the energy distribution of the spectrum.
Vowels contain the strongest energy concentrations within speech, and manipulation of speech signals has been shown to alter their energy spectrum. Normative data as to formant energy concentrations for specific vowels spoken by males, females and children were provided by Peterson and Barney (1952). Spectrograms of 1520 recorded vowels in an /hVd/ context were enlarged to find the fundamental and relative formant amplitudes. While the formants for males, females and children were within different frequency ranges, the relationships between formants within a group were consistent. Further, the overtones of the fundamental frequency were found to be integral multiples of the fundamental, leading to the conclusion that the relationship between formants was logarithmic.

Major energy concentrations such as formants can be altered by frequency and temporal manipulations of the acoustic signal. One method used to accomplish this has been to transpose the overtones in the signal to another frequency region, leaving the fundamental intact. The resultant signal, however, produced disproportionate frequency shifts relative to the formant structure. Proportionate frequency shifts, on the other hand, permit shifting the entire acoustic spectrum while retaining the relationship between formants.

FREQUENCY MANIPULATION

Several means of manipulating the speech signal have dealt with changes in the frequency spectrum of the signal. Filtering, vocoding, frequency transposing and frequency shifting have all been used to manipulate the speech signal.

Filter Systems

Filter systems eliminate part of the acoustic signal. The bandwidth of the filter and the number of filters have been related to intelligibility. Kryter (1960) investigated the preservation of speech intelligibility under several conditions of filtering.
He investigated the retention of intelligibility using one, two and three 500 Hz wide pass bands, respectively. For each condition of filtering, the band or bands were placed at the areas of major energy concentrations. The author found that the frequency region around 500 Hz was essential to intelligibility, and that the single band pass system required twice the effective bandwidth of the best three band system to produce equal intelligibility of PB words and sentences.

The filtering system used above filtered out part of the acoustic energy of a signal. Other researchers have investigated the use of filters in vocoders.

Vocoders

A vocoder is a "voice coder" and operates on the assumption that the ear is insensitive to phase differences. Vocoders consist of a series of band pass filters which filter part of the speech signal energy, perform some manipulation upon that filtered energy, and then resynthesize the signal by another series of filters. The set of input analyzing filters may pass only part of the energy of the signal, depending on the width and center frequency of the band pass filters. Formant vocoders have filters at areas of major energy concentrations, while channel vocoders consist of adjacent band pass filters.

Different types of vocoders have been used to manipulate the acoustic signal. Pimonow (1962) explained a synthetic telephone vocoder as eliminating information not necessary for intelligibility, and transposing necessary information from its natural place in the auditory spectrum to bands of frequency available to the neurologically damaged ear. Vilbig and Haase (1956) described three types of vocoder systems. The envelope extraction and transmission system consisted of 100 filters used for frequency analysis. In the pulse vocoding system, the spectral envelope was transformed into a pulse sequence with the width of each pulse corresponding to the width of a formant.
The formant vocoding system consisted of formant extraction via a procedure using peak voltage to provide frequency moments, and then differentiating this information for resynthesis purposes.

Gold (1965) designed a vocoding system which used both a channel vocoder and a formant vocoder. The formant vocoder, which had the potential for recreating the proper phase and amplitude of the vocal tract spectrum, synthesized vowels and glides, whereas the channel vocoder synthesized consonants. Gold and Radar (1967), however, listed six precautions when evaluating or comparing vocoding systems: (1) A male speaker with good diction, but little intensity and intonation change, may cause a system to be perceived as sounding better than it actually is; (2) Listening to the vocoded stimuli through good earphones may cause it to be perceived as better than if listened to through loudspeakers; (3) The signal may be distorted by reverberation produced in a large room; (4) Prior knowledge of the speech stimuli may result in improved performance; (5) The quality of the output signal should be judged on the intelligibility with respect to an unmanipulated signal; and (6) The acoustic characteristics of different sentences and vocal characteristics of speakers affect judgements on vocoding systems.

Golden (1963) described a ten spectrum channel voice excited vocoder (VEV) which was simulated on an IBM 7090 digital computer. The VEV permitted a means of studying excitation and analyzing the frequencies within a filtered bandwidth. The practical application of the VEV was limited because it required 172 seconds to analyze and resynthesize one second of recorded speech.

Voiers (1968) compared the intelligibility of eight digital vocoders using a Diagnostic Rhyme Test. He found vocoders most deficient in use with sustensive phonemes, that is, fricatives and plosives.
The vocoders were also deficient with respect to the "grave" feature of phonemes, which was considered to be dependent upon the nature of the second formant transition.

Williams and Hecker (1968) compared scores on the Harvard PB words (Hirsh, 1952), Fairbanks Rhyme Test (Fairbanks, 1958), Modified Rhyme Test (House, 1965), and the Harvard Test Sentences (Egan, 1958) under three conditions of distortion. These three conditions were additive spectrum shaped noise, peak clipping and vocoderization. Results demonstrated that for additive spectrum shaped noise and vocoderization, sentences had the highest intelligibility curves, followed by the Rhyme tests and the PB words, respectively. For clipped speech, intelligibility curves were highest for the two Rhyme tests, followed by the sentences and PB words, respectively.

Schroeder (1966) described different methods of vocoding speech signals, including: a) formant vocoders, b) pattern matching vocoders, c) correlation vocoders, d) the Laguerre Expansion Vocoder, e) "High Fidelity" vocoder, f) speech coding by frequency division, g) separation of the spectral envelope, and h) digital computer simulation. The author concluded that many of the problems of "naturalness" have been overcome, although some of the devices continued to sound like a person speaking with an accent.

Vocoders, then, have been used to filter, manipulate, and resynthesize acoustic signals. Few investigations, however, have been carried out to determine their effect on intelligibility and comprehension. Also, vocoders tend to be very costly, thus limiting their practical use.

Frequency Transposition

With frequency transposition, the fundamental frequency is retained, while the overtones are filtered out of the signal and "re-entered" into the signal at a lower frequency region.

A two channel transposer was described by Johansson (1966).
One channel of amplification operated directly on the normal speech frequency range, while the second channel separated voiceless consonants from the rest of the signal via high pass filters. The voiceless consonants were then transposed down to a lower frequency range. A group of hearing impaired children (n = 26) with low tone residual hearing were given training on stimuli consisting of two syllable rhyming word pairs. The stimulus words were presented over the direct channel only prior to the training. Next, a pre-test with the transposed voiceless consonants was given. Ten days of training at twenty minutes per day were given using the transposed signal. Finally, post-tests were given over the direct channel. All subjects showed increasing scores as a result of the training with the transposed signals. The author reported that subjectively the children preferred the transposed signal.

Ling (1968) reported on three experiments utilizing frequency transposition, all of which attempted to compensate for the limited frequency range of children (aged 8-14 years) with low frequency residual hearing. In the first experiment, the Ling-Druz Vocoder (LDV) provided conventional amplification from 70 Hz to 700 Hz, and transposed the 2000 Hz to 3000 Hz frequency range down to the 750 Hz to 1000 Hz range. Two presentations of each of the five discrimination tests were given to the eight subjects prior to seven hours of training using materials similar to the test material. Four subjects received training on the LDV, while the other four subjects used a speech training aid which exhibited a smooth frequency response from 60 Hz to 6000 Hz at 60 dB. After the training, two discrimination tests were given as post-tests. Although both groups made progress on the instrument to which they were assigned for training, the group trained using the LDV did not show improved discrimination scores when compared to those trained using conventional amplification.

The second experiment compared the Johansson-Wedenberg Transposer (JWT; Johansson and Wedenberg, 1960) and the Ling-Druz Vocoder (LDV). Eight hearing impaired children (aged 6-14 years) were divided into two groups corresponding to the two auditory systems (JWT and LDV) used. Each system provided amplification with and without the transposition analog. Three discrimination tests of words familiar to the children were recorded by a female speaker. Four subjects received pre-tests followed by training using the analog on each frequency transposition system. The training period consisted of forty minutes per day for ten days. Post-tests were given on the instrument used for training, both with and without the analog. The four conditions for post-testing were (1) LDV with the analog, (2) LDV without the analog, (3) JWT with the analog, and (4) JWT without the analog. Within a counterbalanced design, subjects then went through the same procedure on the instrument they were not trained on the first time. The author reported no significant difference between the scores obtained under any of the four conditions. Performance was not differentially affected by the presence or absence of analog signals or the use of either instrument.

The third experiment added twelve hours of training to the six hours received earlier on each of the two transposing instruments. Differences in scores still did not reach significant levels.

Ling and Doehring (1969) used the LDV under three experimental conditions: (1) conventional amplification to both ears (linear channel of the LDV), (2) coded speech to both ears, and (3) conventional amplification to one ear and coded speech to the other. Twenty-four profoundly deaf children (aged 7-11 years) participated in this study of a programmed associative learning task. Transposition of speech through coding did not improve the discrimination of consonants by the deaf children tested.
Further, Guttman and Nelson (1968) reported that, using Harvard sentences and PB words, hearing impaired children receiving a frequency transposed signal scored the same as a control group of hearing impaired children.

Thus, to date, frequency transposition, whereby a portion of the acoustic spectrum is shifted to another frequency range, has been shown to produce no significant increase in intelligibility with the hearing impaired when compared to conventional amplification.

Frequency Shifting Systems

Frequency shifting systems have involved shifting all of the formants of the acoustic signal up or down by a certain number of cycles. The entire energy spectrum is preserved, but the relative relationships between formants are distorted.

Bogart (1956) described a vocoding system (Vobanc) which used frequency shifting procedures. In this system, the speech signal in each band was passed through a modulator to shift the frequency band downward. If the frequency range of formants varied within 800 Hz at the input, then it varied over 400 Hz at the output. Consonant articulation tests were used with forty-eight listeners and ten talkers to compare the Vobanc speech with direct speech transmitted through a 200 to 1700 Hz band pass filter. The Vobanc speech articulation scores were 14.1% higher than the narrow band direct speech scores.

Takefuta and Swigart (1968), using a different method of frequency manipulation, presented twenty-eight consonant-vowel-consonant (CVC) words to eighty hearing impaired subjects divided into four groups. Each group received a different condition of frequency shifting (1.0, 0.7, 0.5, and 0.4). All subjects had a fifteen minute training session under conditions of no spectral compression. The results of this study showed that intelligibility decreased with the amount of spectral compression (frequency shifting).

Corliss et al. (1968) "split and squared" the acoustic signal.
For certain experimental conditions, certain distortion components were cancelled and others enhanced. Temporal relationships were altered by fast and slow speed playback. Linear combinations of distorted and undistorted tracts were presented to thirty subjects. Results showed that lowering the "Characteristic" frequency proved to have the most serious effect upon intelligibility. The next most serious effect was stretching the transition time. Finally, whereas reducing time to half the initial value proved relatively trivial, doubling the frequency by second harmonic distortion was significantly damaging.

Raymond and Proud (1962) studied the use of a frequency converter in aural rehabilitation with twenty-seven adult patients with high frequency hearing losses. The frequency converter simply shifted the input signal down the frequency scale by 400 Hz. The CID W-22 word lists (Hirsh, 1952) were used for intelligibility measurements, and sixteen hours of training were given to each subject. Prior to the post-testing, a taped story which was frequency shifted by 750 cycles was read to acquaint the subjects with listening to the shifted signal. Post-testing consisted of presentation of the unconverted PB words; then the PB word lists were presented using a male speaker under a 400 cycle shift and a female speaker under a 750 cycle shift. Nearly all subjects improved as a result of training, but none approached the discrimination scores obtained with the unconverted PB word lists. Further, an analysis of the data showed that the harmonic relationships of a "low voice" were greatly altered during frequency conversion.

The authors cited above have described results which indicate that the use of frequency shifting does maintain intelligibility. The retention of the acoustic energy present in the undistorted signal may be responsible for this.
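
The alteration of harmonic relationships under a fixed shift can be illustrated with a short numerical sketch. It is not taken from any of the studies cited; the component frequencies (harmonics 4 through 8 of a hypothetical 125 Hz voice), the 0.65 proportional factor and all function names are assumptions chosen only for illustration.

```python
import math

def fixed_shift(freqs_hz, shift_hz):
    """Lower every component by the same number of cycles (Hz)."""
    return [f - shift_hz for f in freqs_hz]

def proportional_shift(freqs_hz, factor):
    """Multiply every component by the same factor, as slow playback does."""
    return [f * factor for f in freqs_hz]

def ratios(freqs_hz):
    """Ratio of each component to the lowest component."""
    lowest = freqs_hz[0]
    return [f / lowest for f in freqs_hz]

# Hypothetical components: harmonics 4 to 8 of an assumed 125 Hz fundamental.
components = [500.0, 625.0, 750.0, 875.0, 1000.0]

fixed = fixed_shift(components, 400.0)               # 400 Hz downward shift
proportional = proportional_shift(components, 0.65)  # assumed 35% lowering

print("original ratios:    ", ratios(components))
print("fixed shift ratios: ", ratios(fixed))
print("proportional ratios:", ratios(proportional))
```

Under these assumptions the fixed shift changes the ratios between components, whereas the proportional shift leaves them as they were in the original signal.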
However, such shifting does not preserve the relative relationship between the areas of energy concentration,

[...] the signal with that of speech that had been altered by the "chop-splice" method. Intelligibility scores for spondees for both types of time compressed signals at 1.75, 2.0 and 2.5 times the original speed were obtained. Significant differences were found between intelligibility scores for the two methods of time compression at all three speeds. The intelligibility scores were 5%, 30% and 83% better for the "chop-spliced" speech than for the speeded speech at 1.75, 2.0 and 2.5 times the original speed, respectively. At increasingly faster rates of presentation, the intelligibility of the "chop-splice" speech was better than that of the speeded speech. Garvey attributed this to the frequency shift that accompanies the "speeding" of speech.

The manual "chop-splice" technique was used by Beasley and Shriner (1973), who investigated the effects of covarying word duration and inter-stimulus intervals of sentential stimuli. One hundred and twenty normal hearing college students were assigned to one of twelve listening conditions. The experimental conditions were combinations of three levels of word duration (400, 300, 200 msec) with four levels of silent interstimulus intervals (400, 300, 200, 100 msec). First and second order sentential approximations to full grammatical sentences were recorded by a male phonetician. Results showed that auditory perception was susceptible to temporal influences and that these influences were more related to stimulus duration than to interstimulus interval, although interstimulus interval did have some influence on the subjects' scores.

Another method of changing the temporal characteristics which does not affect the frequency characteristics of the signal was developed by Fairbanks, Everitt and Jaeger (1954). Temporal redundancy was dealt with by eliminating part of the signal.
The procedure involved a tape divided into time intervals, with even intervals discarded ($I_d$) and sampling intervals ($I_s$) retained. One hundred times the discard interval divided by the sum of the two intervals equaled the percent of compression (that is, $\%\ \text{compression} = 100 \times I_d / (I_d + I_s)$). Original recordings showed that small values of compression had little effect on intelligibility and the perceived speed of speech.

Varying the discard intervals was studied by Fairbanks and Kodman (1957). Eight young adults listened to fifty familiar PB monosyllables presented binaurally through earphones at 80 dB SL. Results showed that as the discard interval increased, intelligibility decreased. The authors concluded that with a discard interval of 16 msec, high percentage intelligibility scores were retained.

Fairbanks, Guttman and Miron (1957) carried out three experiments dealing with comprehension as related to degree of time compression, using message sets which were time compressed. In the first experiment, a short version and a long version of a written passage were presented at time compression ratios of 0% and 30%. Additions, paraphrases and restatements, which occurred in the passage, were added into the total "time" saved by compressing the signal. Results of a comprehension test indicated that these additions, paraphrases and restatements increased responses to selected portions of the message, but were accompanied by decreased responses to other portions and showed no change in the overall response. In the second experiment, double messages were presented using a message time that was 50% time compressed. A very small increase in comprehension test score was observed with the double presentation. In a third experiment, five conditions of time compression (0%, 30%, 50%, 60%, 70%) were used with five message sets. Thirty-six subjects listened to the message sets and were then tested for comprehension.
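
The sampling interval-discard interval procedure of Fairbanks, Everitt and Jaeger (1954) described above can be sketched in a few lines. This is a minimal illustration rather than a description of their apparatus: the signal is assumed to be a list of digital samples, the intervals are expressed in samples rather than milliseconds, and the function names are hypothetical.

```python
def percent_compression(discard, sampling):
    """Percent compression = 100 * Id / (Id + Is)."""
    return 100.0 * discard / (discard + sampling)

def time_compress(samples, sampling_len, discard_len):
    """Keep `sampling_len` samples, drop the next `discard_len`, and repeat.

    The retained segments are abutted, so the output is shorter than the
    input while each retained segment keeps its original frequency content.
    """
    period = sampling_len + discard_len
    kept = []
    for start in range(0, len(samples), period):
        kept.extend(samples[start:start + sampling_len])
    return kept

# Assumed example: retain 7 samples, discard 3 (30% compression).
signal = list(range(20))
compressed = time_compress(signal, sampling_len=7, discard_len=3)

print(percent_compression(discard=3, sampling=7))
print(len(signal), "->", len(compressed))
```

With these assumed intervals the output retains 70% of the input samples, which corresponds to the 30% compression condition used in several of the studies reviewed here.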
Message efficiency, as determined by the amount of factual comprehension per stimulus time, increased when the message sets were compressed up to 50%. Other investigators have linked duration of the signal to other characteristics of speech.

Beasley, Schwimmer, and Rintelmann (1972) studied the effects of varying degrees of time compression upon speech intelligibility. Ninety-six normal hearing subjects were placed in six groups of sixteen subjects each. Each group received a different percentage of compression (0%, 30%, 40%, 50%, 60%, 70%) of the NU Auditory Test #6, Form B (Tillman, Carhart and Wilbur, 1963). Four lists of the test were presented at each compression level under four different sensation levels (8, 16, 24, 32 dB). The results showed that as time compression increased, intelligibility, as measured by the word lists, decreased. However, time compression had a minimal effect on scores until the 40% condition, when a gradual decrease in intelligibility began to be apparent which continued until the 70% condition. At the 70% time compressed condition, an appreciable decrease in intelligibility was noted.

Garvey (1963), Fairbanks et al. (1954), Beasley and Shriner (1971), and Beasley, Schwimmer, and Rintelmann (1972) reported on studies with temporal manipulation of the speech signal. Varying both the temporal and frequency characteristics provides another method of manipulating the speech signal.

FREQUENCY AND TIME MANIPULATIONS

Slow or fast playback of a recorded signal as a means of signal modification was discussed by Garvey (1963). He suggested that the frequency change accompanying a change in speed was responsible for lowered intelligibility of the signal. Slow played speech expands the time element and proportionately lowers the entire frequency spectrum.
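
The arithmetic of slow playback followed by time restoration can be sketched as follows. The example assumes that a 35% frequency shift corresponds to playback at 65% of normal speed and that time is restored by an equal percentage of time compression; the three component frequencies and the function names are hypothetical and serve only as an illustration.

```python
import math

def slow_playback(freqs_hz, duration_s, speed):
    """Play a recording at `speed` times normal speed (speed < 1 slows it).

    Every frequency is multiplied by `speed` and the duration is divided
    by `speed`, so the ratios between components are unchanged.
    """
    shifted = [f * speed for f in freqs_hz]
    stretched = duration_s / speed
    return shifted, stretched

def restore_time(duration_s, percent_compression):
    """Duration remaining after discarding `percent_compression` of the signal."""
    return duration_s * (1.0 - percent_compression / 100.0)

# Hypothetical formant-like components of a one second utterance.
components = [270.0, 2290.0, 3010.0]
shifted, stretched = slow_playback(components, duration_s=1.0, speed=0.65)
restored = restore_time(stretched, percent_compression=35.0)

print("shifted components (Hz):", shifted)
print("duration after slow playback (s):", stretched)
print("duration after time restoration (s):", restored)
```

Under these assumptions each component is lowered to 65% of its original frequency, the slowed utterance lasts about 1.54 times as long, and 35% time compression returns it to approximately its original duration.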
This frequency distortion (shifting) technique is similar to that used by Corliss (1968), Raymond and Proud (1962), Bogart (1956), and Takefuta and Swigart (1968), except that the slow play method preserves the relative relationships between energy concentrations with the frequency shifts.

Kurtzrock (1957) presented 50 monosyllabic words under six conditions of time and frequency distortion. The author found that intelligibility was more sensitive to frequency division or multiplication while retaining normal time, than to the same amount of temporal distortion with frequency held constant. Further, vowels were found to be less intelligible than consonants in the frequency manipulated signal, but were more intelligible than consonants in the time distorted condition.

Slow playing of the speech signal was used by Bennett and Byers (1967) as a means of proportionate frequency shifting. Fifteen adults (mean age 67 years) who exhibited high frequency sensorineural hearing impairments were given the Fairbanks Rhyme test, which had been modified by setting the speed of the tape playback to 100%, 90%, 80%, 70% and 60% of normal speed. Slowing normal speech to about 80% resulted in improved intelligibility, but beyond 80% (that is, 20% frequency shifted) intelligibility decreased. The authors postulated that proportionate frequency shifts should have more "carryover" to non-altered speech than other methods of frequency manipulation, since the manipulated signal is only a relative change from normal speech.

Zemlin (1966) presented normal and frequency lowered-time restored /hV / words to hearing handicapped children (aged 7-13 years) in an attempt to determine if training would result in an increase in scores. The group mean for the undistorted signal was 55%. The initial frequency lowered-time restored words yielded a score of 16% on the first trial and improved to 38% by the fourth and final trial.
Therefore, a 16% increase in score was obtained through the brief period of training.

Vowel intelligibility was studied with respect to degrees of time compression and frequency division with and without time restoration using male and female speakers (Daniloff, Shriner and Zemlin, 1968). Vowels in an /hV / context were presented to twenty college students under different percentages of distortion involving time compressed (TC), frequency divided-time restored (FD-TR), and frequency divided-time distorted (FD-TD) conditions. Time compressed vowels were more intelligible than vowels processed under either condition of frequency distortion. Further, female speakers were found to be more intelligible than males for all conditions of distortion. Results also showed that restoring time to normal while dividing frequency improved intelligibility only slightly.

In another study (Beasley, Shriner and Zemlin, 1960), the Picture Identification for Children--A Standardized Index (PICSI), a measure of speech intelligibility, was tape recorded by a female speaker. This tape recording was then frequency divided by 35% (FD-TD), frequency divided by 35% with time restored (FD-TR), and frequency divided with time compressed by 35% (FD-TC). Fifty school age children with normal hearing were divided into three experimental groups. After receiving the PICSI under normal conditions, each group received one of the experimental conditions described above. The results showed that restoring time to the frequency distorted signal did not significantly affect intelligibility.

Related to frequency shifting as well as pitch are the acoustic characteristics of speech in the helium atmosphere of undersea labs. The formant frequencies of the acoustic signal under such conditions have been altered (Hollywell and Harvey, 1964; Cooke and Beard, 1966) via the slow playback method, and intelligibility has been shown to increase with practice (Hollywell and Harvey, 1964).
Spectrograms of helium speech were constructed (MacLean, 1966) which showed that: (1) the formant shifts were responsible for the change in the quality of helium speech; (2) these shifts were non-linear; (3) the acoustic energy of fricatives was shifted upward; (4) the pitch of the fundamental frequency did not change significantly; and (5) after several days, the change in quality appeared to increase in "naturalness". Beil (1967) reported that the ratios between formants remained constant in spectrograms of vowels produced in helium. He postulated that the change of frequency in helium speech was due to a change in sound velocity in the medium filling the vocal cavity. Stover (1967) reported that the pitch was unaffected by helium; however, the formants were raised by a constant ratio. He further noted that there appeared to be proportional time compression of the helium waveform within the pitch interval. One of Stover's three alternative solutions to these changes was lowering the frequencies by simply slow playing the signal, thereby returning it to a more natural sounding quality. Although this procedure distorted the temporal characteristics of the signal, intelligibility was improved.

AUDITORY TRAINING

Most hearing impaired individuals have some residual hearing (Goldstein, 1914). People with sloping, high frequency sensorineural hearing losses typically retain residual hearing potential in the low frequencies. Listening skills must be developed to maximize usage of such residual hearing. The educational process of auditory training with preschool and school age children is necessary in order that the child may experience as normal a cognitive process as possible (Oyer, 1968).

In the 1920's, with the advent of electronic amplifying devices, interest in the training of residual hearing increased. For example, after World War II, the Veterans Administration implemented aural rehabilitation programs which included such training.
Hudgins (1953, 1954) reported a longitudinal study using three groups of children (n=10) with mild (55-86 dB) and profound (90-98 dB) hearing losses. After six years of auditory training, all three groups showed a gain in performance as measured by visual speech perception and aural/visual perception tasks. In addition, the profoundly deaf students showed better scores with the combined auditory and visual method than with the visual stimuli alone.

Modified acoustic signals may exhibit potential for use in auditory training with hearing impaired children. If the high frequency components of an acoustic signal can be shifted to the range of usable hearing, the child may be able to receive increased acoustic information. That is, a certain degree of the redundant and non-redundant information contained in the high frequencies can be presented at frequencies within the range of low frequency residual hearing of the child. If the child is able to decode this new information, finer auditory discrimination may be possible. A major goal in language habilitation for the deaf, then, could be acquisition of language through the auditory avenue as an improved secondary receiver, or even, perhaps, as a primary receiver.

Thus, shifting the entire frequency spectrum proportionately, and maintaining the relative differences between formants, places more of the acoustic signal within the residual hearing range of individuals with high frequency sensorineural hearing losses characterized by low tone residual hearing. Further, by restoring the time element back to normal through time compression, certain of the prosodic characteristics of the original signal are retained and the "naturalness" of the signal approaches normal.
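The arithmetic behind a frequency shifted-time restored signal can be sketched in a few lines. The example below is only an illustration under stated assumptions, not a description of any particular apparatus: it assumes that an N% frequency shift by slow playback multiplies every frequency by (1 - N/100) and divides duration by the same ratio, and that N% time compression by the sampling method multiplies duration by (1 - N/100). The function names, the formant values (500, 1500 and 2500 Hz), the 0.40 second duration and the interval lengths are hypothetical.

```python
def percent_compression(discard_ms, sampling_ms):
    """Percent time compression for the sampling method: 100 * Id / (Id + Is)."""
    return 100.0 * discard_ms / (discard_ms + sampling_ms)


def slow_playback(freqs_hz, duration_s, percent_shift):
    """Assumed model of slow playback: every frequency is scaled by the
    speed ratio, and the duration is stretched by the inverse of that ratio."""
    ratio = 1.0 - percent_shift / 100.0
    return [f * ratio for f in freqs_hz], duration_s / ratio


def time_compress(duration_s, percent):
    """Assumed model of time compression: 'percent' of the signal is discarded."""
    return duration_s * (1.0 - percent / 100.0)


if __name__ == "__main__":
    formants = [500.0, 1500.0, 2500.0]  # hypothetical formant values in Hz
    shifted, stretched = slow_playback(formants, 0.40, 35.0)
    restored = time_compress(stretched, 35.0)

    print("shifted formants (Hz):", shifted)
    print("ratio of second to first formant:", shifted[1] / shifted[0])
    print("duration after slow playback (s):", round(stretched, 3))
    print("duration after time compression (s):", round(restored, 3))
    # Hypothetical discard and sampling intervals giving 35% compression
    print("percent compression:", percent_compression(7.0, 13.0))
```

Under these assumptions the same percentage applies to both steps: lowering the spectrum by 35% stretches the duration by 1/0.65, and 35% time compression returns it to its original length, while the ratios between formants are unchanged.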
SUMMARY AND STATEMENT OF THE PROBLEM

In summary, a review of the literature suggests that vocoding, filtering, disproportionate frequency shifting and frequency transposition have been used to manipulate the frequency characteristics of speech signals, but with minimal success in maintaining intelligibility and improving comprehension. Frequency shifting by slow playback methods, which proportionately shifts the entire spectrum of acoustic energy while retaining the relative relationship between the formant regions, has been shown to have minimal effects on intelligibility. In addition, it has been shown that the Fairbanks electromechanical method of time compression can be used with the proportionate frequency shifting method to restore the temporal characteristics of the signal.

This study provided information as to the merits of auditory training with the hearing impaired child with little residual hearing. The purpose of the present investigation was to determine if hearing impaired children perform better when auditorily presented with speech stimuli under conditions of 35% frequency shifting with time restored or under conditions of 0% frequency shifting with time restored. Further, the effects of fifteen days of intensive auditory training on the test scores of subjects with low frequency residual hearing were investigated. Specifically, the following questions were investigated relative to the responses of hearing impaired children:

1. For the zero percent frequency shifted-time restored testing condition, would there be a change in intelligibility scores from the pre-test to the post-test for hearing impaired children who received training under the zero percent frequency shifted-time restored condition?
2. For the zero percent frequency shifted-time restored testing condition, would there be a change in intelligibility score from the pre-test to the post-test for the hearing impaired children who received training under the thirty-five percent frequency shifted-time restored condition?

3. For the thirty-five percent frequency shifted-time restored testing condition, would there be a change in intelligibility score from the pre-test to the post-test for the hearing impaired children who received training under the zero percent frequency shifted-time restored condition?

4. For the thirty-five percent frequency shifted-time restored testing condition, would there be a change in intelligibility score from the pre-test to the post-test for the hearing impaired children who received training under the thirty-five percent frequency shifted-time restored condition?

Chapter II

EXPERIMENTAL PROCEDURES

Eighteen children with low frequency residual hearing were presented taped randomizations of words taken from the Word Intelligibility by Picture Identification Test (WIPI) (Ross and Lerman, 1970). The subjects were divided into two matched groups with respect to the condition of training they were to receive. Both groups of subjects received pre-tests and post-tests under both the 0% frequency shifted-time restored condition and the 35% frequency shifted-time restored condition. During training sessions the speech stimuli were presented to the control group at 0% frequency shifted-time restored, and to the experimental group at 35% frequency shifted-time restored.

Subjects

Eighteen subjects, enrolled in a special program for the hearing impaired in a local school district, were selected for study according to current audiometric data. All subjects exhibited sensorineural hearing losses characterized by low tone residual hearing not extending over 2000 Hz at 110 dB (re ANSI, 1969). Subjects ranged in age from 6.4 to 9.8 years, with a mean age of 8.0 years.
All subjects had been in special education classes for the hearing impaired for a period of 4-6 years.[1] Subjects were divided into two groups of nine each and were matched on the following criteria: (1) hearing loss as shown by pure tone thresholds, (2) age, (3) academic achievement as measured by reading and math levels, and (4) teacher and parent information concerning subjects' use of residual hearing (see Appendices A and B for the information form and data used to match the two groups).

[1] Subject D.O. was in a regular classroom for two years and in special education classes for one year.

Stimulus Generation

Four picture plates were randomly selected from 22 of the 25 plates of the WIPI which did not contain ambiguous items (Sanderson and Rintelmann, 1971). Each of the four stimulus plates contained four stimulus words and two foils. All six words of each of the four plates were used as stimulus items. Three items from two additional plates were randomly selected as practice items. The twenty-four items and three practice items were each tape recorded in the context of the carrier phrase, "Mark the word _____", by a trained white female phonetician who spoke general American English, at normal conversational speech and effort levels. The speaker was seated in a single-walled IAC test booth with a Volume Unit (VU) meter in front of her to monitor her vocal intensity level. The complete phrases were recorded at 7½ ips onto a Uher Type 4000 tape recorder via a M 516 microphone.

The master tape recording was played back on an Ampex 858 tape recorder-player and each stimulus phrase was checked to see that all portions were between -3 and 0 on the VU meter. A Hewlett-Packard 4204-A Oscillator was used to generate a 1000 Hz calibration tone, which was spliced to the beginning of the tapes. Graphic level recordings of the taped stimuli were made to determine the level of the calibration tone.
The resulting calibration tone and stimuli were played back via the Ampex 858 tape deck through a B & K 2305 Graphic Level Recorder using QP 1102 paper.

A recording of the master tape was made from the Uher Type 4000 tape recorder to the Varispeech I time compressor. A Beckman 6148 Eput Timer Counter was set to respond to a 100 Hz calibration tone played through the Varispeech I to the Counter. The frequency shifts for a 100 Hz tone with respect to the 35% frequency lowered procedure and the 35% time compressed procedure, respectively, were calculated (amount changed = [formula illegible in source]). Then the percentage compression dial on the Varispeech I was changed until the calculated frequency was shown on the Counter. With the dial setting for the 35% frequency lowered condition and the pitch preservation computer turned off, the stimulus words were taped from the Varispeech I to the Uher Type 4000 tape recorder. The resulting time expanded, frequency lowered stimulus tape was then copied onto the Varispeech I. The dial on the Varispeech I was set at 35% time compressed and the pitch preservation computer was activated to provide a frequency shifted-time restored signal. Five copies of this tape were made, resulting in one master experimental list and four training lists. All five lists were re-checked on the graphic level recorder to ensure that the calibration tone remained correct after the manipulation. Then, the words in all five experimental lists were manually "chop-spliced" into randomized orders. These tapes were labeled Master Experimental Tape (MET), to be used for pre-testing and post-testing, and ME1, ME2, ME3 and ME4, used for training.

To make the 0% frequency shifted-time restored tapes, the master tape was played from the Uher Type 4000 tape recorder to the Varispeech I time compressor. The dial on the Varispeech I was set at zero and the tape was copied from the Varispeech I to the Uher. The tape was then recopied from the Uher to the Varispeech I.
The pitch preserving computer on the Varispeech I was turned on, with the dial set at 0%, and the tape was played to the Uher and copied five times. Each step in making the 0% frequency shifted-time restored tapes was identical to that for making the 35% frequency shifted-time restored tapes, except that the dial on the Varispeech I was set for zero shifting and zero compression. The word order of the first list was left in the original random order. The words of the other four lists were spliced into new random orders. The original randomized list of words was labeled the Master Control Tape (MCT) and used for pre-tests and post-tests. The four other randomizations of the word list were labeled MC1, MC2, MC3 and MC4 and used for training. The levels of these lists were checked on the Graphic Level Recorder to ascertain whether the intensity levels had remained consistent with the calibration tone.

Response booklets to be used during the experiment were constructed and consisted of dittoed sheets of colored pictures which corresponded to the plates of the WIPI. The sheets of colored pictures were covered with contact paper to allow the children to mark on the booklets directly. A booklet was made to correspond to each of the ten taped lists.

Two sets of color slides of the WIPI plates were taken using a 35 mm slide camera. These slides were used as part of the training procedure.

Calibration Procedures

The TDH-39 earphones housed in MX-41/AR biscuit type cushions were connected to the output of the Ampex model AG 600 tape recorder-player (65 to 10,000 Hz) and the output intensity was read on the B & K model 2204 sound level meter using the 4145 condenser microphone and a 6cc coupler. While the stimulus tape was played on the recorder, the intensity level was constant at 60 dB SPL when read on the C-scale at the slow position on the sound level meter.
An Ampex 620 amplifier-speaker was then connected between the Ampex model AG 600 tape recorder-player and the TDH-39 earphones. The attenuator dial on the Ampex 620 amplifier-speaker was then calibrated and marked in five dB increments from 80 dB to 135 dB SPL.

Presentation Procedures

Presentation levels for the experimental stimuli for each subject were determined from audiometric data obtained within the previous 120 days (see Table 1). Pure-tone or warbled-tone averages were used to calculate Speech Reception Thresholds (SRT) using the method described by Carhart (1971) for marked high tone losses. Twenty dB was added to the calculated SRTs to convert them to Sound Pressure Level (SPL). The stimuli were presented 2% dB above the SRT whenever possible. Actual presentation levels were determined by playing the practice items beginning approximately 25 dB below the predicted level and increasing the level in 5 dB steps. The children were asked to tell if it "tickled or hurt" or was "okay". Actual levels were as close to the predicted levels as possible, with no level exceeding 135 dB SPL (re the calibration procedure discussed above).

TABLE 1.--Mean pure tone thresholds and calculated speech reception thresholds for all eighteen subjects in dB HTL.

                               Pure Tone Thresholds (Hz)              Calculated
                          125      250      500      1000     2000       SRT
Mean Intensity (dB HTL)   70       81       91.5     102.5    101.5     114.16
                          (n=5)    (n=15)   (n=17)   (n=18)   (n=13)    (n=18)

As shown in Table 2, mean presentation levels for the group who received the 0% frequency shifted-time restored training and the group who received the 35% frequency shifted-time restored training were similar. The mean calculated presentation level for the group trained under the 0% frequency shifted-time restored condition was 137.3 dB SPL; for the group trained under the 35% frequency shifted-time restored condition, the mean calculated presentation level was 140.3 dB SPL. The average calculated presentation level for all subjects was 138.9 dB SPL. The actual presentation level for the group receiving training under the 0% frequency shifted-time restored condition was 130.3 dB SPL, and was 127.2 dB SPL for the group receiving the 35% frequency shifted-time restored training. The mean actual presentation level for all subjects was 129 dB SPL. The difference between the calculated and actual presentation levels for the group trained under the 0% frequency shifted-time restored condition was 7.0 dB with a range of 0 to 15 dB. The difference between the calculated and actual presentation levels for the group trained under the 35% frequency shifted-time restored condition was 13.1 dB with a range of 0 to 26 dB. The average difference for all subjects was 8.4 dB with a range of 0 to 26 dB.

TABLE 2.--Mean presentation level in dB SPL for subjects trained on 0% frequency shifted-time restored stimuli, subjects trained on 35% frequency shifted-time restored stimuli and the mean presentation level for all subjects.

                                           Presentation Level in dB SPL
                                                                          Range of
                                      Calculated   Actual   Difference   Difference
Training at 0% frequency shifted-
  time restored (n=9)                   137.3      130.3       7.0         0-15
Training at 35% frequency shifted-
  time restored (n=9)                   140.3      127.2      13.1         0-26
All Subjects (n=18)                     138.9      129.0       8.4         0-26

All subjects were pre-tested and post-tested with the 35% frequency shifted-time restored and the 0% frequency shifted-time restored WIPI stimulus items. Order of presentation of the pre-tests and post-tests was the same for each subject but counterbalanced between subjects.

Subjects were divided into two groups.
The experimental group (n=9) received training on the 35% frequency shifted-time restored tapes (ME1, ME2, ME3, ME4); the control group (n=9) received training on the 0% frequency shifted-time restored tapes (MC1, MC2, MC3, MC4). Order of presentation of training lists was randomized. The training period consisted of twenty minutes each day for fifteen days (total: 300 minutes).

All conditions of tapes were presented to the subjects via the Ampex model AG 600 tape recorder-player through an Ampex 620 amplifier-speaker coupled to TDH-39 earphones housed in MX-41/AR biscuit type cushions. A Kodak Carousel slide projector and projection screen were used to present the visual stimuli during the training sessions. Figure 1 shows a block diagram of the equipment.

FIGURE 1.--Schematic representation of the listening situation and apparatus during stimulus presentation. [Block diagram: Ampex Model AG 600 tape recorder-player, Ampex amplifier-speaker, TDH-39 earphones, listener, experimenter, Kodak Carousel slide projector, projection screen.]

The rooms used for testing provided minimal visual distractions and low noise levels. Noise measurements at the level of the subject's ear were made using the C-scale of the B & K 2204 sound level meter set on the slow response position coupled to a 4145 condenser microphone, and were found to range from 64 dB to 74 dB SPL.

Instructions were given to all subjects in a method appropriate to the individual child (see Appendix C). Each child was verbally reinforced on a random basis. Children were reminded to listen carefully if their attention seemed to wander.

Training sessions were divided into two parts. In the first part, a randomized list of stimulus words was presented to the subject through the earphones. As each item was presented aurally, the subject viewed a slide of the same word which was projected onto a screen several seconds prior to the aural stimulus. After hearing the item, the subject was expected to repeat the word. If the word was spoken incorrectly by the subject, the experimenter repeated the word for the child to imitate. The second part of each training session consisted of listening to the same randomization of the words. This time the subject was expected to respond by circling the appropriate picture in the response booklet. The experimenter recorded the correct word and the individual's response on the subject response forms. Thus, there were twenty-four possible correct responses, and the number correct for each subject out of twenty-four possible items was recorded.

Chapter III

RESULTS

Increases in scores from pre-testing to post-testing were observed under both training conditions. Results showed that subjects who received the 35% frequency shifted-time restored training scored significantly higher on both the 0% frequency shifted-time restored post-tests and the 35% frequency shifted-time restored post-tests. Subjects who received the 0% frequency shifted-time restored training showed only slight improvements on the 0% frequency shifted-time restored post-tests and the 35% frequency shifted-time restored post-tests.

Table 3 shows the percentage correct scores for pre-tests and post-tests for both conditions of testing (0% and 35%) and for both training groups (35% and 0%). The group of children who received the 35% frequency shifted-time restored training showed an increase in average percentage correct scores from 20.8% on the 35% frequency shifted-time restored pre-test to 35.1% on the 35% frequency shifted-time restored post-test, resulting in a 14.3% increase. The mean percentage correct score for the same subjects, trained with the 35% frequency shifted-time restored tapes, showed an increase from 22.2% on the 0% frequency shifted-time restored pre-test to 34.7% on the 0% frequency shifted-time restored post-test, resulting in an increase of 12.5%.
Thus, for the group who received training on the 35% frequency shifted-time restored tape, improvements in percentage correct scores of 14.3% and 12.5% were found on the 35% and 0% frequency shifted-time restored tests, respectively.

TABLE 3.--Mean percent correct scores and ranges for the pre-tests and post-tests under 0% frequency shifted-time restored and 35% frequency shifted-time restored testing conditions for the group trained on the 35% frequency shifted-time restored signal and for the group trained on the 0% frequency shifted-time restored signal.

                                 0% frequency shifted-     35% frequency shifted-
                                 time restored pre-tests   time restored pre-tests
                                 and post-tests            and post-tests
EXPERIMENTAL TRAINING
35% frequency shifted-
time restored
  Pre-test                       22.2 (8.3-45.8)           20.8 (8.3-24)
  Post-test                      34.7 (12.5-83.5)          35.1 (0-91.7)
  Difference                     12.5                      14.3
CONTROL TRAINING
0% frequency shifted-
time restored
  Pre-test                       29.6 (12.5-58.3)          19.4 (8.3-45.8)
  Post-test                      31.0 (8.3-87.5)           24.0 (8.3-62.5)
  Difference                      1.3                       4.6

Referring again to Table 3, the mean percentage correct scores for the group of subjects receiving training on the 0% frequency shifted-time restored tape showed a change from 19.4% on the 35% frequency shifted-time restored pre-test to 24% on the 35% frequency shifted-time restored post-test, reflecting a 4.6% difference. The mean percentage correct scores for this same group showed an increase from 29.6% on the 0% frequency shifted-time restored pre-test to 31% on the 0% frequency shifted-time restored post-test, an increase of 1.3%. Thus, the mean percentage correct scores for the group trained on the 0% frequency shifted-time restored condition showed an improvement of 4.6% and 1.3% for the testing conditions of 35% and 0% frequency shifted-time restored, respectively.

Comparison of the mean percentage correct scores on the 0% frequency shifted-time restored pre-tests and post-tests for both training groups shows that both groups increased.
However, while the group trained with the 0% frequency shifted-time restored condition demonstrated an increase in mean percentage correct score from 29.6% to 31.0% (an increase of 1.3%), the group trained on the 35% frequency shifted-time restored condition demonstrated an increase from 22.2% to 34.7% (a 12.5% increase).

Mean percentage correct scores for both training groups show an increase from pre-test to post-test with the 35% frequency shifted-time restored tests. The mean percentage correct scores for the group trained on the 35% frequency shifted-time restored condition show an increase from 20.8% on the 35% frequency shifted-time restored pre-test to 35.1% on the 35% frequency shifted-time restored post-test (a 14.3% increase). The mean percentage correct scores for the group with training on the 0% frequency shifted-time restored condition increased from 19.4% to 24.0% between the 35% frequency shifted-time restored pre-tests and post-tests, resulting in a 4.6% increase.

Figure 2 depicts the average increase in scores with respect to days of training for the group trained under the 35% frequency shifted-time restored condition. For both the 0% frequency shifted-time restored testing and the 35% frequency shifted-time restored testing, the mean items correct scores between pre-testing and post-testing, for subjects trained under the 35% frequency shifted-time restored condition, showed an increase. Figure 2 also shows the average item increase to be 3.4 items between pre-testing and post-testing on the 35% frequency shifted-time restored tests and an average item increase of 3.01 between the pre-testing and post-testing on the 0% frequency shifted-time restored tests.

Figure 3 shows the change in mean scores with respect to days of training for the group who received the 0% frequency shifted-time restored training condition. The change in scores as related to days of training appeared to fluctuate around a mean of 7.05 items correct.
The mean number of items correct changed from 7.11 to 7.44, a difference of .33, between the 0% frequency shifted-time restored pre-tests and post-tests. However, between the 35% frequency shifted-time restored pre-test and the 35% frequency shifted-time restored post-test, the mean number of items correct changed from 4.66 to 5.77, a change of 1.11 items.

FIGURE 2.--Mean item scores for pre-tests, post-tests, and each day of training under the 35% frequency shifted-time restored training condition. [Graph: items correct, mean scores (N=9), plotted against training days.]

FIGURE 3.--Mean item scores for pre-tests, post-tests, and each day of training under the 0% frequency shifted-time restored training condition. [Graph: items correct, mean scores (N=9), plotted against training days.]

Chapter IV

DISCUSSION

Many investigators have compared the effects of frequency and temporal manipulation on the speech signal using various intelligibility measures. In an attempt to overcome the variable of the unfamiliarity of the manipulated signal, certain researchers have suggested the use of training sessions (Raymond and Proud, 1962; Johansson, 1966; Ling, 1968; Zemlin, 1967). The potential use of frequency shifting by the slow play method with time restoration using the Fairbanks electromechanical method has been suggested by the work of Zemlin (1967) and Bennett and Byers (1967). The use of frequency shifted-time restored manipulated signals with hearing impaired adults (Bennett and Byers, 1967) and children (Zemlin, 1967) has been shown to produce increases in intelligibility scores. Zemlin (1967) reported a 16% increase in intelligibility scores over four trials.
The results found in this study agree with the data reported by Zemlin (1967) and Bennett and Byers (1967), that is, that training with frequency shifted-time restored signals improved discrimination scores.

In the present investigation, mean correct percentage scores for subjects who received training with the 35% frequency shifted-time restored signal improved 14.3% from the 35% frequency shifted-time restored pre-test to the 35% frequency shifted-time restored post-test, and 12.5% from the 0% frequency shifted-time restored pre-test to the 0% frequency shifted-time restored post-test. Thus, subjects who received training on the 35% frequency shifted-time restored tapes showed general improvement in their mean correct percentage scores. Subjects who received training on the 0% frequency shifted-time restored signal improved 4.6% from the 35% frequency shifted-time restored pre-test to the 35% frequency shifted-time restored post-test, and 1.3% from the 0% frequency shifted-time restored pre-test to the 0% frequency shifted-time restored post-test. While the group trained on 35% frequency shifted-time restored signals improved significantly from both the 0% frequency shifted-time restored and 35% frequency shifted-time restored pre-tests to the 0% frequency shifted-time restored and 35% frequency shifted-time restored post-tests, the group trained on the 0% frequency shifted-time restored signals improved only slightly. The data, then, suggested that training with the 35% frequency shifted-time restored tapes led to improvement on the post-tests for this group.

Comparison of mean percent correct scores for both training groups between the 0% frequency shifted-time restored pre-test and the 0% frequency shifted-time restored post-test can be made. Subjects who were trained on the 0% frequency shifted-time restored tapes showed a mean percentage correct increase of only 1.3%; however, the group trained on the 35% frequency shifted-time restored tapes showed a 12.5% mean correct increase.
Thus, training on the 0% frequency shifted-time restored tapes did not appear to result in improved 0% frequency shifted-time restored post-test scores.

Mean percentage correct scores on the 35% frequency shifted-time restored pre-test and post-test showed that both the 35% frequency shifted-time restored trained group and the 0% frequency shifted-time restored trained group improved between initial and final testing. The 35% frequency shifted-time restored trained group improved 14.3% from the 35% frequency shifted-time restored pre-test to the 35% frequency shifted-time restored post-test, while the 0% frequency shifted-time restored trained group improved only 4.6%. The 35% frequency shifted-time restored trained group demonstrated a greater improvement (14.3%) in mean percentage correct score on the 35% frequency shifted-time restored tests than was shown under any of the other testing conditions. The 4.6% increase in mean percentage correct score for the group trained under the 0% frequency shifted-time restored condition could be attributed to training, or to the fact that the 35% frequency shifted-time restored pre-test score was the lowest percentage correct score for all conditions of pre-testing.

Thus, results indicated that training under the 35% frequency shifted-time restored condition improved the mean percentage correct score significantly for both testing conditions. The scores for the group trained under the 0% frequency shifted-time restored condition did not demonstrate significant increases for either testing condition. It is interesting to note, however, that the greatest improvement in mean percentage correct score for the group who received the 0% frequency shifted-time restored training was found on the 35% frequency shifted-time restored testing condition.
This 4.6% improvement in mean percentage correct score could be due to an increase in familiarity with the manipulated signal on the 35% frequency shifted-time restored post-test because of learning on the 35% frequency shifted-time restored pre-test. The improvement could also indicate that there is some "carryover" in learning from the 0% frequency shifted-time restored training. However, this is doubtful, since only a 1.3% increase was shown between the 0% frequency shifted-time restored pre-test and the 0% frequency shifted-time restored post-test. The 12.5% increase in mean percentage correct score between the 0% frequency shifted-time restored pre-test and the 0% frequency shifted-time restored post-test may indicate a "carryover" from learning on the 35% frequency shifted-time restored training tapes. Bennett and Byers (1967) suggested that learning on proportionately frequency shifted signals should "carry over" to unconverted signals more easily than learning on disproportionate frequency shifts, such as those found with frequency transposition.

Figure 2 shows the mean item scores for all days of training, from pre-tests to post-tests, for the subjects who were trained under the 35% frequency shifted-time restored condition. The function for these subjects showed a sharp improvement during the first six days of training and then appeared to plateau for the remaining nine days. In addition, both post-test scores were close to the average score on day five. Thus, it appeared that the mean effects of training reached a maximum on day five, indicating that fifteen days of training may not have been necessary. Figure 3 shows that the function for the subjects who received 0% frequency shifted-time restored training was lower during the first three days of training than the 0% frequency shifted-time restored pre-test mean item score, and fluctuated at this level throughout training.
The fact that the first three days produced lower mean percentage correct scores than the pre-test may indicate that the score on the 0% frequency shifted-time restored pre-test was unusually high. The mean effect of training appeared to have reached a maximum on the third day of training for the 0% frequency shifted-time restored trained group. Again, it should be noted that fifteen days of training may have been excessive.

Although mean percentage correct scores indicated that fifteen days of training may not have been necessary, some individual subjects did show steady increases in score throughout the testing. However, other subjects appeared to exhibit a "rollover" phenomenon, whereby, after a peak had been reached, a decrease in item correct scores was observed. This decrease could be attributed to boredom with the same stimuli, or could indicate that confusions were more apt to occur with continued training. In addition, confusions may have arisen as the child learned some of the words with which he had previously been unfamiliar. The acoustic information that the child's distorted hearing mechanism decoded from several stimulus items may have been perceived as being the same or highly similar. For example, a child who consistently recognized the word "bee" may have been confused when required to discriminate between a previously unknown stimulus item such as "beans" and a familiar item such as "bee".

Several graphic representations of individual subjects' item correct scores were included to point out observable trends in the data. For the group trained under the 35% frequency shifted-time restored condition, two patterns were evident. The more prevalent pattern is seen in Figure 4. This subject's item correct scores increased daily to day five, declined, improved at least once more, and declined near the end of training.
Two rises and declines in score were observed in most individual functions. Figure 5 exhibits the two rises, with no decline after the second rise. Only a few subjects continued to show consistent increases in score throughout the training.

Two patterns were readily observable in the item correct scores of the group trained under the 0% frequency shifted-time restored condition. The more prevalent pattern is seen in Figure 6. This pattern shows consistent variance around a mean, while generally not showing a progressive improvement. Figure 7 shows a consistent increase in item correct score throughout the training, and was typical of only a few individuals.

FIGURE 4.--Individual function of item correct scores per day with 35% frequency shifted-time restored training. [Plot not reproduced. Subject FP. Vertical axis: items correct. Horizontal axis: training days.]

FIGURE 5.--Individual function of item correct scores per day with 35% frequency shifted-time restored training. [Plot not reproduced. Subject KS. Vertical axis: items correct. Horizontal axis: training days.]

FIGURE 6.--Individual function of item correct scores per day with 0% frequency shifted-time restored training. [Plot not reproduced. Subject MR. Vertical axis: items correct. Horizontal axis: training days.]

FIGURE 7.--Individual function of item correct scores per day with 0% frequency shifted-time restored training. [Plot not reproduced. Subject MM. Vertical axis: items correct. Horizontal axis: training days.]

[...] was seen for the group who received training under the 35% frequency shifted-time restored condition. These data indicated that the 35% frequency shifted-time restored training significantly affected the scores on the post-tests. The 35% frequency lowering may have shifted more acoustic information, necessary for consonant discrimination, into the range of the subjects' residual hearing. Since most consonantal energy concentrations were within the high frequency range, it is possible that some acoustic information which was available above 2000 Hz in undistorted speech had been proportionately lowered by the slow play method of frequency manipulation. The proportionate frequency shift, which maintained the relative relationship between formant energy regions, was postulated by Zemlin to be responsible for the retention of intelligibility. If this additional acoustic information could be provided for hearing impaired subjects with low frequency residual hearing, the limits of their usable residual hearing for speech would be "expanded".

Since the data reported in this study may also imply that a "carryover" phenomenon may occur, incorporation of frequency shifted-time restored stimuli should be considered for auditory training programs.

Implications for Future Research

Since the data from this study have shown that 35% frequency shifted-time restored training can provide increases in mean percentage correct scores dealing with intelligibility, further research in this area is indicated.

After the frequency and temporal manipulations had been made on the 35% frequency shifted-time restored tapes, in certain instances "pie" and "pipe" were perceived as "kie" and "kipe". Since the perceived conversion of the initial /p/ to /k/ did not occur in all the recordings of the words, it was assumed to be a random result of the electromechanical time compression procedures. Further research is indicated to investigate whether these changes are random and, further, whether there are any consistent patterns to be observed.

The small item increases in score were probably due to the difficulty of the task, which depended on consonant identification.
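
The argument that a proportionate lowering brings consonantal energy from above 2000 Hz into the subjects' residual range can be illustrated numerically. The sketch below assumes that a 35% shift multiplies every frequency by 0.65. The 2000 Hz value is the upper limit of measurable hearing reported for these subjects; the two formant-like frequencies and the fixed offset are hypothetical and serve only to show that ratios between energy regions are unchanged by proportionate lowering, but not by lowering every component by a fixed number of cycles.

```python
SHIFT_PERCENT = 35          # shift condition used in this study
HEARING_LIMIT_HZ = 2000.0   # no measurable hearing above this frequency


def lowered(frequency_hz, shift_percent=SHIFT_PERCENT):
    """Proportionate lowering, assumed to scale by (1 - shift/100)."""
    return frequency_hz * (1.0 - shift_percent / 100.0)


def highest_audible_source(limit_hz=HEARING_LIMIT_HZ,
                           shift_percent=SHIFT_PERCENT):
    """Highest original frequency that lands at or below the limit."""
    return limit_hz / (1.0 - shift_percent / 100.0)


# Hypothetical energy regions and fixed shift, for illustration only.
f1, f2 = 500.0, 1500.0
fixed_offset = 200.0

print("highest source frequency reaching the residual range (Hz):",
      round(highest_audible_source()))
print("ratio before and after proportionate lowering:",
      f2 / f1, lowered(f2) / lowered(f1))
print("ratio after a fixed shift:",
      (f2 - fixed_offset) / (f1 - fixed_offset))
```

Under these assumptions, energy originally located as high as approximately 3077 Hz would fall at or below 2000 Hz, and the ratio between the two illustrative regions stays at 3, whereas the fixed shift changes it to about 4.33.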
Easier auditory discrimination tasks could be provided by using spondaic words or sentences as stimulus items. Further, carefully controlled sentences would provide more acoustic information for the child to use when discriminating (Hirsh, 1952). Although synthetic sentences would provide minimal contextual cues, daily use of such material could prove to be damaging to the child's language development. Sentences could be constructed so that they varied on one or several of the parameters used in auditory discrimination. Any stimulus providing more acoustic cues would have to be very carefully controlled in future research.

An improvement in the speech quality of the test items was noticed, but was not rated because of the lack of an objective method of quantifying the data. Future research should be designed to investigate this phenomenon in a more thoroughly controlled fashion.

Although every effort was made to eliminate experimenter bias, the fact remains that this experimenter was very familiar with the subjects. Further research might consider the possibility of using an experimenter who is unfamiliar with the subjects.

LIST OF REFERENCES

Beasley, D.S., Schwimmer, S., and Rintelmann, W.F., Intelligibility of time compressed CNC monosyllables. J. Speech Hearing Res., June (1972)

Beasley, D.S., and Shriner, T.H., Auditory analysis of time varied sentential approximations. Journal of Audiology, 12, 262-271 (1973)

Beil, P., Frequency analysis of vowels produced in helium-rich atmosphere. J. Acoust. Soc. Amer., 34, 347-349 (1967)

Bennett, D., and Byers, V., Increased intelligibility in the hypacusic by slow-play frequency transposition.

Bogert, B., The vobanc--a two to one speech band-width reduction system. J. Acoust. Soc. Amer., 28, 399-404 (1956)

Cooke, J., and Beard, S., Speech intelligibility for space vehicles using nitrogen or helium as the inert gas. J. Acoust. Soc. Amer., 40, 1450-1453 (1966)

Corliss, E., Burnett, E., Kobal, M.,
and Bassin, M., The relative importance of frequency distortion and changes in time constants in the intelligibility of speech. IEEE Transactions on Audio-Electro., AU-15, 36-39 (1968)

Daniloff, R., Shriner, R., and Zemlin, W., The intelligibility of vowels altered in duration and frequency. J. Acoust. Soc. Amer., 44, 700-707 (1968)

David, E., Naturalness and distortion in speech processing devices. J. Acoust. Soc. Amer., 28, 586-589 (1956)

Fairbanks, G., Everitt, W., and Jaeger, R., Methods for time or frequency compression-expansion of speech. Trans. I.R.E.-P.G.A., AU-2, 7-12 (1954)

Fairbanks, G., Guttman, N., and Miron, M., The effects of time compression on the auditory comprehension of spoken messages. Unpub. Res. Rep., Contract No. AF 18(600)-1059 (1956)

Fairbanks, G., and Kodman, F., Word intelligibility as a function of time compression. J. Acoust. Soc. Amer., 29, 636-641 (1957)

Foulke, D., Comparison of comprehension of two forms of compressed speech. J. Except. Child., 33, 169-173 (19 )

Garvey, W., The intelligibility of speeded speech. J. Exp. Psychol., 45, 102-108 (1963)

Gold, B., Techniques for speech bandwidth compression, using combinations of channel vocoders and formant vocoders. J. Acoust. Soc. Amer., 38, 2-10 (1965)

Gold, B., and Radar, C., The channel vocoder. IEEE Transactions of Audio and Electro., AU-15, 148-160 (1967)

Golden, M., Digital computer simulation of a sampled-data voice excited vocoder. J. Acoust. Soc. Amer., 35, 1358-1367 (1963)

Goldstein, Max A., An acoustic method. Volta Review, 22, 716-719 (1920)

Guttman, N., and Nelson, J., An instrument that creates some artificial speech spectra for the severely hard of hearing. Amer. Ann. of the Deaf, 113, 295-302 (1968)

Hollywell, N., and Harvey, Helium speech. J. Acoust. Soc.
Amer., 36, 210-211 (1964)

Hudgins, C.V., Auditory training--its possibilities and limitations. Volta Review, 56, 339-349 (1954)

Johansson, B., The use of the transposer for the management of the deaf child. Intern. Aud., 5, 363-372 (1966)

Kryter, K., Speech bandwidth compression through spectrum selection. J. Acoust. Soc. Amer., 32, 547-556 (1960)

Kurtzrock, G., The effects of time and frequency distortion upon word intelligibility. Ph.D. Dissert., Champaign: Univ. of Ill. (1957). Also, Speech Mon., 24, 94 (1957)

Ling, D., Three experiments on frequency transposition. Amer. Ann. of the Deaf, 113, 283-294 (1968)

Ling, D., and Druz, W., Transposition of high frequency sounds by partial vocoding of the speech spectrum: Its use by deaf children. J. Aud. Res., 7, 133-144 (1967)

Ling, D., and Doehring, D.G., Learning limits of deaf children for coded speech. J. Speech Hearing Res., 12, 83-94 (1969)

McLean, D., Analysis of speech in a helium-oxygen mixture under pressure. J. Acoust. Soc. Amer., 40, 625-627 (1966)

Miller, G., and Licklider, J., The intelligibility of interrupted speech. J. Acoust. Soc. Amer., 22, 167-173 (1950)

Oyer, H.J., Auditory Communication for the Hard of Hearing. Prentice-Hall, Inc., Englewood Cliffs, N.J. (1966)

Peterson, G.E., and Barney, H.L., Control methods used in a study of the vowels. J. Acoust. Soc. Amer., 24, 175-184 (1952)

Pimenow, L., L'application de la parole synthétique dans la correction auditive. Acustica, 12, 285-290 (1962); English translation, J. Aud. Res., 3, 73-82 (1963)

Raymond, T., and Proud, G., Audiofrequency conversion. Arch. of Oto-Laryngol., 76, 436-446 (1962)

Sanderson, M.E., and Rintelmann, W.F., Articulation functions and test-retest performance of normal hearing children on three speech discrimination tests. Paper presented at the annual convention of the American Speech and Hearing Association on November 17, 1971 in Chicago, Illinois.

Schroeder, M., Vocoders: analysis and synthesis of speech. Proc.
IEEE, AU-54, 720-734 (1966)

Shriner, R., Beasley, D., and Zemlin, W., Effects of frequency divided speech signals on identification accuracy and reaction time measures. J. Speech Hearing Res., 12, 413-421 (1969)

Stover, W., Time-domain bandwidth-compression system. J. Acoust. Soc. Amer., 42, 348-359 (1967)

Takefuta, Y., and Swigart, E., Intelligibility of speech signals spectrally compressed by a sampling-synthesizing technique. IEEE Transactions on Audio-Electro., AU-16, 271-274 (1968)

Vilbig, F., and Haase, K., Some systems of speech band compression. J. Acoust. Soc. Amer., 28, 573-577 (1956)

Voiers, R., The present state of digital vocoding technique: A diagnostic evaluation. IEEE Trans. of Audio-Electro., AU-16, 275-279 (1968)

Williams, C., and Hecker, M., Relation between intelligibility scores for test methods and three types of speech distortion. J. Acoust. Soc. Amer., 44, 1002-1006 (1968)

Zemlin, W., The use of bandwidth and time compression for the hard of hearing handicapped. Proc. Louisville Conf. Time Compressed Speech, E. Foulke, ed. Louisville: University of Louisville (1967)

APPENDICES

APPENDIX A

INFORMATION FORM

Subject ______    Date ______
Name ______    Birthdate ______    Age ______    Sex ______
Number of years in school (including current year) ______

Check one:
    Self-contained special classroom
    Team teaching special classroom
    Integrated team teaching classroom

Educational Achievement:
    Reading Book ______    Page ______
    Math Book ______    Page ______

Rating of Speech Skills: Check the appropriate skill and qualify using Key. Add additional comments if necessary.
    Key: NI-not intelligible
         IE-intelligible to listener familiar with the child
         IA-intelligible to anyone
    No spontaneous vocalization
    Little spontaneous vocalization
    Spontaneous words
    Spontaneous phrases
    Spontaneous sentences
    Spontaneous connected language

Rating of Auditory Skills: Circle one.
    Primary avenue of learning:    Auditory    Visual

Intelligence or Achievement Testing: Name of tests, date and scores.

Subject ______

Auditory Skills: Please rate all items the child can do auditorily.
    Key: C=consistently    I=inconsistently

Aural Discrimination of Voiced Stimulus

    Varying Length                        Similar Length
    number of items    rating             number of items    rating
    sentences                             sentences
    phrases                               phrases
    words (number of syllables)           words:
                                              2 syllable
                                              1 syllable (non-rhyming)
                                              1 syllable (rhyming)

Expanding aural receptive language:

Amplification:
    Type of Hearing Aid ______    monaural, binaural
    Date child received first hearing aid ______
    Number of years using amplification ______
    Does the child wear amplification at school? ______    What kind? ______    Number of hours per day ______
    Does the child wear amplification at home? ______    Number of hours per day: weekdays ______    weekends ______
    Has the child had his hearing aid consistently since he first got it? ______
    Have there been any long periods (a week or more) when the child's aid was in for repairs? ______    Explain ______

APPENDIX B

INFORMATION USED TO MATCH SUBJECTS

CONTROL GROUP    EXPERIMENTAL GROUP

[Table not legible in source: information used to match subjects, control group.]