This is to certify that the dissertation entitled

THE USE OF INTERAURAL PARAMETERS DURING INCOHERENCE DETECTION IN REPRODUCIBLE NOISE

presented by MATTHEW JOSEPH GOUPELL has been accepted towards fulfillment of the requirements for the Doctoral degree in Physics and Astronomy.

THE USE OF INTERAURAL PARAMETERS DURING INCOHERENCE DETECTION IN REPRODUCIBLE NOISE

By Matthew Joseph Goupell

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Physics and Astronomy

2005

ABSTRACT

Interaural incoherence is a measure of the dissimilarity of the signals in the left and right ears. It is important in a number of acoustical phenomena, such as a listener's sensation of envelopment and apparent source width in room acoustics, speech intelligibility, and binaural release from energetic masking. Humans are incredibly sensitive to the difference between perfectly coherent and slightly incoherent signals; however, the nature of this sensitivity is not well understood.
The purpose of this dissertation is to understand what parameters are important to incoherence detection. Incoherence is perceived to have time-varying characteristics. It is conjectured that incoherence detection is performed by a process that takes this time dependency into account.

Left-ear/right-ear noise-pairs were generated, all with a fixed value of interaural coherence, 0.9922. The noises had a center frequency of 500 Hz, a bandwidth of 14 Hz, and a duration of 500 ms. Listeners were required to discriminate between these slightly incoherent noises and diotic noises, which have a coherence of 1.0. It was found that the value of interaural incoherence itself was an inadequate predictor of discrimination. Instead, incoherence was much more readily detected for those noise-pairs with the largest fluctuations in interaural phase and level differences (as measured by the standard deviation).

Noise-pairs with the same value of coherence and a geometric mean frequency of 500 Hz were also generated for bandwidths of 108 Hz and 2394 Hz. It was found that, with increasing bandwidth, fluctuations in interaural differences varied less between different noise-pairs and that detection performance varied less as well. The results suggest that incoherence detection is based on the size and the speed of interaural fluctuations and that the value of coherence itself predicts performance only in the wide-band limit, where different particular noises with the same incoherence have similar fluctuations.

Noise-pairs with short durations of 100, 50, and 25 ms, a bandwidth of 14 Hz, and a coherence of 0.9922 were used to test whether a short-term incoherence function is used in incoherence detection. It was found that listeners could make significant use of fluctuations of phase and level to detect incoherence for all three of these short durations. Therefore, a short-term coherence function is not used to detect incoherence.
For the smallest duration of 25 ms, listeners' detection cue sometimes changed from a “width” cue to a lateralization cue.

Modeling of the data was performed. Ten different binaural models were tested against detection data for 14-Hz and 108-Hz bandwidths. These models included different types of binaural processing: independent interaural phase and level differences, lateral position, and short-term cross-correlation. Several preprocessing features were incorporated in the models: compression, temporal averaging, and envelope weighting. For the 14-Hz bandwidth data, the most successful model assumed independent centers for interaural phase and interaural level processing, and this model correlated with detectability at r = 0.87. That model also described the data best when it was assumed that interaural phase fluctuations and interaural level fluctuations contribute approximately equally to incoherence detection. For the 108-Hz bandwidth data, detection performance varied much less among different waveforms, and the data were less able to distinguish between models.

This work is dedicated to my amazing wife Sarah and loving family.

ACKNOWLEDGEMENTS

Along my long road of schooling, I have had many mentors. I might have become a biologist, chemist, or mathematician if it had not been for my high school physics teacher. Jim Gormley opened my eyes to the world of physics and I am truly grateful to him for that.

Most memorable of my mentors were my professors at Hope College, who allowed my mind to open to the love of learning. They helped me stay on track with my studies, like Dr. Paul DeYoung. They gave me encouragement when I was down, like Dr. Graham Peaslee. They gave me higher goals to strive for, like Dr. Cathy Mader. They were good friends, like Dr. Tim Pennings.

At Michigan State University, I have received as much support as I could hope for during graduate school. My advisor, Dr.
William Hartmann, has been very influential in my personal work ethic, my scholarly preparation, and my newly acquired love for the field of psychoacoustics. If nothing else, I think he and I were the perfect match for advisor and student. I would also like to acknowledge Dr. Steve Colburn, Dr. Nat Durlach, and Dr. Brad Rakerd for providing invaluable insights that I used to complete this work. Dr. Hayder Radha and Dr. Karim Oweiss also deserve mention for our discussions on stochastic systems and signal detection theory, topics that I have found very useful in my dissertation.

TABLE OF CONTENTS

LIST OF TABLES ... viii
LIST OF FIGURES ... x
KEY TO SYMBOLS ... xiii
KEY TO ABBREVIATIONS ... xvi
INTRODUCTION ... 1

1. INCOHERENCE AND FLUCTUATIONS OF INTERAURAL PARAMETERS
1.1 Experiment 1: Narrow Bandwidth ... 13
1.2 Experiment 2: Critical Bandwidth ... 35
1.3 Experiment 3: Wide Bandwidth ... 44
1.4 Experiment 4: The Role of Bandwidth ... 53
1.5 Experiment 5: The Effect of Monaural Cues ... 63
1.6 Discussion and Conclusion ... 68

2. FLUCTUATIONS WITH VARIED COHERENCE
2.1 Experiment 6: Narrow Bandwidth ... 76
2.2 Experiment 7A: Critical Bandwidth ... 81
2.3 Experiment 7B: Critical Bandwidth - Temporal Averaging ... 88
2.4 Experiment 7C: Critical Bandwidth - TI Trading ... 92
2.5 Experiment 8: Wide Bandwidth ... 95
2.6 Experiment 9: The Role of Bandwidth ... 98
2.7 Experiment 10: The Effect of Monaural Cues ... 106
2.8 Experiment 11: Contextual Effects ... 109
2.9 Conclusion ... 115

3. EFFECTS OF DURATION ON INCOHERENCE DETECTION
3.1 Experiment 12: Narrow Bandwidth ... 116

4. BINAURAL MODELING: FIXED COHERENCE
4.1 Experiment 13: Narrow Bandwidth ... 134
4.2 Models for Incoherence Detection ... 136
4.3 Experiment 14: Critical Bandwidth ... 158
4.4 Multiple Parameter Model ... 172
4.5 Discussion and Conclusion ... 173

5. BINAURAL MODELING: VARIED COHERENCE
5.1 Experiment 15: Narrow Bandwidth ... 177
5.2 Models for Incoherence Detection ... 179
5.3 Experiment 16: Critical Bandwidth ... 190
5.4 Conclusion ... 208

6. PROBING MODELS WITH SELECTED NOISE-PAIRS
6.1 Experiment 17: Minimum Fluctuation Sets ... 209
6.2 Experiment 18: Addition and Cancellation Sets ... 220

7. CONCLUSIONS ... 225
APPENDICES ... 229
BIBLIOGRAPHY ... 257

LIST OF TABLES

Chapter 1
Table 1: Signal structure percentages ... 19
Table 2: 14-Hz p-values ... 26
Table 3: 14-Hz listener correlations, phase set ... 31
Table 4: 14-Hz listener correlations, level set ... 32
Table 5: 108-Hz p-values ... 42
Table 6: 108-Hz listener correlations, phase set ... 42
Table 7: 108-Hz listener correlations, level set ... 43
Table 8: 2394-Hz p-values ... 50
Table 9: 2394-Hz listener correlations, phase set ... 50
Table 10: 2394-Hz listener correlations, level set ... 51
Table 11: Matched p-values ... 55
Table 12: Matched listener correlations, phase set ... 60
Table 13: Matched listener correlations, level set ... 61

Chapter 2
Table 14: 14-Hz p-values ... 80
Table 15: 14-Hz fixed vs. varied coherence ... 80
Table 16: 136-Hz p-values ... 86
Table 17: 136-Hz fixed vs. varied coherence ... 87
Table 18: 136-Hz p-values - temporal averaging ... 91
Table 19: 136-Hz p-values - TI trading ... 94
Table 20: 2394-Hz p-values ... 97
Table 21: 2394-Hz fixed vs. varied coherence ... 97
Table 22: Matched p-values ... 100
Table 23: Matched fixed vs. varied coherence ... 105

Chapter 3
Table 24: Moments and correlation for short-duration noise-pairs ... 121
Table 25: Short-duration p-values ... 128

Chapter 4
Table 26: 14-Hz free parameter values $P_c$ ... 152
Table 27: 14-Hz free parameter values CAS ... 153
Table 28: 108-Hz free parameter values $P_c$ ... 165
Table 29: 108-Hz free parameter values CAS ... 166

Chapter 5
Table 30: 14-Hz free parameter values $P_c$ ... 185
Table 31: 14-Hz free parameter values CAS ... 186
Table 32: 136-Hz free parameter values $P_c$ - independent center ... 199
Table 33: 136-Hz free parameter values CAS - independent center ... 200
Table 34: 136-Hz free parameter values $P_c$ - lateral position ... 203
Table 35: 136-Hz free parameter values CAS - lateral position ... 204
Table 36: 136-Hz free parameter values $P_c$ - STCC ... 205
Table 37: 136-Hz free parameter values CAS - STCC ... 206

Chapter 6
Table 38: Minimum level fluctuation set ... 213
Table 39: Minimum phase fluctuation set ... 214
Table 40: Effectiveness of preprocessing ... 218

LIST OF FIGURES

Introduction
Figure 1: Jeffress coincidence matrix ... 6
Figure 2: Incoherence perception ... 7

Chapter 1
Figure 3: Frequency shaping ... 15
Figure 4: Temporal shaping ... 17
Figure 5: 14-Hz interaural parameters vs. time ... 19
Figure 6: 14-Hz $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$ ... 21
Figure 7: Three-interval sequence ... 23
Figure 8: Response box ... 24
Figure 9: 14 Hz, $P_c$, Phase set ... 27
Figure 10: 14 Hz, CAS, Phase set ... 28
Figure 11: 14 Hz, $P_c$, Level set ... 29
Figure 12: 14 Hz, CAS, Level set ... 30
Figure 13: Signals ... 34
Figure 14: 108-Hz interaural parameters vs. time ... 36
Figure 15: 108-Hz $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$ ... 37
Figure 16: 108 Hz, $P_c$, Phase set ... 38
Figure 17: 108 Hz, CAS, Phase set ... 39
Figure 18: 108 Hz, $P_c$, Level set ... 40
Figure 19: 108 Hz, CAS, Level set ... 41
Figure 20: 2394-Hz interaural parameters vs. time ... 44
Figure 21: 2394-Hz $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$ ... 45
Figure 22: 2394 Hz, $P_c$, Phase set ... 46
Figure 23: 2394 Hz, CAS, Phase set ... 47
Figure 24: 2394 Hz, $P_c$, Level set ... 48
Figure 25: 2394 Hz, CAS, Level set ... 49
Figure 26: Matched $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$ ... 53
Figure 27: $P_c$, Matched phase set ... 56
Figure 28: CAS, Matched phase set ... 57
Figure 29: $P_c$, Matched level set ... 58
Figure 30: CAS, Matched level set ... 59
Figure 31: $P_c$ vs. $P_s$ for phase set ... 64
Figure 32: $P_c$ vs. $P_s$ for level set ... 65

Chapter 2
Figure 33: 14-Hz coherence histogram ... 78
Figure 34: 14-Hz $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$ ... 79
Figure 35: 136-Hz coherence histogram ... 82
Figure 36: 136-Hz interaural parameters vs. time ... 82
Figure 37: 136-Hz $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$ ... 84
Figure 38: 136-Hz $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$ - collection 1 ... 85
Figure 39: 136-Hz $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$ - collection 2 ... 85
Figure 40: 136-Hz interaural parameters vs. time - temporal averaging ... 90
Figure 41: 136-Hz interaural parameters vs. time - TI trading ... 93
Figure 42: 636-Hz $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$ ... 96
Figure 43: Matched $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$ ... 99
Figure 44: $P_c$, Matched phase set ... 101
Figure 45: CAS, Matched phase set ... 102
Figure 46: $P_c$, Matched level set ... 103
Figure 47: CAS, Matched level set ... 104
Figure 48: $P_c$, Contextual effects phase set ... 111
Figure 49: CAS, Contextual effects phase set ... 112
Figure 50: $P_c$, Contextual effects level set ... 113
Figure 51: CAS, Contextual effects level set ... 114

Chapter 3
Figure 52: Spectral splatter, Phase set ... 118
Figure 53: Spectral splatter, Level set ... 119
Figure 54: $P_c$, Phase set ... 124
Figure 55: CAS, Phase set ... 125
Figure 56: $P_c$, Level set ... 126
Figure 57: CAS, Level set ... 127
Figure 58: 25-ms interaural parameters vs. time ... 130

Chapter 4
Figure 59: 14 Hz, CAS and Nc scores ... 135
Figure 60: 14 Hz, Model results, $P_c$ ... 147
Figure 61: 14 Hz, Model results, CAS ... 148
Figure 62: 14 Hz, Model comparison, $P_c$ ... 149
Figure 63: 14 Hz, Model comparison, CAS ... 150
Figure 64: 14 Hz, Model 1 free parameters ... 155
Figure 65: 108 Hz, CAS and Nc scores ... 159
Figure 66: 108 Hz, Model results, $P_c$ ... 160
Figure 67: 108 Hz, Model results, CAS ... 161
Figure 68: 108 Hz, Model comparison, $P_c$ ... 162
Figure 69: 108 Hz, Model comparison, CAS ... 163
Figure 70: 108 Hz, Model 1 free parameters ... 168
Figure 71: 108 Hz, Model 4 free parameters ... 169

Chapter 5
Figure 72: 14 Hz, CAS and Nc scores ... 178
Figure 73: 14 Hz, Model results, $P_c$ ... 180
Figure 74: 14 Hz, Model results, CAS ... 181
Figure 75: 14 Hz, Model comparison, $P_c$ ... 182
Figure 76: 14 Hz, Model comparison, CAS ... 183
Figure 77: 14 Hz, Model 1 free parameters ... 188
Figure 78: 136-Hz $s_t[\Delta\Phi]$ vs. $s_t[\Delta L]$, 60 noise-pairs ... 192
Figure 79: 136 Hz, CAS and Nc scores ... 193
Figure 80: 136 Hz, Model results, $P_c$ ... 195
Figure 81: 136 Hz, Model results, CAS ... 196
Figure 82: 136 Hz, Model comparison, $P_c$ ... 197
Figure 83: 136 Hz, Model comparison, CAS ... 198
Figure 84: 136 Hz, Model 1 free parameters ... 201
Figure 85: 136 Hz, Model 4 free parameters ... 202

Chapter 6
Figure 86: $s_t[\Psi_{\Delta\Phi}]$ vs. $s_t[\Psi_{\Delta L}]$ ... 212
Figure 87: Detection data for Listener DA ... 215
Figure 88: Detection data for Listener E ... 216
Figure 89: Detection data for Listener M ... 217
Figure 90: Example of Addition and Cancellation Noises ... 221
Figure 91: Addition and Cancellation Set Data ... 222

KEY TO SYMBOLS

Introduction
$\gamma$ : Interaural cross-correlation function
$\delta$ : Lag of the cross-correlation function
$x_L$ : Signal to left ear
$x_R$ : Signal to right ear
$T$ : Duration of signal or period

Chapter 1
Experiment 1
$x_A$ : Randomly generated noise A
$x_B$ : Randomly generated noise B
$N$ : Number of components
$C_n$ : Amplitude of Fourier component
$\omega_n$ : Angular frequency of Fourier component
$\phi_n$ : Phase of Fourier component
$f_n$ : Frequency of Fourier component
$x'_B$ : The new noise B after orthogonalization to noise A
$M$ : Number of samples in a channel
$A_{rms}$ : Root-mean-square value of noise A
$B_{rms}$ : Root-mean-square value of noise B
$\rho_{AB}$ : Correlation between noise A and B
$\alpha$ : Mixing factor
$\rho$ : Coherence evaluated at time zero
$s(t)$ : Temporal window
$\tilde{x}$ : Analytic signal
$\Re$ : Real part of analytic signal
$\Im$ : Imaginary part of analytic signal
$\phi(t)$ : Phase of analytic signal
$E(t)$ : Envelope of analytic signal
$\Delta\Phi(t)$ : Interaural phase difference
$\Delta L(t)$ : Interaural level difference
$s_t[\Delta\Phi]$ : Standard deviation over time of the interaural phase difference
$s_t[\Delta L]$ : Standard deviation over time of the interaural level difference
$\overline{\Delta\Phi}$ : Mean interaural phase difference, averaged over time
$\overline{\Delta L}$ : Mean interaural level difference, averaged over time
Experiment 5
$s_t[E]$ : Standard deviation over time of envelope
$\overline{E}$ : Mean of envelope, averaged over time

Chapter 2
Experiment 7
é[f(t)] : Exponential averaging function
$\tau$ : Exponential averaging time constant
$T_D$ : Duration of exponential averaging window
$z(t)$ : Auditory image position variable
$c$ : Time/intensity trading ratio
$\rho_{MR}$ : Correlation of detection data between matched and random noise-pairs

Chapter 4
Experiment 13
$\Psi^c_{\Delta\Phi}$ : Compressed interaural phase difference
$\Psi^c_{\Delta L}$ : Compressed interaural level difference
$w_g(t)$ : Envelope weighting function
$g$ : Envelope weighting threshold
$\Psi_{\Delta\Phi}$ : Transformed interaural phase difference
$\Psi_{\Delta L}$ : Transformed interaural level difference
$a$ : Weighting parameter for independent centers models
$d_n$ : Model number
$W[h, \Psi(t)]$ : Threshold function
$b$ : Weighting parameter for lateral position models
$\Psi_z$ : Transformed lateral position variable
$\gamma_{st}(t)$ : Short-term cross-correlation
$\Psi_{CC}$ : Transformed short-term cross-correlation
$s_t[\Psi']$ : Standard deviation over time of the derivative of the transformed interaural difference
$s_t[E']$ : Standard deviation over time of the derivative of the envelope

Appendix A
$N_{corr}$ : Number of correct responses
$N_{conf}$ : Number of confident responses

KEY TO ABBREVIATIONS

2AFC : Two-alternative forced-choice
ASW : Apparent source width
AVCN : Anteroventral cochlear nucleus
CAS : Confidence adjusted score
EC : Equalization-cancellation
EE : Excitatory-excitatory
EI : Excitatory-inhibitory
DCN : Dorsal cochlear nucleus
IACC : Interaural cross-correlation
IC : Inferior colliculus
ILD : Interaural level difference
IPD : Interaural phase difference
ITD : Interaural time difference
jnd : Just-noticeable difference
LEV : Listener envelopment
LL : Lateral lemniscus
LP : Lateral position units
LSO : Lateral superior olive
No : Noise in-phase
NoSπ : Noise in-phase, signal out-of-phase
Nπ : Noise out-of-phase
MLD : Masking level difference
MGB : Medial geniculate body
MSO : Medial superior olive
MTB : Medial nucleus of the trapezoidal body
NLL : Nucleus of the lateral lemniscus
$P_c$ : Percent correct
$P_s$ : Percent selected
pdf : Probability density function
PVCN : Posteroventral cochlear nucleus
RMS : Root-mean-square
STCC : Short-term cross-correlation
YN : Yes-no

INTRODUCTION

0.1 Road map to this work

This dissertation studies the ability of human listeners to detect interaural incoherence, that is, a difference between the signals presented to the left and right ears. Listeners are remarkably good at distinguishing slightly incoherent noise from coherent noise, in which the signals in the two ears are identical.
The cue used to detect interaural incoherence is thought to be much like the cue used to detect masking-level differences, where a tone that is out of phase between the two ears is more easily detected than an in-phase tone in otherwise coherent noise. The reason is that the out-of-phase tone adds a small amount of incoherence to the coherent noise. There has been a vast amount of research on masking-level differences in the last 60 years.

Incoherence detection is often thought to be described wholly by the coherence function, which is an overlap integral of the left and right signals over the duration of the stimulus. If this is the case, dynamic features in the incoherent noise will be lost. However, the perception of incoherent noise has dynamic characteristics. Therefore, it seems appropriate to ask the following two questions: 1) Is the cross-correlation or coherence function an adequate parameter for incoherence detection? 2) If not, then what parameter is important to incoherence detection?

These questions will be addressed by selecting noise-pairs that explore the effect of bandwidth on incoherence detection. Chapter 1 will use noise-pairs with a fixed value of coherence. The results of this chapter will show that the noise-pairs with the largest standard deviation over time (fluctuations) of the interaural phase difference and interaural level difference have incoherence that is much more readily detectable than noise-pairs with the smallest fluctuations. Since the coherence function predicts that incoherence detection should be the same for all of these noises, it will be concluded that the coherence function is not used in incoherence detection. As the bandwidth increases, the rate of fluctuations also increases, and the difference between noise-pairs with the largest and the smallest fluctuations in the interaural parameters is less perceptible to listeners.
In the case of wide bands, it would be possible to use the coherence function to describe the detection data, but only because of the ergodicity of the fluctuations for wide bands.

Chapter 2 will use noise-pairs with varied values of coherence but similar distributions of fluctuations in phase and level differences. The values of coherence ranged from 0.969 to 0.998. For the average listener, incoherence would be very easy to detect in a noise-pair with a coherence of 0.969 and almost impossible to detect in a noise-pair with a coherence of 0.998. There are no new results in Chapter 2; its value is that the results of Chapter 1 are reproduced using only the fluctuations in phase and level.

The binaural integration time is tens to hundreds of milliseconds. Therefore, there may be a fundamental flaw in the results of Chapters 1 and 2, which used 500-ms noises: it may be that incoherence detection is done in small analysis windows much shorter than 500 ms. Chapter 3 will address this criticism by using the same method of selecting noise-pairs with large and small fluctuations as in Chapters 1 and 2. However, the noises will now be five to twenty times shorter in duration, and only the narrow bandwidth will be used. The results of this chapter will show that listeners are able to use interaural fluctuations of phase and level for stimuli as short as 50 ms, with results comparable to those in Chapters 1 and 2. For 25-ms noise-pairs, listeners will use a lateralization cue instead of a width cue to perform the task. Coherent noise is normally described as a compact sound image in the center of the head. Incoherent noise is normally described as a wide or diffuse sound image in the center of the head.
A lateralization cue, for this experiment, would be a compact sound that has moved from the center of the head, usually associated with a static interaural phase difference or static interaural level difference. It was necessary for listeners to use lateralization because it was found that the interaural differences did not change much over the 25-ms duration of the noises.

After these preliminary experiments, the results from Chapters 1 to 3 will motivate the direction of the modeling of the incoherence detection data. Chapter 4 will concentrate on developing and refining models that describe incoherence detection from a reference correlation of unity. The models will use different combinations of phase and level statistics, and several aspects of the models will be similar to those used in describing masking-level difference data, since so much work has been done in this area. As was done in Chapter 1, Chapter 4 will use noise-pairs with a fixed value of coherence.

Three types of models will be explored. The first type of model will add fluctuations of phase and fluctuations of level independently. The second type of model will combine phase and level differences and then calculate a statistic for fluctuations. In this model type, it would be possible for a phase difference to the right ear to be cancelled by a level difference to the left ear. This cancellation has been shown to occur in time-intensity trading experiments with static interaural differences. The last type of model reduces the coherence function to a short-term coherence function, which evaluates the coherence over very small time intervals. It will be shown that the incoherence detection data can be best described by a model that independently adds the standard deviation of phase to the standard deviation of level, with equal weight for phase and level. Compression of the interaural difference scale, temporal averaging, and envelope weighting will also be applied to improve the model.
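The quantities named in this road map can be made concrete with a short numerical sketch. The Python code below is an illustration only, written under stated assumptions, and is not the procedure used in the later chapters. It builds one narrow-band noise-pair with a zero-lag coherence of 0.9922 by mixing a noise with a second noise made orthogonal to it, computes the standard deviation over time of the interaural phase and level differences from the analytic signals, and forms a weighted sum of the two in the spirit of the first model type. The sampling rate, the component spacing, the Rayleigh-distributed amplitudes, the absence of a temporal window, and every function name are assumptions made for the example. The weighted sum also assumes that the two statistics have already been placed on comparable scales.

```python
import numpy as np
from scipy.signal import hilbert


def narrowband_noise(rng, fs=8000, dur=0.5, fc=500.0, bw=14.0):
    """Sum of equally spaced sinusoids with random amplitudes and phases.

    Assumptions: component spacing of 1/dur, Rayleigh amplitudes, no window.
    """
    t = np.arange(int(round(fs * dur))) / fs
    n_comp = int(round(bw * dur)) + 1
    freqs = np.linspace(fc - bw / 2, fc + bw / 2, n_comp)
    amps = rng.rayleigh(size=n_comp)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=n_comp)
    arg = 2.0 * np.pi * freqs[:, None] * t[None, :] + phases[:, None]
    return np.sum(amps[:, None] * np.cos(arg), axis=0)


def noise_pair(rng, rho=0.9922, **kwargs):
    """Left/right noises whose zero-lag coherence equals rho."""
    x_a = narrowband_noise(rng, **kwargs)
    x_b = narrowband_noise(rng, **kwargs)
    # Remove from noise B its projection onto noise A (orthogonalization).
    x_b = x_b - (x_b @ x_a) / (x_a @ x_a) * x_a
    # Give the orthogonalized noise the same RMS value as noise A.
    x_b = x_b * np.sqrt((x_a @ x_a) / (x_b @ x_b))
    return x_a, rho * x_a + np.sqrt(1.0 - rho**2) * x_b


def zero_lag_coherence(left, right):
    """Normalized overlap of the two signals at zero lag."""
    return float((left @ right) / np.sqrt((left @ left) * (right @ right)))


def interaural_fluctuations(left, right):
    """Standard deviation over time of the IPD (degrees) and ILD (dB)."""
    a_left, a_right = hilbert(left), hilbert(right)
    # Phase difference wrapped to (-180, 180] degrees.
    ipd = np.degrees(np.angle(a_right * np.conj(a_left)))
    tiny = np.finfo(float).tiny  # guards against log of zero envelope
    ild = 20.0 * np.log10((np.abs(a_right) + tiny) / (np.abs(a_left) + tiny))
    return float(np.std(ipd)), float(np.std(ild))


def independent_centers(st_phase, st_level, a=0.5):
    """Hypothetical weighted sum; a = 0.5 weights phase and level equally."""
    return a * st_phase + (1.0 - a) * st_level
```

Different random seeds give noise-pairs with the same zero-lag coherence but different values of the two standard deviations, which is the property that the fixed-coherence experiments exploit. For a diotic pair (the same noise in both channels) both standard deviations vanish.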
As was done in Chapter 2, Chapter 5 will use noise-pairs with varied values of coherence. Again, there are no new results in Chapter 5; its value is that the results of Chapter 4 are reproduced.

Lastly, Chapter 6 will apply the modeling results of Chapters 4 and 5 to directly answer these two questions: 1) Can listeners use just phase fluctuations or just level fluctuations to detect incoherence? and 2) Do listeners use independent IPDs and ILDs as a function of time and not a single auditory image made of the combination of the two interaural parameters? It will be found that the answer to both of these questions is "yes". However, not all of the variation of the data will be explained, motivating more studies.

From the experiments in these six chapters of research, it will be concluded that incoherence detection is done by detecting fluctuations in the interaural phase and level differences, contrary to other recent results. Before this dissertation, van de Par and Kohlrausch (1999) found no use for interaural parameters in incoherence detection. The reason for the discrepancy is that van de Par and Kohlrausch did not find a bandwidth dependence for interaural parameters, whereas one was found in this dissertation. Another reason that the coherence function has been used to describe incoherence detection is possibly the stimuli that have been used in previous experiments. Critical-bandwidth noises of about 100 Hz (for a center frequency of 500 Hz) and wide-bandwidth noises of thousands of Hz have been used for experiments where individual features in the interaural parameters are shown to be too fast to recognize consistently.

Two types of models that are often used to describe masking-level difference data are the equalization-cancellation model and the correlation model. However, from the results that will appear in the following chapters, both of these models should only be applied to stimuli that have wide bandwidths and similar fluctuations.
This is because these models are energy-based models that have no mechanism to analyze interaural parameter differences over short intervals.

The results of this dissertation have the potential to motivate several new studies in the area of incoherence detection. For example:

• More research will need to be done to understand why listeners confuse monaural and interaural fluctuations. The monaural confusion results can then be applied to false-alarm rates as measured in single-interval masking-level difference experiments in reproducible noises (Gilkey et al., 1985, 1986; Isabelle and Colburn, 1991; Evilsizer et al., 2002).

• It will be important to study the effect of varying the center frequency of the incoherent noises and apply the results to high-frequency masking-level difference data (Bernstein and Trahiotis, 1992).

• It will be important to vary the value of the interaural coherence to change the distribution of fluctuations of phase and level. The detection of incoherence might be different enough that the results in this dissertation do not apply to other values of coherence.

• It will be important to make a more sophisticated model to describe incoherence detection data. This can be done by modeling brainstem nuclei firing patterns in the superior olivary complex.

0.2 Coherence

Interaural coherence is a measure of the similarity of signals between a listener's two ears. It is derived from the interaural cross-correlation function, $\gamma(\delta)$, which is a function of the interaural time shift $\delta$,

\[ \gamma(\delta) = \frac{\int_0^T x_L(t)\, x_R(t+\delta)\, dt}{\sqrt{\int_0^T x_L^2(t)\, dt \int_0^T x_R^2(t)\, dt}}, \tag{1} \]

where $x_R$ is the signal in the right ear, and $x_L$ is the signal in the left. The cross-correlation is bounded by $-1 \le \gamma \le 1$.

With respect to perception, interest normally centers on the peak of $\gamma(\delta)$. The value of $\delta$ for which the peak occurs is the perceptually relevant interaural time difference (ITD) cue for the location of the sound image. This value of $\delta$ was given a place representation in the famous binaural model by Jeffress (1948). Figure 1 shows how the Jeffress coincidence model is composed of different bands with a certain characteristic frequency. Different cells fire depending on the ITD and frequency of the signals presented to the ears.

Figure 1: The Jeffress coincidence matrix. The characteristic frequency of the bands (horizontal lines) increases logarithmically. The vertical arrows represent fibers attached to neurons (black dots) that fire when signals from both the left and right ears temporally coincide for different values of the interaural lag $\delta$. (From Hartmann, 1999)

The height of the peak is thought to determine the compactness of the image. If the sounds in the two ears are identical except for an interaural delay or interaural level difference, then $\gamma$ has its maximum value of 1, and the image is expected to be maximally compact. If the height of the peak is less than 1, the image is broader or more diffuse (Barron, 1983; Blauert and Lindemann, 1986). Figure 2 shows a cartoon of the perception of incoherence.

Figure 2: An artist's rendition of incoherence perception. The left figure shows a compact, coherent stimulus between the listener's ears. The right figure shows how the image is broader and more diffuse for an incoherent stimulus.

The height of the cross-correlation peak is defined as the coherence. It is postulated that to be physiologically relevant, the peak must occur at a value of the interaural lag that lies in a range of perceptible ITDs (±763 μs for the average human head, which is determined by dividing the width of the head by the speed of sound), and this requirement may place restrictions on the peaks for which the concept of coherence is valid.

As applied in architectural acoustics, the coherence is called the IACC (interaural cross-correlation), and it refers to the height of $\gamma(\delta)$ for $\delta$ in the range $-1 \le \delta \le 1$ ms (Beranek, 2004).
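As a rough illustration of how the peak height of Eq. 1 can be evaluated for sampled signals, the following Python sketch computes a discrete normalized cross-correlation over lags of ±1 ms and returns its maximum together with the lag at which it occurs. The function names, the use of NumPy, the restriction to integer-sample lags, and the truncation of the sum to the overlapping samples are all assumptions made here for illustration; they are not taken from the measurement procedures cited above.

```python
import numpy as np


def cross_correlation(x_left, x_right, lag):
    """Discrete version of Eq. 1 at one integer lag (in samples).

    Only the samples where both shifted signals overlap enter the
    numerator; the denominator uses the full-signal energies.
    """
    n = len(x_left)
    if lag >= 0:
        num = np.dot(x_left[:n - lag], x_right[lag:])
    else:
        num = np.dot(x_left[-lag:], x_right[:n + lag])
    den = np.sqrt(np.dot(x_left, x_left) * np.dot(x_right, x_right))
    return num / den


def coherence(x_left, x_right, fs, max_lag_s=1e-3):
    """Peak of the cross-correlation for lags within +/- max_lag_s seconds.

    Returns (peak height, lag of the peak in seconds).
    """
    max_lag = int(round(max_lag_s * fs))
    lags = np.arange(-max_lag, max_lag + 1)
    gammas = np.array([cross_correlation(x_left, x_right, k) for k in lags])
    best = np.argmax(gammas)
    return gammas[best], lags[best] / fs
```

With this sign convention, a right-ear signal that is a delayed copy of the left-ear signal produces a peak at a positive lag.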
In recent years this objective architectural measure has been divided into two separate measures. One is the apparent source width (ASW), based on coherence within 80 ms of the onset of a sound (Barron and Marshall, 1981). The other is listener envelopment (LEV), determined by the coherence of later arriving sound, as measured after 80 ms (Bradley and Soulodre, 1995; Barron, 2004). Together, the ASW and LEV greatly influence the spatial impression of a sound in a room. Normally, the architectural measurements are made with microphone techniques and not with artificial heads.

Perceptual aspects of the cross-correlation function and interaural coherence have been studied by psychoacousticians, usually using bands of noise as stimuli. Noise provides an abstraction of real-world sounds that is devoid of meaningful information, and it affords many opportunities for parametric variations. Using broadband noise, Pollack and Trittipoe (1959a,b) found thresholds for changes in cross-correlation for two values of the reference correlation, namely 1.0 (i.e. $N_0$) and -1.0 (i.e. $N_\pi$). They explored the effects of duration, sound level, frequency range, and interaural level difference.

Listeners are particularly sensitive to deviations from a reference correlation of 1.0. Using narrowband noise, Gabriel and Colburn (1981) found that listeners could easily distinguish between noise with a coherence of 1.0 and noise with a coherence of 0.99. They also reported the somewhat counterintuitive result that as the bandwidth of the noise increases, the just-noticeable difference (jnd) also increases. One might have expected the jnd to decrease instead, given that a wider bandwidth generally offers the listener more information.

For a reference coherence less than unity, a listener's ability to discriminate a difference in incoherence degrades appreciably. This decrease is approximately an order of magnitude for a reference coherence of 0.0 (Gabriel and Colburn, 1981).
It is possible for the signals to the two ears to become anti-correlated (when the signal in the left ear is equal to the signal in the right ear except for a difference in the sign). An anti-correlated signal places an image at both of the listener's ears, rather than at the center of the head. Boehnke et al. (2002) found that discrimination from a reference coherence of -1.0 was significantly worse than from a reference of +1.0. It was also found that discrimination from a reference coherence of 0.0 was worse for partially anticorrelated noise when compared to partially correlated noise.

A reference coherence of 1.0 is also of interest in connection with the masking-level difference (MLD). The MLD phenomenon is the difference that occurs in the threshold to detect a tone in noise depending on the binaural masker and signal phase configuration. Numerous studies have concluded that the threshold signal-to-noise ratio in the $N_0S_\pi$ (noise in-phase, signal out-of-phase by 180°) condition is essentially determined by the ability to detect the incoherence introduced by the out-of-phase signal (Wilbanks and Whitmore, 1967; Koehnke et al., 1986; Durlach et al., 1986; Bernstein and Trahiotis, 1992; van de Par and Kohlrausch, 1995).

The concept of coherence may be important to speech recognition, as discussed by Culling et al. (2001). The line of reasoning is as follows: Jain et al. (1991) showed that the reduction in the interaural cross-correlation function of the signal is monotonically related to the strength of the signal for signal intensities less than that of the masker. A crucial factor in identifying several speech sounds is an accurate estimation of the frequency of the first formant. It is thought that the accuracy of the estimation process is determined by relative intensities of peripherally-resolved harmonics in that frequency region.
If this is the case, then the ability to discriminate different degrees of decorrelation might explain the binaural intelligibility level difference (Licklider, 1948; Carhart et al., 1969a,b; Levitt and Rabiner, 1967a,b; Bronkhorst and Plomp, 1988). The binaural intelligibility level difference is the difference between using monaural and binaural information when trying to understand speech in a noise background.

0.3 Fluctuations of interaural parameters

Despite the prevalent use of the cross-correlation function to describe incoherence detection, time-dependent fluctuations seem to play a part in the perception of incoherent stimuli. This perception, which occurs when incoherent stimuli are presented with headphones, is usually described as including an auditory width or lateral displacement that varies in time. Given that the form of Eq. 1 is a "time-averaging" function that integrates over the entire duration of the stimulus, as opposed to a time-varying function, there appears to be a disconnection between the perception of incoherence and the measure of it via Eq. 1.

From the extensive work on modeling incoherence detection and the present knowledge of auditory physiology, it seems possible that the interaural parameters are more physically relevant than incoherence per se to the incoherence detection task. Fluctuations in interaural phase differences (IPD) and interaural level differences (ILD) have been studied for a few conditions. For example, probability distributions of IPDs and ILDs have been calculated by Zurek (1991) and Breebaart et al. (1999); they have been measured in rooms by Nix and Hohmann (2001).

0.4 Physiology

The pathway by which physical signals are transformed into neural signals starts with the acoustic wave impinging on the tympanic membrane (ear drum). The wave is then transferred via the ossicles (incus, malleus, and stapes) to the cochlea.
The ossicles act as an impedance matching device; if the sound met the oval window directly, only 10% (or less) of the acoustic wave energy would be transferred to the inner ear or cochlea for a 1 kHz tone. Instead, 65% of the acoustic wave energy is transferred (Moller, 1965).

The cochlea is filled with fluid; the acoustic wave, via the middle-ear ossicles, causes pressure waves in the fluid, which, in turn, cause the basilar membrane to move. Attached to the basilar membrane are hair cells, which depolarize under mechanical stress and stimulate the generation of action potentials by the auditory nerve. Measurements of vibration patterns of the basilar membrane show that it is tonotopically organized, that is, organized by increasing frequency. All of the auditory system is tonotopically organized.

From the cochlea, information travels on the auditory nerve to the cochlear nucleus. The cochlear nucleus is subdivided into three sections: the anteroventral nucleus (AVCN), the posteroventral nucleus (PVCN), and the dorsal nucleus (DCN). The AVCN projects to both the ipsilateral and contralateral superior olivary complexes and therefore is important in providing binaural information such as ITDs and ILDs, which are used in sound localization. The DCN projects only contralaterally and therefore is important to monaural acoustic information such as level discrimination. The DCN processes the signals with non-monotonic firing rate-intensity functions and inhibitory sidebands. The PVCN is thought to perform both monaural and binaural processing because it has characteristics of both the AVCN and DCN. Unlike the DCN, the AVCN does little processing of the signals and projects to the superior olivary complex with myelinated fibers that keep the timing information intact.

It is well known that the superior olivary complex processes time and level differences between the ears in mammals.
Specifically, Goldberg and Brown (1968, 1969) found that the medial superior olivary nucleus (MSO) in dogs analyzes time differences. This is done with excitatory-excitatory (EE) cells that fire when impulses are received at the same time. Tsuchitani and Boudreau (1966) and Tsuchitani (1977) found that the lateral superior olivary nucleus (LSO) in cats analyzes level differences. The LSO contains mostly excitatory-inhibitory (EI) cells.

Yin and Chan (1990) and Smith, Joris, and Yin (1993) showed that it is plausible for the MSO to act as the Jeffress coincidence matrix seen in Figure 1. They discovered delay lines from the cochlear nucleus to the EE cells in the MSO of cats. However, new work on the necessity of including EI cells in the MSO to accurately measure time differences has recently presented some problems for the Jeffress model (Brand et al., 2001). Measurements by Goldberg and Brown (1968) showed that the MSO contains approximately 75% EE cells and 25% EI cells. However, the results of Brand et al. (2001) imply that a majority of MSO cells need inhibitory inputs. In either case, an additional inhibitory component to the Jeffress model appears both physiologically possible and pertinent.

Beyond the superior olivary complex, neurons project to the nucleus of the lateral lemniscus (NLL), then the inferior colliculus (IC) where auditory maps of space are made (Knudsen and Konishi, 1979), then the medial geniculate body (MGB), and finally the auditory cortex. The processing of stimuli becomes increasingly complex and is still being researched in these areas. Given that interaural parameters are processed in the midbrain and that detection of incoherence is perceived as time-varying fluctuations, investigating the tie between incoherent stimuli and interaural differences is both psychologically and physiologically motivated.
0.5 Purpose of this work

This dissertation is concerned with incoherence detection starting with perfectly coherent noise as a reference. Its working hypothesis is based on the suspicion that the extreme sensitivity shown by listeners to small amounts of incoherence is not properly described by the cross-correlation function. The reason is that when a small amount of incoherence is added to an otherwise perfectly coherent noise, the image of the noise acquires lateral fluctuations that are not present when the coherence is 1.0. The hypothesis continues with the observation that, whereas coherence is a measure that is averaged over time, the fluctuations that are imagined to be the basis of coherence discrimination are dynamic. Thus, although the coherence measure may be a mathematically useful characterization of the similarity or dissimilarity of signals in the two ears, this measure may not be the most perceptually relevant characterization. Instead, it is possible that some measure that specifically considers fluctuations is better. A similar point of view with respect to the MLD has been taken by a number of authors, e.g. Jeffress et al. (1956). The rest of this dissertation describes experiments that test this hypothesis.

1 INCOHERENCE AND FLUCTUATIONS OF INTERAURAL PARAMETERS

1.1 EXPERIMENT 1: NARROW BANDWIDTH

The purpose of Experiment 1 was to determine whether the height of the peak of the cross-correlation function (the coherence) adequately describes incoherence detection given a reference coherence of 1.0. The experiments employed reproducible noises (to be called left-right noise-pairs, or noise-pairs), as have been used in MLD experiments by Gilkey et al. (1985, 1986), Isabelle and Colburn (1991), and Evilsizer et al. (2002). An advantage of using reproducible noises instead of randomly generated noises for incoherence detection is that reproducible-noise data should provide a more stringent and confining test of the binaural models.
This is because the individual properties of the noise-pairs will be accessible to the experimenter after the experiment is performed. An advantage of using incoherence detection data instead of MLD data is that incoherence detection employs a simpler stimulus, which is spectrally homogeneous unlike a tone-in-noise situation. Certain problems may arise for MLD tasks that would not during incoherence detection; Wightman (1971) noted inconsistencies in narrowband MLD results that may have been due to off-frequency listening.

In Experiment 1 and all the other experiments in this chapter, the different reproducible left-right noise-pairs had the same value of interaural coherence. If the coherence is an adequate measure of perception, then all the reproducible noise-pairs will be equally distinguishable from perfectly coherent noise.

1.1.1 Stimuli

A collection of 100 two-channel noises with reproducible amplitudes and phases was created for Experiment 1. The process began with two waveforms, A and B, written as a sum of cosines in the form

\[ x_A(t) = \sum_{n=1}^{N} C_n^A \cos(\omega_n t + \phi_n^A) \]

and

\[ x_B(t) = \sum_{n=1}^{N} C_n^B \cos(\omega_n t + \phi_n^B), \]

where the $C_n$'s and $\phi_n$'s are the amplitudes and phases of the spectral components. The narrowband noises were generated with components having random phases over a frequency range of 490 to 510 Hz and with a frequency spacing of 2 Hz. Components between 495 and 505 Hz had equal amplitudes of unity. Components below 495 Hz had a raised-cosine window applied to the amplitude spectrum of the form

\[ C_n^A = C_n^B = \sin^2\!\left[\frac{\pi (f_n - 490)}{10}\right] \quad \text{for } 490 \le f_n \le 495 \tag{2} \]

to minimize any spectral edge effects in the noises. The amplitudes of components from 505 to 510 Hz were similarly windowed. This frequency shaping can be seen in Figure 3. The 3-dB bandwidth of the noise was therefore 14 Hz.
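The sum-of-cosines construction described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the original stimulus-generation program: the sample rate of 8 kHz and the 4000-sample length are taken to match a 500-ms noise, the raised-cosine skirt of Eq. 2 is written in the assumed form $\sin^2[\pi(f_n - 490)/10]$ (mirrored on the upper edge), and the names `component_amplitudes` and `make_noise` are hypothetical.

```python
import numpy as np

FS = 8000        # assumed sample rate in Hz
DURATION = 0.5   # seconds, giving 4000 samples


def component_amplitudes(freqs):
    """Unit amplitudes from 495 to 505 Hz, raised-cosine skirts outside.

    The sin^2 form of the skirt is an assumption about Eq. 2; it rises
    from 0 at 490 Hz to 1 at 495 Hz and is mirrored for 505 to 510 Hz.
    """
    amps = np.ones_like(freqs, dtype=float)
    low = freqs < 495
    high = freqs > 505
    amps[low] = np.sin(np.pi * (freqs[low] - 490) / 10) ** 2
    amps[high] = np.sin(np.pi * (510 - freqs[high]) / 10) ** 2
    return amps


def make_noise(rng, fs=FS, duration=DURATION):
    """One narrowband noise as a sum of cosines with random phases."""
    freqs = np.arange(490, 511, 2, dtype=float)  # 490, 492, ..., 510 Hz
    amps = component_amplitudes(freqs)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)
    t = np.arange(int(round(fs * duration))) / fs
    # Rows are components, columns are time samples; sum over components.
    parts = amps[:, None] * np.cos(
        2.0 * np.pi * freqs[:, None] * t + phases[:, None]
    )
    return parts.sum(axis=0), freqs, amps, phases
```

Calling `make_noise` twice with the same seeded generator, for example `np.random.default_rng(0)`, yields two independent waveforms that can play the roles of $x_A$ and $x_B$ while remaining reproducible from the seed.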
For each noise in a collection of 100 reproducible noise-pairs, the B noise ($x_B$) was orthogonalized to the A noise ($x_A$) by the Gram-Schmidt orthogonalization procedure, which was used in Culling et al. (2001), and which is described below. The orthogonalized B noise is here denoted as $x'_B$. This was done to ensure that the signals were uncorrelated and that the final value of the cross-correlation after mixing would be precise. For large bandwidths, the large number of random variables (the $\phi_n$'s are uniformly distributed) yields statistically independent noise-pairs. In the case of large noise bands, orthogonalization is probably not necessary. However, there are a small number of random variables (random phases) for the 14-Hz bandwidth noises of this experiment. Experimental observations showed that the projection between randomly generated noise-pairs can be as large as 10% for this bandwidth. Therefore, it was imperative to use the orthogonalization procedure for the 14-Hz bandwidth noise-pairs in order to be able to generate noise-pairs with the desired interaural coherence.

Figure 3: The frequency shaping in Eq. 2 that is applied to all stimuli in Experiment 1. This raised-cosine window was applied to remove any spectral edge effects.

To orthogonalize $x_B$ to $x_A$, the root-mean-square (RMS) is found for each channel:

\[ A_{rms} = \sqrt{\frac{1}{M}\sum_{t=1}^{M} [x_A(t)]^2} = \sqrt{\frac{1}{M}\sum_{t=1}^{M}\left[\sum_{n=1}^{N} C_n^A \cos(\omega_n t + \phi_n^A)\right]^2} = \sqrt{\frac{1}{2}\sum_{n=1}^{N} (C_n^A)^2}, \]

and

\[ B_{rms} = \sqrt{\frac{1}{M}\sum_{t=1}^{M} [x_B(t)]^2} = \sqrt{\frac{1}{M}\sum_{t=1}^{M}\left[\sum_{n=1}^{N} C_n^B \cos(\omega_n t + \phi_n^B)\right]^2} = \sqrt{\frac{1}{2}\sum_{n=1}^{N} (C_n^B)^2}, \]

where $M$ is the number of samples in each sound. Then the overlap between the two channels is calculated:

\[ \rho_{AB} = \frac{\sum_{t=1}^{M} x_A(t)\, x_B(t)}{M A_{rms} B_{rms}} = \frac{\sum_{t=1}^{M}\sum_{n=1}^{N} C_n^A C_n^B \cos(\omega_n t + \phi_n^A)\cos(\omega_n t + \phi_n^B)}{M A_{rms} B_{rms}} = \frac{\frac{1}{2}\sum_{n=1}^{N} C_n^A C_n^B \cos(\phi_n^A - \phi_n^B)}{A_{rms} B_{rms}}. \]

Next, the correlated component of $x_A$ is subtracted from a scaled $x_B$, ensuring the two channels have equal power,

\[ x'_B(t) = \frac{A_{rms}}{B_{rms}\sqrt{1 - \rho_{AB}^2}}\, x_B(t) - \frac{\rho_{AB}}{\sqrt{1 - \rho_{AB}^2}}\, x_A(t) \quad \text{for } 1 \le t \le M. \tag{3} \]

The two perfectly uncorrelated noises were then mixed, with mixing factor $\alpha$, to create the final left and right (L and R) noise-pairs to be sent to the listeners,

\[ x_L = x_A, \qquad x_R = \sqrt{1 - \alpha^2}\, x_A + \alpha\, x'_B. \]

The coherence, as defined by Eq. 1, is then computed to yield

\[ \rho = \gamma(0) = \sqrt{1 - \alpha^2}. \tag{4} \]

For values of $\rho$ near 1, the value of interaural lag that maximizes $\rho$ is zero. Because the mixing factor used in all of the experiments was $\alpha = 0.125$, the interaural coherence of all the noise-pairs was $\rho = 0.9922$.

After mixing, each noise was given a temporal shape with a total duration of 500 ms. A temporal window, $s(t)$, was applied such that there were cosine-squared edges with rise/fall times of 30 ms and a full-on duration of 440 ms, which can be seen in Figure 4. The application of the temporal window could change the coherence of the noise; therefore, the value of the coherence was measured after the window was applied so that $\rho = 0.9922 \pm 0.0001$. Noise-pairs that did not meet this criterion were rejected. For the 100 noise-pairs accepted in this experiment, 875 were rejected. Also, note that after orthogonalization and applying the temporal window, the noise-pairs do not necessarily have equal amplitudes.

Figure 4: The temporal window, $s(t)$, that is applied to all stimuli in this chapter. All the stimuli had 30 ms rise/fall times and a full-on duration of 440 ms.

To determine the time-dependent interaural phase difference (IPD) and interaural level difference (ILD), the analytic signals were found. By eliminating the negative frequencies, the analytic signal for either $x_L$ or $x_R$ is

\[ \hat{x}(t) = s(t) \sum_{n=1}^{N} C_n \exp[i(\omega_n t + \phi_n)], \]

where $s(t)$ is the temporal window, the $C_n$'s are the left or right amplitudes, and the $\phi_n$'s are the left or right phases as required. By Euler's relation, the analytic signal becomes

\[ \hat{x}(t) = s(t) \sum_{n=1}^{N} C_n [\cos(\omega_n t + \phi_n) + i \sin(\omega_n t + \phi_n)] = \mathcal{R}(t) + i\,\mathcal{I}(t). \tag{5} \]

The phase and envelope of the analytic signal as a function of time are

\[ \Phi(t) = \arg[\mathcal{R}(t), \mathcal{I}(t)] \tag{6} \]

and

\[ E(t) = \sqrt{\mathcal{R}^2(t) + \mathcal{I}^2(t)}, \tag{7} \]

where the arg function is the arctangent with possible quadrant correction. The time-dependent IPD (radians) and ILD (dB) of the analytic signal are then defined as

\[ \Delta\Phi(t) = \Phi_R(t) - \Phi_L(t) \tag{8} \]

and

\[ \Delta L(t) = 20 \log_{10}\left[\frac{E_R(t)}{E_L(t)}\right]. \tag{9} \]

Equation 8 yields a positive value of $\Delta\Phi(t)$ for signals that lead in the right ear. Similarly, Eq. 9 gives a positive value of $\Delta L(t)$ for a signal that has a larger level in the right ear. The interaural phase $\Delta\Phi(t)$ was required to remain in the physically relevant region of $\pm\pi$ radians at every point in time, and was corrected by adding or subtracting $2\pi$ when necessary. The ITD is proportional to the IPD for narrow bandwidths. The ITD can be calculated by dividing the IPD by the angular center frequency of $2\pi \times 500$ rad/s.

1.1.2 Signal Structure Analysis

Figure 5 shows a plot of the IPD and ILD as a function of time for an arbitrarily chosen noise-pair from the collection of noise-pairs with $\rho = 0.9922$ and a bandwidth of 14 Hz. The figure shows that the IPD and ILD fluctuate as a function of time. Peaks in the IPD and ILD sometimes coincide in time and can be directed towards the same ear or opposite ears. An analysis was done to find how often the interaural differences were in the same and different lateral directions. This was done by comparing the direction of the interaural differences for each of the 4000 samples in all the 100 noise-pairs from this experiment.
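The orthogonalization and mixing steps, and the analytic-signal definitions of IPD and ILD given above, can be sketched in Python as shown below. Two points are assumptions of this sketch rather than the procedure of the dissertation: the analytic signal is obtained numerically by zeroing the negative-frequency half of an FFT (the text instead builds it directly from the known component amplitudes and phases), and no temporal window is applied. All function names are hypothetical.

```python
import numpy as np


def orthogonalize(x_a, x_b):
    """Gram-Schmidt step: make x_b orthogonal to x_a with the same RMS."""
    a_rms = np.sqrt(np.mean(x_a ** 2))
    b_rms = np.sqrt(np.mean(x_b ** 2))
    rho_ab = np.mean(x_a * x_b) / (a_rms * b_rms)  # overlap of A and B
    root = np.sqrt(1.0 - rho_ab ** 2)
    return (a_rms / (b_rms * root)) * x_b - (rho_ab / root) * x_a


def mix(x_a, x_b_orth, alpha=0.125):
    """Left/right pair whose zero-lag coherence is sqrt(1 - alpha**2)."""
    x_left = x_a
    x_right = np.sqrt(1.0 - alpha ** 2) * x_a + alpha * x_b_orth
    return x_left, x_right


def analytic_signal(x):
    """Analytic signal via FFT: keep DC and Nyquist, double positive bins."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)


def interaural_differences(x_left, x_right):
    """Time-dependent IPD (radians, wrapped to +/- pi) and ILD (dB)."""
    z_l = analytic_signal(x_left)
    z_r = analytic_signal(x_right)
    # The angle of z_r * conj(z_l) equals Phi_R - Phi_L, already wrapped.
    ipd = np.angle(z_r * np.conj(z_l))
    ild = 20.0 * np.log10(np.abs(z_r) / np.abs(z_l))
    return ipd, ild
```

Taking the angle of the product with the complex conjugate avoids a separate step of adding or subtracting $2\pi$, since the result is wrapped automatically. Positive IPD and ILD values correspond to the right ear leading and the right ear being more intense, matching the sign convention of Eqs. 8 and 9.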
The percentage of time that the interaural differences were in the same direction was calculated for the average noise-pair (averaged over 100). In addition, the individual noise-pairs with the minimum and maximum percentage of samples that had interaural differences in the same direction were found; so were the noise-pairs that had interaural differences in opposite directions. Lastly, it was found how often the IPD was large when the ILD was large, irrespective of lateral direction. Each parameter was defined to be large when it was greater than the respective RMS value of the interaural differences over time.

Figure 5: The IPD and ILD plotted as a function of time for an arbitrary 14-Hz bandwidth noise-pair. Above the plot are the fluctuation values of the IPD and ILD ($s_t[\Delta\Phi] = 11.76$ degrees, $s_t[\Delta L] = 1.87$ dB). Positive values of the IPD are defined as leading in the right ear; positive values of ILD are defined as being louder in the right ear.

Table 1: Average (over 100 noise-pairs), minimum (over 100 noise-pairs), and maximum (over 100 noise-pairs) percentage of samples where IPD and ILD are directed towards the same ear or different ears. It also shows when the IPD and ILD are large with respect to the RMS interaural difference.

                      Average   Minimum   Maximum
Same direction         46.3%     19.4%     72.9%
Opposite direction     48.8%     20.1%     74.8%
Large in both          43.0%     20.8%     82.0%

Table 1 shows that for the 100 noise-pairs in this experiment the IPD and ILD can be either in the same direction (both towards the right or both towards the left) or in the opposite direction (one to the right, one to the left). The IPD and ILD are in the same direction and in opposite directions about 50% of the time each for the average noise-pair. The totals of 46.3% and 48.8% do not add up to 100% because either the IPD or ILD is zero for 4.9% of the duration of a typical noise-pair. However, there are improbable noise-pairs where the interaural differences go in the same direction as little as 19.4% of the duration and as much as 72.9%. The percentages are similar for the interaural differences going in different directions.

Possibly the most interesting part of this analysis is whether a large peak in one interaural difference implies a large peak in the other interaural difference at the same sample number. Figure 5 shows a large peak (greater than the RMS value) in IPD going to the left ear at 225 ms and a large peak in ILD going to the right at the same time. On average, only 43.0% of the large peaks in IPD or ILD correspond to large peaks in both IPD and ILD. The most extreme stimuli have peaks that coincide as little as 20.8% of the time and as much as 82.0% of the time. Therefore, there is a great amount of variability in individual noise-pairs with respect to fluctuations of IPD and ILD depending on the phase relationship.

1.1.3 Signals

Next, a quantitative measure of interaural fluctuations is defined. The IPD fluctuation over time is defined as

\[ s_t[\Delta\Phi] = \sqrt{\frac{1}{M}\sum_{t=1}^{M}\left[\Delta\Phi(t) - \overline{\Delta\Phi}\right]^2}, \tag{10} \]

where $M$ is the number of samples in the 500 ms noise ($M = 4000$ samples), $\overline{\Delta\Phi}$ is the mean IPD computed over time, and the $t$ subscript indicates that this standard deviation value is computed over time. The ILD fluctuation over time is defined as

\[ s_t[\Delta L] = \sqrt{\frac{1}{M}\sum_{t=1}^{M}\left[\Delta L(t) - \overline{\Delta L}\right]^2}. \tag{11} \]

Figure 5 shows the IPD and ILD changing as a function of time, with values of $s_t[\Delta\Phi]$ and $s_t[\Delta L]$ above each panel. For the entire collection of 100 14-Hz bandwidth noise-pairs, Figure 6 shows $s_t[\Delta\Phi]$ plotted against $s_t[\Delta L]$. The mean, standard deviation, and correlation of $s_t[\Delta\Phi]$ and $s_t[\Delta L]$ are reported on the figure. This figure shows that $s_t[\Delta\Phi]$ and $s_t[\Delta L]$ are strongly correlated. However, there are still some improbable noise-pairs, which were predicted to occur from the signal structure analysis of the previous section. For example, noise-pair #11 has the largest IPD fluctuations but only average ILD fluctuations. (It is interesting to see that the correlation between phase fluctuations and level fluctuations across different noise-pairs is fairly strong, 0.70. However, as shown in Appendix B, a value of 0.70 is somewhat low relative to a value computed for 5000 noise-pairs and for this bandwidth.)

Figure 6: Fluctuations of IPD versus fluctuations of ILD for the collection of 100 reproducible noise-pairs having a 14-Hz bandwidth, as used in Experiment 1. Each noise-pair is labeled by a serial number indicating only the order of creation. The means, standard deviations, and IPD-ILD correlation of the distributions are reported. (Panel annotations: BW = 14 Hz; $\mu(s_t[\Delta\Phi]) = 10.50$; $\mu(s_t[\Delta L]) = 1.45$; $\sigma(s_t[\Delta\Phi]) = 6.66$; $\sigma(s_t[\Delta L]) = 0.68$; corr = 0.70. Horizontal axis: $s_t[\Delta L]$ in dB.)

Numerical studies were done to compare the equal-amplitude noise-pairs used in this experiment to Rayleigh distributed noise-pairs. It was found that there was no difference in the probability density functions of the IPD and ILD over time. It was also found that using Rayleigh distributed noise-pairs did not affect the distributions of $s_t[\Delta\Phi]$ and $s_t[\Delta L]$.

To perform Experiment 1, the five noise-pairs with the greatest fluctuations of IPD and the five noise-pairs with the smallest fluctuations of IPD were selected to form a phase set of ten reproducible noise-pairs. Similarly, the five noise-pairs with the greatest and the five with the smallest fluctuations of ILD were selected to form a level set of ten.
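The fluctuation measures of Eqs. 10 and 11, the same-direction and opposite-direction percentages of the signal structure analysis, and the selection of the five largest and five smallest fluctuations can be sketched as follows. This is an illustrative sketch only: the normalization by the number of samples $M$ (rather than $M - 1$) and the function names are assumptions, and the "large in both" statistic is omitted because its exact denominator is not spelled out in the text.

```python
import numpy as np


def fluctuation(values):
    """Standard deviation over time, normalized by the sample count."""
    values = np.asarray(values, dtype=float)
    return np.sqrt(np.mean((values - np.mean(values)) ** 2))


def direction_percentages(ipd, ild):
    """Percent of samples where IPD and ILD share or oppose in sign.

    Samples where either quantity is exactly zero count toward neither.
    """
    product = np.asarray(ipd) * np.asarray(ild)
    same = 100.0 * np.mean(product > 0)
    opposite = 100.0 * np.mean(product < 0)
    return same, opposite


def select_set(fluctuations, n_each=5):
    """Indices of the n_each smallest and n_each largest fluctuations."""
    order = np.argsort(fluctuations)
    return order[:n_each], order[-n_each:]
```

Applying `fluctuation` to the IPD and ILD traces of each of the 100 noise-pairs and then calling `select_set` once on the IPD values and once on the ILD values would give candidate phase and level sets of ten noise-pairs each.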
The convention in this dissertation is to plot detection data of phase sets with circles and level sets with boxes. The two channels of noise were computed and downloaded by a Tucker-Davis AP2 array processor (System II) and converted from a digital signal by 16-bit DACs (DDl). The buffer size was 4000 samples per channel and the sample rate was 8 ksps. The noise was lowpass filtered with a corner frequency of 4 kHz and a —115 dB / octave roll off. The noises were presented at 70 dB :l: 3 dB with levels determined by programmable attenuators (PA4) operating in parallel on the two channels. The levels of the two channels were equal and were randomized over a range of i3 dB for each of the three intervals within a trial in order to discourage the listener from trying to use overall level cues to perform the task. 1.1.4 Procedure Listeners were tested individually, seated in a double—walled sound attenuating room, and using Sennheiser HD414 headphones. Six runs were devoted to listening to a set of ten reproducible-noise pairs. A noise-pair could be presented either incoherently - the dichotic presentation of x], and 3:3 - or it could be presented coherently - the diotic presentation of xL. A run consisted of 60 trials, where each of the ten reproducible noise-pairs in a set was presented incoherently a total of six times. Thus, listeners heard an individual noise-pair incoherently a total of 36 times (six runs times six 22 Figure 7: A pictorial representation of the three-interval sequence used in this exper- iment. The first interval was always coherent, represented by the straight line on the far left. The second interval can be either incoherent or coherent. In this figure the second interval is incoherent, represented by a fuzzyness or increased width between the ears. The last interval needs to be opposite from the second interval. Therefore, for this example, it is a coherent, compact image. presentations per run). 
On each trial the listener heard a three-interval sequence, as illustrated in Figure 7. The first interval was the standard interval, which was always a coherent noise. The second interval was randomly chosen to be either incoherent or coherent. The third interval was the opposite of the second (e.g., if the second interval was coherent, the third interval was incoherent). The two coherent presentations were randomly selected from the remaining nine reproducible noises in the set except that they were required to be different from the incoherent “odd” interval and to be different from one another. The inter-interval duration was 150 ms.

Listeners were instructed to “choose the interval that was different from the other two.” Initial training used noise-pairs with a smaller value of coherence (i.e., ρ = 0.95) so that the difference between incoherent and coherent stimuli would be obvious. Feedback was given to the listeners so that the task could be properly learned. The value of coherence for the noise-pairs was increased to a final value of 0.9922 as listeners became better at identifying incoherent noise-pairs. The training was finished when listeners could identify the incoherent noise-pair approximately 75% of the time for noise-pairs with a bandwidth of 136 Hz.

Figure 8: The response box used in the experiments.

1.1.5 Data collection

Listeners used a four-button response box to make decisions, as seen in Figure 8. Four buttons were used so that the listeners could respond to the correct interval with a confidence estimate. The buttons from left to right were 2!, 2, 3, and 3!, representing confident second interval, second interval, third interval, and confident third interval, respectively. Listeners were instructed to use a confident response only if there was no uncertainty as to which interval was incoherent.
If a run included more than one incorrect confident response, the run was terminated immediately and the listener was obliged to replace this run. There was no time limit for a response. After a decision was made by the listener, the next trial began following an inter-trial duration of 900 ms.

There were several reasons to introduce the confidence measure in this experiment. The first was that for a given coherence, it proved to be noticeably easier to detect incoherence in some stimuli. Thus, extra weighting was wanted for identifying obviously incoherent sounds. The second reason was that it was necessary to use the same waveforms, with a coherence of ρ = 0.9922, for all the listeners, and it was further desired to use the same coherence for waveforms with different bandwidths. However, some listeners were better at the task than others, as shown by the percentage of correct (P_c) responses, and some bandwidths led to better performance than others. Consequently there was a need to increase the “dynamic range” of the experiment to prevent ceiling effects for the most successful listeners and easiest bandwidths.

The data collection procedure kept track of both the percentage of correct responses, which ignored the confidence estimate (e.g., responses of 2 and 2! were not treated differently), and a confidence adjusted score (CAS). The CAS is defined as the number of times the listener responded correctly plus the number of times that the listener was confident about the correct response. Since an individual noise-pair was heard 36 times, it was possible for a listener to get a score of 72 if the listener was able to respond correctly and confidently for all 36 presentations. In comparison with P_c, the use of CAS improved inter-listener correlation, moved p-values of t-tests to greater significance, and improved the agreement that was achievable by models of binaural processing. Further justification of this technique is given in Appendix A.
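The two scores defined above can be computed from a list of button presses as in the following sketch. The data layout (a list of `(incoherent_interval, button)` tuples) and the function names are hypothetical; only the scoring rule itself is taken from the text.

```python
def score_noise_pair(trials):
    """Return (percent correct, confidence adjusted score) for one noise-pair.

    trials: list of (incoherent_interval, button) tuples, where
    incoherent_interval is 2 or 3 and button is "2!", "2", "3", or "3!".
    """
    n_correct = 0
    n_confident_correct = 0
    for incoherent_interval, button in trials:
        chosen = int(button[0])           # interval the listener picked
        confident = button.endswith("!")  # "!" marks a confident response
        if chosen == incoherent_interval:
            n_correct += 1
            if confident:
                n_confident_correct += 1
    pc = 100.0 * n_correct / len(trials)
    cas = n_correct + n_confident_correct
    return pc, cas


def count_incorrect_confident(trials):
    """Number of confident responses that chose the wrong interval."""
    return sum(1 for interval, button in trials
               if button.endswith("!") and int(button[0]) != interval)


# A listener who is correct and confident on all 36 presentations.
perfect = [(2, "2!")] * 36
# A short hypothetical mix of responses.
mixed = [(2, "2!"), (3, "3"), (2, "3"), (3, "2!")]

print(score_noise_pair(perfect))  # (100.0, 72)
print(score_noise_pair(mixed))    # (50.0, 3)
```

A run would be discarded under the rule in the text when `count_incorrect_confident` applied to that run's trials exceeds one.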
1.1.6 Listeners

The experiments in this chapter employed four male listeners, D, M, P, and W. Listeners D, M, and P were between the ages of 20 and 30 and had normal hearing according to standard audiometric tests and histories. Listener W was 64 and had a mild bilateral hearing loss, but only at frequencies four octaves above those used in this experiment. Listener M is the author.

1.1.7 Results

Figure 9 shows the selected P_c values and Figure 10 shows the selected CAS values for the phase set in Experiment 1. The five smallest s_t[ΔΦ] noise-pairs are to the left of the dashed line. The five largest s_t[ΔΦ] noise-pairs are to the right of the dashed line. The dashed line thus represents a gap of 90 unused noise-pairs. All four listeners show, in general, a greater P_c and CAS for the largest s_t[ΔΦ] noise-pairs compared to the smallest.

The two-sample t-test is used to test the hypothesis that the five noise-pairs with the largest fluctuations in phase have detection scores that are indistinguishable from those of the five noise-pairs with the smallest fluctuations in phase. If there is a difference between the largest and the smallest fluctuation noise-pairs, it will be considered significant when there is at least a 95% level of confidence (the p-value is 0.05 or less). It will be considered more significant if there is at least a 98% level of confidence (the p-value is 0.02 or less). Significant differences would mean that the value of the coherence function does not describe the detection data from the phase and level sets. The t-test will be used on both the phase and the level sets that are selected in the first three chapters of this dissertation.

In a two-sample t-test, three of four differences were significant at the 0.05 level for the P_c data. All four differences were significant at the 0.02 level for the CAS data. The individual p-values are shown in Table 2.
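As an illustration of the comparison described above, the following sketch computes a two-sample t statistic for five scores against five scores. It assumes the equal-variance (pooled) form of the test, which the text does not state, and the CAS values are hypothetical. The standard library has no t distribution, so the statistic is compared with the tabulated one-tailed critical value for 8 degrees of freedom at the 0.05 level (1.860) rather than converted to a p-value.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, df


# Hypothetical CAS values for one listener (not data from the experiment).
cas_largest = [60, 55, 62, 58, 65]    # five largest-fluctuation noise-pairs
cas_smallest = [30, 42, 35, 28, 40]   # five smallest-fluctuation noise-pairs

T_CRIT_ONE_TAILED_05_DF8 = 1.860      # tabulated critical value, df = 8

t, df = pooled_t(cas_largest, cas_smallest)
significant = df == 8 and t > T_CRIT_ONE_TAILED_05_DF8
print(round(t, 2), df, significant)
```

The test is one-tailed because the hypothesis has a direction: larger fluctuations are expected to give higher detection scores.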
Figures 9 and 10 also show that listeners tend to agree upon the difficulty of detecting incoherence in individual noise-pairs. Table 3 shows the correlations between listeners; all of them are equal to or greater than 0.391 for the P_c data and are equal to or greater than 0.769 for the CAS data.

Table 2: The p-values from a one-tailed t-test for 14-Hz bandwidth data. The test compared the five noise-pairs with the largest fluctuations to the five noise-pairs with the smallest fluctuations. The p-values with at least a 95% level of confidence are marked with an asterisk.

            P_c                        CAS
Listener    Phase Set    Level Set     Phase Set    Level Set
D           0.095        0.006*        <0.001*      <0.001*
M           0.021*       0.047*        0.002*       0.002*
P           0.025*       0.007*        0.011*       <0.001*
W           0.015*       0.014*        0.001*       <0.001*

Figure 11 shows the P_c values and Figure 12 shows the CAS values for the level set for Experiment 1. The figures show results similar to Figures 9 and 10 for the phase set. As shown by Table 2, all the p-values are significant at the 0.05 level for the P_c data and at the 0.02 level for the CAS data. Table 4 shows that the correlations between the individual listeners are all greater than or equal to 0.291 for the P_c data and increase to greater than or equal to 0.855 for the CAS data.

Figure 9: The percent correct (P_c) for four listeners for the phase set of Experiment 1, the 14-Hz noise-pairs. The noise-pairs were chosen to have the smallest and largest s_t[ΔΦ] in the collection of 100 noise-pairs. The noise-pairs are rank ordered by increasing s_t[ΔΦ] along the horizontal axis. The vertical dashed line represents 90 unused reproducible noise-pairs. The P_c values are higher for noise-pairs with the largest s_t[ΔΦ] than for noise-pairs with the smallest s_t[ΔΦ] for all listeners. The plots of P_c scores vs. noise-pair serial number show a large measure of agreement among listeners.

Figure 10: The confidence adjusted scores (CAS) for four listeners for the phase set of Experiment 1, the 14-Hz noise-pairs. The CAS values are higher for noise-pairs with the largest s_t[ΔΦ] than for noise-pairs with the smallest s_t[ΔΦ] for all listeners. The plots of CAS scores vs. noise-pair serial number show a large measure of agreement among listeners.

Figure 11: The percent correct (P_c) for four listeners for the level set of Experiment 1, the 14-Hz noise-pairs. The noise-pairs were chosen to have the smallest and largest s_t[ΔL] in the collection of 100 noise-pairs.

Figure 12: The confidence adjusted scores (CAS) for four listeners for the level set of Experiment 1, the 14-Hz noise-pairs. The noise-pairs were chosen to have the smallest and largest s_t[ΔL] in the collection of 100 noise-pairs.

Table 3: The inter-listener correlations for 14-Hz bandwidth IPD data. It can be seen that correlations between listeners are near 1.0 for the CAS data. It can also be seen that using the CAS data increases the average correlation compared to the P_c data.

P_c       D       M       P       W
D         1       0.732   0.614   0.391
M         -       1       0.421   0.796
P         -       -       1       0.429
W         -       -       -       1
Average   0.564

CAS       D       M       P       W
D         1       0.902   0.846   0.866
M         -       1       0.769   0.970
P         -       -       1       0.787
W         -       -       -       1
Average   0.857
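The inter-listener correlations and their averages reported in the tables can be reproduced with a computation of the following kind. The sketch assumes the Pearson product-moment correlation over the ten noise-pairs of a set, which is the usual reading but is not stated explicitly here; the listener scores are hypothetical. The last lines check the average of the six P_c correlations listed for the 14-Hz phase set.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_pairwise_correlation(scores):
    """Average correlation over all distinct pairs of listeners."""
    pairs = list(combinations(sorted(scores), 2))
    return sum(pearson(scores[a], scores[b]) for a, b in pairs) / len(pairs)


# Hypothetical scores for three listeners over five noise-pairs.
scores = {"D": [30, 35, 50, 60, 65],
          "M": [28, 40, 45, 62, 70],
          "P": [20, 22, 35, 30, 41]}
average = mean_pairwise_correlation(scores)

# Pairwise P_c correlations from Table 3 (D-M, D-P, D-W, M-P, M-W, P-W).
table3_pc = [0.732, 0.614, 0.391, 0.421, 0.796, 0.429]
table3_pc_average = round(sum(table3_pc) / len(table3_pc), 3)
print(table3_pc_average)  # 0.564
```

With four listeners there are six distinct pairs, which is why each table lists six correlations above the diagonal.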
1.1.8 Discussion

Experiment 1 shows that the peak of the cross-correlation function is not an adequate predictor of the detectability of incoherence for narrowband noise. Instead, the fluctuations in interaural phase and interaural level clearly play a role in incoherence detection. This can be seen explicitly in the p-values of the t-tests for the phase and level sets chosen in this experiment.

The large inter-listener correlations of this experiment indicate that listeners agree in detail about the kinds of fluctuations that are easy or hard to detect. This correlation can be seen graphically in Figures 9-12 or numerically in Tables 3 and 4.

Table 4: The inter-listener correlations for 14-Hz bandwidth ILD data. Correlations are comparable to those for the IPD data. Once again, inter-listener correlations are higher for the CAS data compared to the P_c data.

P_c       D       M       P       W
D         1       0.291   0.707   0.447
M         -       1       0.662   0.967
P         -       -       1       0.774
W         -       -       -       1
Average   0.641

CAS       D       M       P       W
D         1       0.855   0.931   0.925
M         -       1       0.885   0.974
P         -       -       1       0.933
W         -       -       -       1
Average   0.917

In general, p-values moved to greater significance and inter-listener correlations increased when using the CAS values over the P_c values in this experiment. Although it was probably not necessary to introduce the CAS to show conclusive results that the cross-correlation function is not an adequate predictor of the detectability of incoherence for narrowband noise, it was predicted that the CAS would be necessary for larger bandwidths.

Lastly, it should be reported that some noise-pairs led to values of percent correct that were well below chance. Through informal listening experiments, it was found that these noise-pairs had very little roughness or action in the envelope of their waveforms. It seems possible that when the fluctuations of IPD and ILD are barely detectable, listeners may sometimes mistake roughness or action for incoherence.
They would then be reluctant to say that a particularly smooth sounding noise-pair is incoherent, which could lead to a P_c less than 50%. Figure 13 shows the left and right channels for three sample noise-pairs. It appears that the envelope for noise-pair B is smoother (more gradually sloped) or has less action than the other two noise-pairs. Listeners detected the incoherence in noise-pair B with a P_c of only 27%. This can be compared to noise-pair A with P_c = 96% and noise-pair C with P_c = 71%. This idea of using monaural cues during incoherence detection will be more thoroughly explored in Experiment 5.

Figure 13: The left and right channels for three sample noise-pairs A, B, and C with ρ = 0.9922 and a bandwidth of 14 Hz. Each channel was normalized by the RMS value of the channel. It appears that B is smoother or has less action in it compared to the other noise-pairs, which led to the hypothesis that listeners may be using monaural cues during incoherence detection for noise-pairs with small interaural fluctuations.

1.2 EXPERIMENT 2: CRITICAL BANDWIDTH

Experiment 2 was identical to Experiment 1 except that the bandwidth was increased from 14 Hz to 108 Hz, near a critical bandwidth at 500 Hz. The value of the coherence remained fixed at 0.9922.

1.2.1 Method

As for Experiment 1, the geometric mean frequency was 500 Hz and the spectral spacing was 2 Hz. The bandwidth was increased to 108 Hz. As for Experiment 1, the spectral edges were 5 Hz wide.
Therefore, components between 444 and 449 Hz and between 555 and 560 Hz were given a raised-cosine edge; components between 449 and 555 Hz had unity amplitude. The variation of the interaural parameters as a function of time can be seen in Figure 14. Notice that the fluctuations are approximately eight times faster when compared with Figure 5 because the bandwidth is approximately eight times larger. Note that with a larger bandwidth of 108 Hz, the conversion of IPD to ITD is not necessarily as straightforward as in the 14-Hz bandwidth case. However, given an IPD, the value of the ITD is uncertain by only approximately 10% for this bandwidth.

Figure 14: The IPD and ILD plotted as a function of time for an arbitrarily chosen 108-Hz bandwidth noise-pair. Above the plot are the fluctuation values of the IPD and ILD. Note that when compared with Figure 5, the fluctuations are about eight times faster.
[Panel titles: s_t(ΔΦ) = 14.76 degrees; s_t(ΔL) = 1.62 dB. Axes: ΔΦ (degrees) and ΔL (dB) versus t (ms), from 0 to 500 ms.]

Another collection of 100 noise-pairs was used in this experiment. A phase set and level set were chosen from the collection, as in Experiment 1. Figure 15 shows s_t[ΔΦ] versus s_t[ΔL] for the first collection of 108-Hz bandwidth noise-pairs. The black dots represent the noise-pairs with 14-Hz bandwidth from Experiment 1 for comparison. The means of the distributions of fluctuations increased slightly when compared to the 14-Hz bandwidth waveforms, but the standard deviation over the noise-pair ensemble of the phase fluctuations decreased from 6.66 degrees to 3.72 degrees. The standard deviation of the level fluctuations decreased from 0.68 dB to 0.40 dB. The correlation between s_t[ΔΦ] and s_t[ΔL], evaluated over the ensembles of 100 waveforms, was essentially the same, 0.73 compared to 0.70. Appendix B shows that a value of 0.73 is in line with expectation based on statistics for 5000 noise-pairs.
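The numbers quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes that the roughly 10% uncertainty in ITD comes from not knowing which frequency within the 449-555 Hz unity-amplitude band should be used in the conversion ITD = IPD / (360 f); that reading is an assumption, and the function names are hypothetical. The IPD value of 14.76 degrees is used only as a convenient example.

```python
def ipd_to_itd_us(ipd_deg, freq_hz):
    """Convert an interaural phase difference (degrees) to a time difference (microseconds)."""
    return ipd_deg / 360.0 / freq_hz * 1e6


def percent_decrease(old, new):
    """Relative decrease from old to new, in percent."""
    return 100.0 * (old - new) / old


CENTER_HZ = 500.0
LOW_HZ, HIGH_HZ = 449.0, 555.0   # edges of the unity-amplitude band

ipd_deg = 14.76
itd_center = ipd_to_itd_us(ipd_deg, CENTER_HZ)             # 82.0 microseconds
spread_up = ipd_to_itd_us(ipd_deg, LOW_HZ) / itd_center - 1.0
spread_down = 1.0 - ipd_to_itd_us(ipd_deg, HIGH_HZ) / itd_center

bandwidth_ratio = 108.0 / 14.0                             # about 7.7
phase_sd_drop = percent_decrease(6.66, 3.72)               # ensemble SD of s_t[IPD]
level_sd_drop = percent_decrease(0.68, 0.40)               # ensemble SD of s_t[ILD]

print(round(itd_center, 1), round(spread_up, 3), round(spread_down, 3))
print(round(bandwidth_ratio, 1), round(phase_sd_drop, 1), round(level_sd_drop, 1))
```

Under that assumption the ITD is uncertain by about 10 to 11% in either direction, the bandwidth ratio is close to eight, and both ensemble standard deviations drop by a little more than 40%.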
1.2.2 Results

Figures 16 and 17 show the P_c and CAS values for the phase set for the 108-Hz noise-pairs of Experiment 2. A comparison between the phase set of this experiment and the phase set of Experiment 1 shows that listeners are not nearly as good at distinguishing incoherence from coherence at this larger bandwidth. Similar results are seen for the level set in Figures 18 and 19.

Figures 16 and 18 show Listeners D and M near the ceiling of P_c for many of the ten noise-pairs and always above 75%. However, Listeners D and M do not reach the ceiling for CAS, as seen in Figures 17 and 19. Listeners P and W are not so dramatically close to ceiling for P_c or CAS.

Table 5 shows that only one of the four t-tests from the phase set P_c values led to differences that were significant at the 0.05 level between the results for the five largest fluctuations and the results for the five smallest fluctuations. This increased to three of four significant p-values when using the CAS values. One of the four t-tests from the level sets for the P_c values led to differences that were significant at the 0.05 level. This increased to three of four significant p-values when using the CAS values.

Figure 15: The fluctuations of IPD versus fluctuations of ILD for the collection of 100 reproducible noise-pairs having a 108-Hz bandwidth, as used in Experiment 2. Each noise-pair is labeled by a serial number indicating only the order of creation. The means, standard deviations, and IPD-ILD correlation of the distributions are reported. The black dots represent the 14-Hz bandwidth noise-pairs from the collection in Experiment 1.
[Plot legend: BW = 108 Hz; μ(s_t[ΔΦ]) = 13.40; μ(s_t[ΔL]) = 1.73; σ(s_t[ΔΦ]) = 3.72; σ(s_t[ΔL]) = 0.40; corr = 0.73.]

Figure 16: The P_c results for the phase set of Experiment 2 with a 108-Hz bandwidth. The phase set was constructed by the same method as was used in Experiment 1. When compared to the 14-Hz bandwidth phase set in Figure 9, there is a much less drastic difference between noise-pairs with the largest and smallest fluctuations of IPD.

Figure 17: The CAS results for the phase set of Experiment 2 with a 108-Hz bandwidth. The phase set was constructed by the same method as was used in Experiment 1. When compared to the 14-Hz bandwidth phase set in Figure 10, there is a much less drastic difference between noise-pairs with the largest and smallest fluctuations of IPD.

Figure 18: The P_c results for the level set of Experiment 2 with a 108-Hz bandwidth. The level set was constructed by the same method as was used in Experiment 1. When compared to the 14-Hz bandwidth level set in Figure 11, there is a much less drastic difference between noise-pairs with the largest and smallest fluctuations of ILD.

Figure 19: The CAS results for the level set of Experiment 2 with a 108-Hz bandwidth. The level set was constructed by the same method as was used in Experiment 1. When compared to the 14-Hz bandwidth level set in Figure 12, there is a much less drastic difference between noise-pairs with the largest and smallest fluctuations of ILD.

Table 5: The p-values from a one-tailed t-test for 108-Hz bandwidth data.

            P_c                        CAS
Listener    Phase Set    Level Set     Phase Set    Level Set
D           0.061        0.029         0.024        0.007
M           0.085        0.060         0.019        0.047
P           0.248        0.226         0.187        0.187
W           0.005        0.069         0.002        0.026

Table 6: Inter-listener correlations for 108-Hz bandwidth phase set.
P_c       D       M       P       W
D         1       0.071   -0.044  0.634
M         -       1       0.148   0.202
P         -       -       1       0.337
W         -       -       -       1
Average   0.225

CAS       D       M       P       W
D         1       0.422   0.053   0.668
M         -       1       0.312   0.328
P         -       -       1       0.492
W         -       -       -       1
Average   0.379

Further, as shown in Tables 6 and 7, the correlation between listeners was smaller for the wider bandwidth for all listener pairs. The correlation between listeners was 0.225 on average for the phase set for the P_c data. This increased to 0.379 for the CAS data. Likewise, there was an increase in inter-listener correlation for the level set, from 0.117 to 0.278, when using the CAS data over the P_c data.

Table 7: Inter-listener correlations for 108-Hz bandwidth level set.

P_c       D       M       P       W
D         1       0.000   0.007   0.093
M         -       1       -0.031  -0.093
P         -       -       1       0.724
W         -       -       -       1
Average   0.117

CAS       D       M       P       W
D         1       0.543   -0.039  0.583
M         -       1       -0.026  0.298
P         -       -       1       0.309
W         -       -       -       1
Average   0.278

1.2.3 Discussion

The standard deviations of s_t[ΔΦ] and s_t[ΔL], computed across the 100 different waveforms of the ensemble, decreased by more than 40% when the bandwidth was increased from 14 Hz to 108 Hz. Since these fluctuations were found to correlate with incoherence detection in Experiment 1, it was not surprising to find that there was less variation in the listeners’ ability to detect incoherence at the wider bandwidth.

The bandwidth of 108 Hz may be of special interest because this bandwidth approximately corresponds to a critical bandwidth at 500 Hz, and critical band noise has often been used in binaural experiments. For instance, Koehnke et al. (1986) used a noise with a bandwidth of 114 Hz centered on 500 Hz, and Evilsizer et al. (2002) used a bandwidth of 100 Hz. It was found that the tests comparing detection performance (CAS) with the size of the phase and level fluctuations led to a significant difference at the 0.05 level on six of the eight t-tests. This can be compared to all of the tests being significant at the 0.02 level in Experiment 1.
The P_c results had only two significant p-values at the 0.05 level compared to seven significant p-values in Experiment 1. It was expected that if the bandwidth were further increased the variation in fluctuations would continue to decrease and the incoherence detection performance would be approximately the same for all the waveforms in the ensemble. In that case, the ability to detect incoherence would be only a function of the incoherence measure itself. That expectation led to Experiment 3.

1.3 EXPERIMENT 3: WIDE BANDWIDTH

1.3.1 Method

Experiment 3 was identical to Experiments 1 and 2 except that the bandwidth was increased to 2394 Hz. The geometric mean frequency of 500 Hz and the coherence of 0.9922 remained the same. Spectral components between 105 and 2495 Hz had unity amplitude and components in the ranges 100-105 and 2495-2500 Hz were shaped by the raised-cosine window. The variation of the interaural parameters can be seen in Figure 20.

Figure 20: The IPD and ILD plotted as a function of time for an arbitrary 2394-Hz bandwidth noise-pair. Above the plot are the fluctuation values of the IPD and ILD.
[Panel titles: s_t(ΔΦ) = 13.46 degrees; s_t(ΔL) = 1.78 dB. Axes: ΔΦ (degrees) and ΔL (dB) versus t (ms), from 0 to 500 ms.]

Figure 21 shows the distribution of interaural phase and level fluctuations for one hundred 2394-Hz bandwidth noises. The corresponding values for Experiments 1 and 2 are shown by the small and large black dots, respectively. The mean of the distribution remained about the same as in Experiment 2 in that the ensemble average phase fluctuation was about 13 degrees and the level fluctuation was about 1.7 dB. However, the standard deviations of the phase and level fluctuations over the collection of 100 noise-pairs decreased dramatically when the bandwidth was increased to 2394 Hz, as shown by the σ values in Figure 21, respectively 1.07 degrees and 0.11 dB. The correlation between level and phase fluctuations decreased to 0.27.

Figure 21: Fluctuations of IPD versus fluctuations of ILD for the collection of 100 reproducible noise-pairs having a 2394-Hz bandwidth, as used in Experiment 3. Each noise-pair is labeled by a serial number indicating only the order of creation. The means, standard deviations, and IPD-ILD correlation of the distributions are reported. The small black dots represent the 14-Hz bandwidth noise-pairs from the collection in Experiment 1. The large black dots represent the 108-Hz bandwidth noise-pairs from the collection in Experiment 2.

1.3.2 Results

The results of the incoherence detection experiments, expressed as P_c and CAS values, are shown in Figures 22-25 for the phase and level sets. Listeners D and M, the listeners with consistently the highest values of P_c in the previous experiments, are near the ceiling for only a few noise-pairs with a 2394-Hz bandwidth. With four listeners and two sets there were eight possible significance tests, and Table 8 shows that only one of them led to a significant difference at the 0.05 level for P_c, and none for CAS. Further, Table 9 shows inter-listener correlations comparable to those in Experiment 2. However, Table 10 shows an overall negative inter-listener correlation, mainly because of Listener W.

Figure 22: The P_c data for the 2394-Hz phase set.

Figure 23: The CAS data for the 2394-Hz phase set.

Figure 24: The P_c data for the 2394-Hz level set.

Figure 25: The CAS data for the 2394-Hz level set.

Table 8: The p-values from a one-tailed t-test for 2394-Hz bandwidth data.

            P_c                        CAS
Listener    Phase Set    Level Set     Phase Set    Level Set
D           0.772        0.230         0.907        0.375
M           0.981        0.049         0.981        0.080
P           0.683        0.633         0.683        0.654
W           0.964        0.753         0.947        0.866

Table 9: Inter-listener correlations for 2394-Hz bandwidth phase set.

P_c       D       M       P       W
D         1       0.585   0.161   0.103
M         -       1       0.165   0.363
P         -       -       1       -0.125
W         -       -       -       1
Average   0.208

CAS       D       M       P       W
D         1       0.572   -0.059  0.426
M         -       1       0.165   0.377
P         -       -       1       0.031
W         -       -       -       1
Average   0.252

Table 10: Inter-listener correlations for 2394-Hz bandwidth level set.

P_c       D       M       P       W
D         1       0.364   0.024   -0.153
M         -       1       0.042   -0.439
P         -       -       1       -0.474
W         -       -       -       1
Average   -0.106

CAS       D       M       P       W
D         1       0.253   0.136   0.130
M         -       1       0.043   -0.683
P         -       -       1       -0.488
W         -       -       -       1
Average   -0.105

1.3.3 Discussion

Experiments 1, 2, and 3 clearly show trends in the nature of interaural incoherence detection. For narrow bands, detection depends on the details of interaural fluctuations, and it is not possible to predict detection performance if one knows only the value of coherence. As the bandwidth increases, the coherence becomes a better predictor of detection performance. The wideband limit, where the coherence statistic itself becomes adequate to predict performance for all stimuli, was apparently reached for the bandwidth of 2394 Hz because no systematic differences appeared between those noise-pairs with the largest fluctuations and those noise-pairs with the smallest.

The above interpretation, however, is not flawless. The noise-pairs for Experiment 3 were chosen on the basis of fluctuations computed for the entire band of noise, as presented to the listeners. However, it is possible that listeners do not perform the incoherence detection experiment by listening to the entire band.
Instead, they may listen to a narrower portion of the band, perhaps one critical bandwidth wide. If so, then the choices of noise-pairs were inappropriate for the band actually used, and further, one would have no way of knowing a priori which band to examine when selecting stimuli based on large and small fluctuations. If this were true, then the reasoning by which it was concluded that the wideband limit is reached in Experiment 3 would be circular. One piece of evidence in favor of this view is found in the MLD experiments by van de Par and Kohlrausch (1999), which suggested that listeners detect an Sπ sine in noise by listening to a critical band around the tone.

It is believed that the above criticism is incorrect and that in a wideband incoherence detection task, perhaps unlike a wideband MLD task, listeners do not attend to only a portion of the band. The evidence for that belief is that the listeners performed much less well for the wide band in Experiment 3 compared to the 108-Hz band of Experiment 2. Comparing CAS values in Figures 17 and 23 (phase sets) and comparing values in Figures 19 and 25 (level sets) shows that the most successful listeners, D and M, had higher scores even for the most difficult (lowest values of Pc) noise-pairs at 108 Hz than for any noise-pairs with wide bands. Listeners P and W also scored consistently better at 108 Hz for the level sets. If listeners were able to take advantage of the slower and potentially larger fluctuations in a critical-band portion of the wide band, one would have expected that some listener would have scored well for at least one of the noise-pairs, contrary to the results of Experiment 3. Consequently, it is believed that Experiment 3 reached a wideband limit wherein the relevant interaural fluctuations are completely characterized by the coherence value.

1.4 EXPERIMENT 4: THE ROLE OF BANDWIDTH

Experiments 1, 2, and 3 demonstrated that, as the bandwidth increases, two effects occur.
First, the ranges of fluctuations in IPD and ILD become narrower. Second, the ability of listeners to detect incoherence depends less on the individual noise-pairs and is better determined by the value of coherence itself. According to the hypothesis of this chapter, the second effect is the direct result of the first, and the main effect of a variation in bandwidth is to alter the distributions of interaural variances. Experiment 4 was designed to test this idea.

1.4.1 Method

To test the hypothesis, a subset of noise-pairs was assembled from a new collection of 1000 more pairs with a bandwidth of 14 Hz to make a “matched set” whose members were selected to best match the fluctuations in the noise-pairs from Experiment 2, which had a bandwidth of 108 Hz. Experiment 2 included 20 noise-pairs, ten for the phase set and ten for the level set, as determined by the five largest and five smallest fluctuations. However, phase and level fluctuations tend to be correlated and five of the noise-pairs were common to the phase and level sets. Therefore, there were only 15 different noise-pairs in Experiment 2. For each of these 108-Hz bandwidth noise-pairs a 14-Hz bandwidth noise-pair was selected that best matched the fluctuations in phase and level. The selection is illustrated by the 15 open and filled circles in Figure 26.

[Figure 26: The matched set for Experiment 4. The 17 noise-pairs from phase set 1 and level set 1 in Experiment 2 were matched in fluctuations of IPD and fluctuations of ILD by the closest noise-pairs from Experiment 1, which are shown as black dots. Horizontal axis: s_t[ΔL] in dB.]
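The matching step can be sketched as a nearest-neighbor search in the plane of phase and level fluctuations. This is only an illustration under stated assumptions: the text does not give the distance measure, so the sketch assumes a Euclidean distance after scaling each axis by the spread of the candidate pool, and it assumes that each 14-Hz candidate is used at most once. The function name `match_noise_pairs` and all numerical values are hypothetical.

```python
import math
import statistics


def match_noise_pairs(targets, candidates):
    """Greedy nearest-neighbor match in the (s_t[dPhi], s_t[dL]) plane.

    targets    -- list of (phase_sd_deg, level_sd_db) for the 108-Hz pairs
    candidates -- list of (phase_sd_deg, level_sd_db) for the 14-Hz pool
    Returns one candidate index per target. No candidate is reused
    (an assumption; the source does not say how reuse was handled).
    """
    # Scale each axis by the pool's spread so that degrees and dB are
    # comparable (assumed normalization, not taken from the source).
    phase_scale = statistics.pstdev([c[0] for c in candidates])
    level_scale = statistics.pstdev([c[1] for c in candidates])

    used = set()
    matches = []
    for target_phase, target_level in targets:
        best_index, best_dist = None, math.inf
        for i, (cand_phase, cand_level) in enumerate(candidates):
            if i in used:
                continue
            dist = math.hypot((cand_phase - target_phase) / phase_scale,
                              (cand_level - target_level) / level_scale)
            if dist < best_dist:
                best_index, best_dist = i, dist
        used.add(best_index)
        matches.append(best_index)
    return matches


# Hypothetical fluctuation values, for illustration only.
targets = [(6.0, 1.0), (20.0, 2.5)]
pool = [(5.0, 0.9), (12.0, 1.6), (21.0, 2.6), (30.0, 3.5)]
print(match_noise_pairs(targets, pool))
```

With a pool as large as the 1000 candidates described above, close matches are likely to exist for every target, so the choice of distance measure should matter less than it does in this four-candidate toy example.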
Phase and level sets using the matched noise-pairs formed the stimuli for Experiment 4, which was otherwise identical to the other experiments of this chapter.

1.4.2 Results

According to the working hypothesis, the detection scores for the matched 14-Hz sets from Experiment 4 ought to be identical to the detection scores in the 108-Hz sets from Experiment 2. The results of the comparison are shown in Figures 27-30 for the phase and level sets, and for the Pc and CAS values. The phase set data in Figures 27 and 28 show that the values of Pc and CAS are comparable for the two bandwidths when the IPD fluctuations are matched for Listeners D and M. However, for Listeners P and W performance is better for the 14-Hz bandwidth than for the 108-Hz bandwidth. The level set data in Figures 29 and 30 show comparable Pc and CAS values for the two bandwidths for all the listeners, though Listeners P and W still tend to show better performance at the smaller bandwidth: better on 15 of 20 possible comparisons for Pc and 14 of 20 possible comparisons for CAS. Therefore, the raw data, shown in Figures 27-30, offer modest support for the hypothesis that bandwidth should be unimportant if the sizes of fluctuations are matched.

The hypothesis can be further tested by examining the relative detectability of the noise-pairs with the largest interaural fluctuations vs. the noise-pairs with the smallest interaural fluctuations. These are, respectively, to the right and to the left of the vertical dashed line in Figures 27-30. A t-test of the hypothesis that Pc and CAS scores are higher for the five noise-pairs on the right led to the p-values in Table 11. There, it can be seen that two of eight p-values are significant at the 0.05 level for the 14-Hz matched sets Pc values, and that two of eight p-values are significant at that level for the targeted 108-Hz sets.
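A significance test of this kind can be sketched as follows. The text specifies only a one-tailed t-test comparing the five noise-pairs on the right with the five on the left; the sketch assumes a pooled-variance (equal-variance) two-sample test, which may not be the exact variant used, and it evaluates the tail probability by numerical integration of the Student t density so that only the Python standard library is needed. The scores are made-up values, not data from the experiments.

```python
import math
import statistics


def t_density(x, dof):
    """Probability density of Student's t distribution with dof degrees of freedom."""
    coeff = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coeff * (1 + x * x / dof) ** (-(dof + 1) / 2)


def upper_tail(t, dof, steps=2000):
    """P(T > t), using Simpson's rule on the density between 0 and t."""
    h = t / steps
    total = t_density(0.0, dof) + t_density(t, dof)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_density(k * h, dof)
    # The density is symmetric, so P(T > t) = 0.5 - integral from 0 to t.
    return min(1.0, max(0.0, 0.5 - total * h / 3))


def one_tailed_t_test(high, low):
    """Test whether mean(high) > mean(low). Returns (t, p); pooled variance assumed."""
    n1, n2 = len(high), len(low)
    dof = n1 + n2 - 2
    pooled = ((n1 - 1) * statistics.variance(high)
              + (n2 - 1) * statistics.variance(low)) / dof
    t = (statistics.mean(high) - statistics.mean(low)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, upper_tail(t, dof)


# Hypothetical percent-correct scores for one listener: five noise-pairs with
# the largest fluctuations and five with the smallest.
large = [92, 88, 95, 85, 90]
small = [70, 82, 66, 78, 74]
t, p = one_tailed_t_test(large, small)
print(f"t = {t:.2f}, one-tailed p = {p:.4f}")
```

With five scores per group the test has eight degrees of freedom, so a p-value below 0.05 requires a t statistic of roughly 1.86 or more.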
It can also be seen that three of eight p-values are significant at the 0.05 level for the 14-Hz matched sets CAS values, and that six of eight p-values are significant at that level for the targeted 108-Hz sets. By comparison, seven of eight p-values were significant at the 0.05 level for the 14-Hz bandwidth Pc data and all eight p-values were significant at the 0.02 level for the 14-Hz bandwidth CAS data in Experiment 1. Only one of the p-values was significant even at the 0.05 level for the wide bandwidth Pc data and none of the p-values were significant even at the 0.05 level for the wide bandwidth CAS data in Experiment 3. Thus, the matched phase and level sets appear to have approximately matched the relative difference in performance between the noise-pairs in the Experiment 2 sets, consistent with the hypothesis that the size of interaural fluctuations determines the detection of incoherence.

However, there are differences between the results of Experiments 2 and 4. As noted above, there is a tendency for listeners P and W to score better on the 14-Hz bandwidth sets (Experiment 4). More impressive, a comparison of the inter-listener correlations in Tables 12 and 13 shows that all twelve of the correlations are higher for the 14-Hz matched sets than the correlations in Tables 7 and 9 for the 108-Hz bandwidth sets. Averaged over listener pairs and over phase and level sets, the inter-listener correlation was 0.601 for the 14-Hz bandwidth and only 0.171 for the 108-Hz bandwidth for the Pc results. Likewise, the averaged inter-listener correlation was 0.700 for the 14-Hz bandwidth and only 0.329 for the 108-Hz bandwidth for the CAS results.

Table 11: The p-values from a one-tailed t-test for matched phase and level data.
                     Pc                      CAS
Listener   Phase Set   Level Set   Phase Set   Level Set
D          0.428       0.111       0.103       0.077
M          0.408       0.421       0.384       0.315
P          0.025       0.050       0.006       0.038
W          0.163       0.308       0.042       0.117

1.4.3 Discussion

Experiment 4 attempted to construct a set of noise-pairs with a 14-Hz bandwidth that would lead to the same patterns of detection performance that had been seen in Experiment 2, which used noise-pairs with a 108-Hz bandwidth. This was done by best matching the values of s_t[ΔΦ] and s_t[ΔL] of the Experiment 2 noise-pairs [...]

[Figure 27: Matched phase set Pc data. Axes: Pc versus noise-pair number; s_t[ΔΦ] in degrees.]

[Figure 28: Matched phase set CAS data. Axes: CAS versus noise-pair number; s_t[ΔΦ] in degrees.]

[Figure 29: Matched level set Pc data. Axes: Pc versus noise-pair number; s_t[ΔL].]

[Figure 30: plot only; caption missing.]

[...] was 0.43. Since the level difference is calculated from the envelope, the first correlation seems intuitive. The latter correlation would be hard to understand were it not for the strong correlation between the standard deviations of interaural phase and level differences, as shown in Figs. 6, 15, and 21.

Next, the same calculations were made for the 20 noise-pairs actually used in Experiment 1. The interaural phase fluctuations and the interaural level fluctuations correlated with the monaural envelope fluctuations at levels of 0.59 and 0.65 respectively. Evidently, those waveforms with interaural fluctuations that are especially large or small particularly owe their binaural character to the envelope of the original generating noise token.

Given the positive correlation between interaural and monaural fluctuations, it seemed possible that there might be information in the monaural signals that was used by listeners in performing the experiments of this chapter. Because of the evident importance of fluctuations in Experiment 1, attention centered on the stimuli used there, with a bandwidth of 14 Hz.

1.5.1 Diotic experiment method

Experiment 5 was identical to Experiment 1 except for the important difference that the left-ear signal of Experiment 1 was the signal for both ears in Experiment 5.
In a second difference, the listeners in Experiment 5 had all completed Experiments 1 through 4 and therefore were highly experienced. The listeners were given three-interval sequences as before and were asked to apply the same strategy that they had used in the previous experiments. There was reason to believe that this approach might be successful because all the listeners volunteered that in the previous experiments they based their decisions on a sense of width, choosing the interval, either two or three, with the larger width. It seemed possible that the sense of roughness or other “action” associated with a diotic stimulus having large fluctuations could be associated with a sense of width. Consequently, it was expected that each trial of Experiment 5 would constitute a comparison between apparent “widths” for a particular identified noise and a different, randomly-chosen, noise from the set of ten. The stimulus sets were the phase set and level set from Experiment 1.

1.5.2 Results

Listeners made a negligible number of “confident” responses. Apparently the strong sensation of width elicited by some of the dichotic stimuli did not occur with any of the diotic noises. Therefore, the CAS had little value and results of Experiment 5 were plotted only in terms of the percentage of the trials on which a given noise was selected over other noises. The statistic will be called Ps, percent selected. It can be compared with Pc, the percent correct in the dichotic experiments.

[Figure 31: Comparison between Pc and Ps data for the phase set of Experiment 1 and the same noise-pairs with just the left channel presented diotically. Panels: Listeners D, M, P, and W, each showing diotic and dichotic data versus noise-pair number.]

[Figure 32: Comparison between Pc and Ps data for the level set of Experiment 1 and the same noise-pairs with just the left channel presented diotically. Panels: Listeners D, M, P, and W, each showing diotic and dichotic data versus noise-pair number.]

1. Large fluctuation comparison

Particular interest centered on the five noise-pairs for which the interaural fluctuations were the greatest. The scores from the diotic experiment were examined in the hope of gaining insight into the role that envelope fluctuations may have played in the dichotic experiment. The average values of Ps for those five noise-pairs for listeners D, M, P, and W were, respectively:

For the phase set: 54, 61, 54, 63 %
For the level set: 63, 58, 72, 70 %.

Evidently, in the diotic experiment listeners chose the noises that had led to the largest interaural fluctuations clearly more than half the time. These numbers can be compared with the values of Pc in the dichotic experiment (Experiment 1), which averaged 88%.

2. Agreement between listeners

The agreement among the listeners was assessed by comparing values of Ps against noise serial number for listeners taken in pairs. Inter-listener correlations for D-M, D-P, D-W, M-P, M-W, and P-W were as follows:

For the phase set: 0.59, -0.38, 0.38, -0.22, 0.88, 0.17
For the level set: 0.70, 0.53, 0.76, 0.67, 0.89, 0.76.

The strongest correlation was between M and W. Listener P was responsible for the only negative correlations, both in the phase set. Correlations were clearly larger in the level set than in the phase set. The strong correlation indicates that listeners tended to agree about which fluctuations were salient.

3. Comparison with envelope fluctuation

The relation between the listener selection of noises and envelope fluctuation was assessed by comparing Ps with s_t[E] as a function of the noise-pair serial number. Correlations for listeners D, M, P, and W were, respectively:

For the phase set: 0.38, 0.84, -0.03, 0.88
For the level set: 0.85, 0.66, 0.65, 0.70.
Again, Listener P is responsible for the only negative correlation. The positive correlation indicates that the choices that listeners make can be predicted based on the physical envelope fluctuations, as measured by the standard deviation of the envelope over time, especially for the level set.

4. Comparison with Experiment 1

A comparison between the results of the corresponding diotic and dichotic experiments was made by comparing Ps on Experiment 5 with Pc on Experiment 1, both as functions of the noise-pair serial number. The data can be seen in Figures 31 and 32. Correlations for listeners D, M, P, and W were, respectively:

For the phase set: 0.24, 0.53, -0.07, 0.44
For the level set: 0.82, 0.66, 0.81, 0.80.

Again, correlations are larger for the level set.

1.5.3 Discussion

The correlations above are impressively large, with possibly some exceptions for Listener P. These include the correlations between Ps in Experiment 5 and Pc in Experiment 1 as well as the correlations between Ps and monaural and dichotic fluctuations. There are several possible interpretations of these correlations.

Possibly the correlation between Ps and Pc scores only represents a chain of stimulus circumstances. For a narrow bandwidth like 14 Hz, every kind of stimulus fluctuation seems to correlate with every other kind. Interaural phase fluctuations correlate with interaural level fluctuations and both correlate with noise envelope fluctuations. In a dichotic experiment probing the detection of interaural incoherence listeners attend to the interaural fluctuations. In a diotic experiment probing an evaluation of roughness or other stimulus action listeners attend to the envelope fluctuations. The results of the two experiments, as functions of the stimulus serial number, are similar because the interaural and monaural fluctuations behave similarly with respect to serial number.
Alternatively, it is possible that the correlation between Ps and Pc scores arises because listeners in a dichotic experiment are misled by monaural envelope fluctuations that are particularly large or particularly small. Given the enormous difference in the average Ps and Pc values for the five noise-pairs with large fluctuations, as noted in the results section above, it seems highly unlikely that monaural fluctuations per se contribute to listener judgements in the binaural experiment when the detection of interaural incoherence is easy. But when detection of interaural incoherence is difficult, or impossible, the cues from monaural envelope fluctuations (or the lack of them) may influence judgements and are probably responsible for Pc values in a dichotic experiment that are less than chance.

1.6 DISCUSSION AND CONCLUSION

Listeners are sensitive to small amounts of interaural incoherence in noise. Given diotic noise as a comparison, listeners can detect a coherence change of 0.01, i.e. they are sensitive to the difference between 1.00 and 0.99 (Gabriel and Colburn, 1981). The goal of this chapter was to understand the origins of this remarkable sensitivity.

1.6.1 Detection of incoherence

Experiment 1 selected stimuli from an ensemble of 100 reproducible left-right noise-pairs, all of which had a bandwidth of 14 Hz and an interaural coherence of 0.9922. It was found that those pairs that had a large fluctuation in interaural phase difference (IPD) or large fluctuation in interaural level difference (ILD) were much more readily recognized as not perfectly coherent compared to pairs with small fluctuations. This result led to the conclusion that, for bandwidths as narrow as 14 Hz, the interaural coherence is not an adequate predictor of the ability to detect incoherence. Instead, the size of the interaural fluctuations matters.
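The point that noise-pairs with one fixed coherence can differ in their interaural fluctuations can be illustrated with a small simulation. This is a sketch under stated assumptions, not the stimulus-generation procedure of the experiments: each noise is built from eight equally spaced sinusoidal components near 500 Hz (an assumed layout), the right-ear signal is formed by mixing the left-ear signal with an orthogonalized second noise so that the zero-lag correlation equals the target value, and the zero-lag correlation is used as a stand-in for interaural coherence. The sign conventions for IPD and ILD and the use of the population standard deviation over time are likewise assumptions.

```python
import cmath
import math
import random
import statistics

# Component frequencies: 2-Hz spacing so each is periodic in 500 ms (assumed layout).
FREQS = list(range(494, 509, 2))


def inner(a, b):
    """Real inner product of two signals described by complex component amplitudes."""
    return sum((x * y.conjugate()).real for x, y in zip(a, b))


def make_noise_pair(rho, rng):
    """Return (left, right) component amplitudes with zero-lag correlation rho."""
    a = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in FREQS]
    b = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in FREQS]
    norm_a = math.sqrt(inner(a, a))
    left = [x / norm_a for x in a]
    # Remove from b its projection onto left, then normalize the remainder.
    overlap = inner(b, left)
    resid = [y - overlap * x for x, y in zip(left, b)]
    norm_r = math.sqrt(inner(resid, resid))
    mix = math.sqrt(1.0 - rho * rho)
    right = [rho * x + mix * y / norm_r for x, y in zip(left, resid)]
    return left, right


def analytic(amplitudes, t):
    """Analytic signal at time t; its real part is the waveform."""
    return sum(c * cmath.exp(2j * math.pi * f * t) for c, f in zip(amplitudes, FREQS))


def interaural_stats(left, right, duration=0.5, fs=2000):
    """Zero-lag correlation and standard deviations over time of IPD (deg) and ILD (dB)."""
    times = [n / fs for n in range(int(duration * fs))]
    zl = [analytic(left, t) for t in times]
    zr = [analytic(right, t) for t in times]
    num = sum(a.real * b.real for a, b in zip(zl, zr))
    den = math.sqrt(sum(a.real ** 2 for a in zl) * sum(b.real ** 2 for b in zr))
    ipd = [math.degrees(cmath.phase(a * b.conjugate())) for a, b in zip(zl, zr)]
    ild = [20 * math.log10(abs(a) / abs(b)) for a, b in zip(zl, zr)]
    return num / den, statistics.pstdev(ipd), statistics.pstdev(ild)


for seed in range(3):
    left, right = make_noise_pair(0.9922, random.Random(seed))
    corr, sd_ipd, sd_ild = interaural_stats(left, right)
    print(f"seed {seed}: correlation = {corr:.4f}, "
          f"s_t[IPD] = {sd_ipd:.2f} deg, s_t[ILD] = {sd_ild:.2f} dB")
```

Each seed yields the same correlation by construction, while the two fluctuation measures are free to vary from one noise-pair to the next, which is the property that the selection of stimuli in Experiment 1 relied on.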
Experiments 2 and 3 progressively increased the bandwidth and found that the ranges of fluctuations in IPD and ILD among different noise-pairs in an ensemble decreased with increasing bandwidth (see Appendix B). This observation led to the expectation that the detectability of incoherence would exhibit less variation for different noise-pairs with these wider bandwidths. Detection experiments similar to Experiment 1 showed an increasing uniformity in detectability for the incoherence in noises with increasing bandwidths, as expected.

It was conjectured that the only reason that detection performance for 14-Hz bandwidth was different from performance for 108-Hz bandwidth was that more extreme values of fluctuations (both very small and very large) were available in the ensemble with the narrow bandwidth. In Experiment 4 a comparison was made between performance on noise-pairs with 108-Hz bandwidth and performance on a matched set of noise-pairs with 14-Hz bandwidth. Noise-pairs in the matched set were selected to have approximately the same interaural fluctuations as the pairs of the 108-Hz set. The comparison showed that detectability differences among different noise-pairs in the matched set were reduced to about the same level as for the 108-Hz noises, consistent with the conjecture. The overall performance on the 14-Hz matched set was approximately equal to that on the 108-Hz set for two of the listeners; it was consistently higher for the other two listeners. These two results from the comparison suggest that differences in interaural fluctuations are responsible for differences in the detectability of incoherence for different noises, but that fluctuations of comparable size are more easily detected when the bandwidth is narrow. The most likely explanation for the advantage of narrow bands is that fluctuations are slower. The role of fluctuation speed will be addressed in the chapter on binaural modeling.
These conclusions differ from those of Breebaart and Kohlrausch (2001), who dismissed a specific role for IPD and ILD fluctuations in binaural detection because the distributions of those fluctuations, with or without a signal, failed to show a bandwidth dependence. By contrast, their NρSπ detection experiments did show such a dependence. It is agreed that the s_t[ΔΦ] and s_t[ΔL] have mean values, averaged across waveforms, that are insensitive to bandwidth, as n