This is to certify that the
thesis entitled

Perception of Frequency Modulation
width at Low Modulation Frequencies

presented by

Mark Allen Klein

 

 

 

 

 

 

 

 

has been accepted towards fulfillment
of the requirements for

M.A. degree in Psychology

Major professor

Date: Winter 1980

PERCEPTION OF FREQUENCY
MODULATION WIDTH AT
LOW MODULATION

FREQUENCIES

BY

Mark Allen Klein

A THESIS
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

MASTER OF ARTS

Department of Psychology

1980

ABSTRACT

PERCEPTION OF FREQUENCY MODULATION WIDTH
AT LOW MODULATION FREQUENCIES

BY

Mark Allen Klein

Previous studies have shown that the perceived width of frequency
modulation depends not only on the frequency excursion, ±Δf, but
also on the modulation rate and waveform. The results of this earlier
work failed to support a single model of modulation width perception.
In this study a series of experiments was carried out based on a
complex modulation waveform having only two spectral components.
Parametric manipulations involved the overall frequency excursion, the
phase relationship between the two components and the modulation rate.
Both modulation width perception and modulation waveform discrimination
showed dependence on these parameters. The class of models for width
perception involving simple detection of spectral components of the
modulation waveform, ignoring phase relationships, was rejected.
Future models must more closely examine the temporal microstructure
of the modulation waveform.

ACKNOWLEDGMENTS

I would like to express my appreciation to Dr. James Zacks,
Dr. Gordon Wood, and Dr. David Wessel for their help in serving as
members of my thesis committee. I am also very grateful to Dr.
William Hartmann, my major advisor, whose guidance, assistance, and
inspiration made this work possible. My special thanks to my wife,
Diane, for her help in preparing this manuscript and understanding
support throughout the entire duration of this project.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

Views of Perception
Models

Experiment I
    Method
    Results-Discussion

Experiment II
    Method
    Results-Discussion

Experiment III
    Method
    Results
    Discussion

Experiment IV
    Method
    Results-Discussion

General Discussion and Conclusions

LIST OF REFERENCES

LIST OF TABLES

Table 1. Predicted and observed Re-values.

Table 2. 95% confidence intervals for differences between pooled
         data for φ = π/4, π/2, & 3π/4 versus pooled data
         for φ = 5π/4, 3π/2, & 7π/4.

Table 3. 95% confidence intervals for differences between
         Qe-values across levels of Δf_s.

Table 4. FM detection levels in Hertz for 4 and 12 Hz modulation
         rates.

Table 5. 95% confidence intervals for differences in Qe-values for
         φ < π versus φ > π.

LIST OF FIGURES

Figure 1.  Example waveforms.
Figure 2.  Complex waveforms.
Figure 3.  Trial timing.
Figure 4.  Re-values. Group data Experiment I. For this and all
           following figures □ represents Δf_s = 2.5 Hz,
           △ represents Δf_s = 5 Hz and ○ represents Δf_s = 10 Hz.
Figure 5.  Re-values. Subject B Experiment I.
Figure 6.  Re-values. Subject M Experiment I.
Figure 7.  Re-values. Subject W Experiment I.
Figure 8.  Re-values. Auxiliary subjects Experiment I.
Figure 9.  Qe-values. Group data Experiment I and II.
Figure 10. Qe-values. Subject B Experiment I and II.
Figure 11. Qe-values. Subject M Experiment I and II.
Figure 12. Qe-values. Subject W Experiment I and II.
Figure 13. Qe-values. Auxiliary subjects Experiment I.
Figure 14. Amplitude of 12 Hz component relative to thresholds.
Figure 15. Exponential windows superimposed on modulation waveforms.
Figure 16. Qe-values. Experiment IV data.

Vibrato is a common musical ornament used in singing and in
playing most musical instruments. The basic mechanism of vibrato is a
slow, periodic modulation of the frequency being produced about some
central frequency. Vibrato fulfills various roles for the performer.
Kock (1936) found vibrato effective for disguising incorrect intonation
in the component tones of chords. Sundberg (1978), however, found
that vibrato does not disguise incorrect intonation for single tones.
An earlier Sundberg paper (1977) found that vibrato does conceal
deviations in vowel qualities typical in singing. The artistic role
of vibrato includes, for example, variation of emotional expression
for single notes or phrases. The performer may vary the depth and/or
rate of vibrato during a single note or phrase to achieve a desired
effect. When specifically notated, vibrato may have any other meaning
or effect depending on the abilities of the composer/arranger and the
performer.

Two entire volumes (C.E. Seashore, 1932, 1935) were devoted to
discovering the depth and rate for the most pleasing and beautiful
vibrato. Techniques for production of this ideal vibrato were analyzed
and methods for teaching those techniques to other musicians were
described. Most of those studies, however, offered no information on
how vibrato is perceived; how frequency modulation is processed by the
auditory system.

As with phenomena in other complex systems, vibrato in musical
instruments and the human voice is much more than a single, simple
effect, i.e., frequency modulation (FM). There are unavoidable changes
in the amplitude and/or timbre of the tone, too. These effects may be
due to the technique for production of the vibrato or the acoustical
properties of the instrument or vocal tract. Despite these other changes,
the major effect perceived is the modulation of the pitch. Traditionally
vibrato is frequency modulation. In this study, vibrato will be
represented by the process of frequency modulation of a sinewave
carrier signal.

Vibrato and frequency modulation are not isolated musical phen-
omena. They are part of the more general problem of the perception of
pitch and pitch changes over time. Zwicker (1973) has argued that the
perception of frequency modulation may encompass all the elements
of time-varying pitch perception. The auditory processes involved in
time-varying pitch perception may also extend to the perception of other
musical ornaments, such as trills (e.g., Bowling, 1973), or even the
basic phenomena of melody perception (e.g., Deutsch, 1972). In speech
perception, pitch levels and especially transitions between pitch
levels are important for carrying information and characterizing
phonemes. The typical model system for studying these effects has
been the simple frequency glide as studied by Nabelek, Nabelek, and
Hirsh (1973).

Lewis, Cowan, and Fairbanks (1940) asked subjects to indicate
whether a frequency glide made a larger or smaller frequency transi-
tion than the difference between two following tones. They found that
the matched width was always smaller than the actual glide excursion
and that a short duration glide (100 msec) always matched a smaller
interval than a longer duration (300 msec) frequency glide with the
same amount of frequency change. In the second half of their study
they asked music majors to name the musical interval equivalent to the
extremes of frequency glides. The variables studied were the waveform
(sinewave or linear based) and the direction (up or down) of the
frequency changes in the glides. The direction of the glide had no
effect. Sine waveform glides, however, were consistently perceived as
wider than linear glides.

These results agree with the results of a frequency modulation
study by H. G. Seashore (1932). Music majors, with ear training, were
asked to name the interval corresponding to the extreme pitch
excursions of vibrato tones. The actual widths were consistently
underestimated, but the estimates increased with increases in the true
excursions. Clearly the entire frequency excursion of the vibrato
tones was not perceived by the subjects. These experiments show that
absolute frequency excursion is only one factor affecting the perceived
extent of a frequency change; also important are the rate of change of
the frequency and the form, or microstructure, of the frequency change.

Some more recent work has focused on this inability to track the
actual excursion of frequency modulation and the effect of the
modulation waveform. A successful experimental technique has been to
directly compare frequency modulation by different modulation wave-
forms. One of the signals is varied in modulation width until the two
modulations produce sensations of equal frequency excursion. If the
waveforms had no effect on the tracking of the actual frequency changes
then at the point the two FM signals sound equally wide their
actual peak frequency excursions should also match. For all waveforms
A and B the peak excursions, ±Δf_A and ±Δf_B, are generally different
when the two types of modulation produce equal width sensations. The
point of subjective equality (PSE) for the ratio of the two frequency
excursions is defined by

$$R_e(A/B) = \Delta f_A / \Delta f_B.$$

Divenyi and Hirsh (1971) compared triangle waveform modulation
(T) and trapezoidal waveform modulation (Z) to a constant width square
wave frequency modulation (Q) and found PSEs of

$$R_e(T/Q) = 1.72 \quad \text{and} \quad R_e(Z/Q) = 1.07$$

for modulation rates between 4 and 12 Hz. These values indicate that
a triangle wave FM signal must have almost twice the frequency excursion
of the square wave modulation to sound equally wide. The trapezoid
waveform FM excursion need only be slightly larger than the square
waveform excursion for the two modulations to sound equally wide.

Hartmann and Long (1976) and Hartmann (1977) compared triangle
waveform FM with sine waveform FM (S) under a wide variety of conditions
and found that

$$R_e(T/S) = 1.22 \pm 0.03$$

indicating that the triangle waveform FM must also be wider than
sinewave FM for subjective equality of width. An interesting finding of
these studies is that for the range of widths investigated the sine
and triangle waveform modulations were indistinguishable. Subjects
trained to identify the triangle FM while watching an oscilloscope
trace of the modulation waveform could no longer perform better than
chance at distinguishing the sine and triangle modulation waveforms
when the oscilloscope was removed. More recent informal studies have
shown that the width of the frequency excursions must be increased to

Afs squo Hz before the waveforms could be identified.

Views of Perception. A perceiving individual is constantly receiving
vast amounts of information. In order to deal with this great flow the
mechanisms of perception must be efficient. Some data must be discarded
as irrelevant and the rest must be condensed to a manageable volume.
The question of how the perceiver decides what to discard and in what
way to deal with the remaining information is the central problem of the
study of perception. In order for a subject to be able to decide which
stimulus had the wider frequency excursion we must assume that
information regarding some aspects of the variations of frequency over
time was saved. Higher levels of cognitive processing might then assess
the saved information to make the final width comparison required by
the experiment. Models attempting to explain the perception of
frequency modulation width must explain the nature of the process for
extracting information from the incoming FM signal. Ultimately, the
specific biological sequences involved must be revealed.

[Figure 1. Example waveforms.]

At the present we must settle for logical models that are intended
only to match the results of the biological processing. These logical
models capture the formal properties of the process of extracting
information from the stimulus. Good models must accurately predict
the responses of the subjects to the stimuli, but they must also
succeed at this while using the least amount of information (efficiency).
We believe that human perceivers must discard at least part of the
information in the stimulus, and while a model utilizing all of the
information of the stimulus may be able to accurately predict the
subjects' responses it need not closely reflect the type of processing
the biological mechanisms perform.

This paper looks at a type of model that represents a "spectral"
view of frequency modulation perception. In this approach to FM
perception the modulation waveform of the stimulus is analyzed into
its Fourier components and their respective amplitudes. The phase
relationships between those components are discarded as irrelevant.
In general, modulation width is determined by some combination of those
component amplitudes. This singular focus on component amplitudes,
totally ignoring phase information, is admittedly a narrow interpretation
of the spectral viewpoint.

If, in a spectral model, the phase information was also saved,
the entire stimulus would be completely specified by the retained
information. This would be equivalent to no reduction of information
from the stimulus. Therefore, because of the requirement to abstract
information from the input this type of spectral model would be
unacceptable.

All other models that might explain the subjects' responses will
be grouped together under the title "temporal" models. These models
attempt to explain the responses without recourse to the Fourier
components of the modulation waveform. Some may require less information
be stored about the stimuli than specified spectrally and some may
require more. In general, these temporal models will remain unspecified.
In these temporal models the form of the incoming signal is processed
and represented internally through an isomorphic mapping of the
retained temporal structure of the stimulus into its internal
representation.

Models. The simplest model for comparing the widths of FM signals is
to compare the absolute frequency excursions of the modulation signals.
This is a temporal model. The single prediction from this model is

$$R_e(A/B) = 1.0$$

for all waveforms. None of the above studies found that value of Re to
hold.

Hartmann (1978) outlines three possible spectral models that might
explain the data. The first is based on the work of Kay and Matthews
(1972) who investigated detection of FM for varying modulation rates.
They found that the modulation width detection threshold for FM at a
specific modulation rate was increased by preexposure to suprathreshold
FM at the same rate. This led them to the notion of multiple tuned
channels for specific FM frequencies about an octave wide, much like
the channels described by Blakemore and Campbell (1969) for visual
spatial frequency. If this idea of tuned channels is correct its
extension to determining modulation width sensation might be as follows:
the auditory system performs a Fourier analysis on the modulation
waveform and the amplitude of each frequency component that is above
threshold within its own channel becomes available to contribute to the
sensation of width. The simplest version of this could be labelled
a Fundamental Extractor model. In judging relative width only the
fundamental components of the two modulation waveforms are compared.
When the two fundamental components are of equal amplitude there is a
sensation of equal width. The combination rule is to discard the
information from all components other than the fundamental. The
predictions of this model are shown in row 2 of Table 1. The model
predictions for sine-triangle modulation comparisons are very close to
the data. The triangle versus square waveform modulation prediction
clearly fails to come close to the data. The trapezoid versus
square modulation waveform predictions, however, are not grossly
wrong.
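
As a rough check on the Fundamental Extractor rule, the sketch below estimates the fundamental Fourier amplitude of unit-peak sine, triangle, and square waveforms by numerical integration and forms the predicted Re as the ratio of peak excursions at which the fundamentals match. The function names and the sampling resolution are illustrative assumptions and are not taken from the original analysis. The trapezoid is omitted because its exact shape is not specified in the text.

```python
import math

def sine(t):
    """Unit-peak sine; t is time in modulation periods."""
    return math.sin(2.0 * math.pi * t)

def triangle(t):
    """Unit-peak triangle in phase with the sine: 0 -> 1 -> -1 -> 0 per period."""
    t = t % 1.0
    if t < 0.25:
        return 4.0 * t
    if t < 0.75:
        return 2.0 - 4.0 * t
    return 4.0 * t - 4.0

def square(t):
    """Unit-peak square wave in phase with the sine."""
    return 1.0 if (t % 1.0) < 0.5 else -1.0

def fundamental_amplitude(waveform, n=4096):
    """Amplitude of the first sine harmonic, by midpoint-rule integration.

    The waveforms above are odd functions, so the cosine term is zero and
    this value is the full amplitude of the fundamental.
    """
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        total += waveform(t) * math.sin(2.0 * math.pi * t)
    return 2.0 * total / n

def fe_predicted_re(wave_a, wave_b):
    """Predicted Re(A/B) when only the fundamentals are compared.

    Equal fundamentals means peak_a * b1(a) == peak_b * b1(b), so the
    peak ratio peak_a / peak_b equals b1(b) / b1(a).
    """
    return fundamental_amplitude(wave_b) / fundamental_amplitude(wave_a)

re_ts = fe_predicted_re(triangle, sine)    # triangle versus sine
re_tq = fe_predicted_re(triangle, square)  # triangle versus square
print(round(re_ts, 2), round(re_tq, 2))
```

Under these assumptions the two ratios come out near π²/8 and π/2, which is in line with the FE row of Table 1 for the sine and square comparisons.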

The second spectral model is labelled the Root Mean Square (RMS)
model. This is a combination rule for the harmonic components given by:

$$\mathrm{RMS}_{FM} = \left( \sum_n a_n^2 \right)^{1/2}$$

where a_n is the amplitude of the nth harmonic component.
This form yields a single value for each modulation waveform and any
two stimuli with the same RMS_FM values should sound equally wide.
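
The RMS rule can be illustrated with a short sketch that computes the RMS value of each unit-peak waveform in the time domain. By Parseval's theorem this is proportional to the root of the summed squared harmonic amplitudes, and any constant factor cancels when a ratio is formed. The helper names and the sampling resolution are assumptions made for the illustration.

```python
import math

def sine(t):
    """Unit-peak sine; t is time in modulation periods."""
    return math.sin(2.0 * math.pi * t)

def triangle(t):
    """Unit-peak triangle: 0 -> 1 -> -1 -> 0 over one period."""
    t = t % 1.0
    if t < 0.25:
        return 4.0 * t
    if t < 0.75:
        return 2.0 - 4.0 * t
    return 4.0 * t - 4.0

def square(t):
    """Unit-peak square wave."""
    return 1.0 if (t % 1.0) < 0.5 else -1.0

def rms(waveform, n=4096):
    """Root-mean-square value over one period, sampled at n midpoints."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        total += waveform(t) ** 2
    return math.sqrt(total / n)

def rms_predicted_re(wave_a, wave_b):
    """Predicted Re(A/B): peak ratio at which the two RMS values are equal."""
    return rms(wave_b) / rms(wave_a)

rms_ts = rms_predicted_re(triangle, sine)    # triangle versus sine
rms_tq = rms_predicted_re(triangle, square)  # triangle versus square
print(round(rms_ts, 2), round(rms_tq, 2))
```

With these idealized waveforms the ratios are roughly 1.22 and 1.73 (the square roots of 3/2 and 3), close to the RMS row of Table 1.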

Table 1. Predicted and observed Re-values.

                  Re(T/S)    Re(T/Q)    Re(Z/Q)

Predicted Re
  Model: Peak      1.00       1.00       1.00
         FE        1.23       1.57       1.11
         RMS       1.23       1.73       1.22
Observed           1.22 (1)   1.72 (2)   1.07 (2)

(1) From Hartmann and Long (1976) and Hartmann (1977)
(2) From Divenyi and Hirsh (1971)

In the case of the sinewave modulation compared to triangle
modulation, column 1, the RMS prediction is in excellent agreement
with the data. For square wave modulation compared to triangle
modulation the RMS prediction is also very close to the data. For
square versus trapezoid modulation the prediction is not so good and
clearly fails. The success of the above two models clearly warrants
their further investigation.

The final model outlined by Hartmann (1978) is roughly equivalent
to a low-pass filter system. The combination rule would be to differen-
tially weight the component amplitudes with the lowest components
receiving the most weight. Hartmann (1978) chose, however, to view this
model temporally by referring to its equivalent: the Lag-Processor.

In this interpretation the system "views" the stimuli through an
exponentially decaying time window. The further in the past the actual
frequency deviation occurred the less weight it is given for determining
modulation width at the present instant. The output of the system is
then evaluated for a peak that determines the modulation width.
Hartmann assumed that the time constant of the window was fixed.
Because of this the model predicted changes in Re with changes in the
modulation rate. Hartmann (1977) found no such changes and so the model
was ruled out with no further analysis.
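
The rate dependence of a fixed-time-constant Lag-Processor can be seen in a minimal sketch. It passes unit-peak sine and triangle modulation waveforms through a discrete exponential window (a leaky integrator) and takes the peak of the output over the final cycle. The 10 msec time constant, the normalization of the window, and all function names are hypothetical choices for illustration; they are not values from Hartmann (1978).

```python
import math

TAU_S = 0.010  # hypothetical window time constant in seconds (assumption)

def sine(t):
    """Unit-peak sine; t is time in modulation periods."""
    return math.sin(2.0 * math.pi * t)

def triangle(t):
    """Unit-peak triangle: 0 -> 1 -> -1 -> 0 over one period."""
    t = t % 1.0
    if t < 0.25:
        return 4.0 * t
    if t < 0.75:
        return 2.0 - 4.0 * t
    return 4.0 * t - 4.0

def lag_processor_peak(waveform, rate_hz, tau_s, cycles=20, samples_per_cycle=2000):
    """Peak output of an exponentially decaying time window applied to a waveform."""
    dt = 1.0 / (rate_hz * samples_per_cycle)   # sample spacing in seconds
    alpha = 1.0 - math.exp(-dt / tau_s)        # weight given to the newest sample
    y = 0.0
    peak = 0.0
    n_total = cycles * samples_per_cycle
    for k in range(n_total):
        x = waveform(k / samples_per_cycle)    # input at time k*dt
        y += alpha * (x - y)                   # older input decays exponentially
        if k >= n_total - samples_per_cycle:   # skip the start-up transient
            peak = max(peak, abs(y))
    return peak

def lag_predicted_re_ts(rate_hz, tau_s=TAU_S):
    """Predicted Re(T/S): peak ratio at which the two output peaks are equal."""
    return (lag_processor_peak(sine, rate_hz, tau_s)
            / lag_processor_peak(triangle, rate_hz, tau_s))

re_4 = lag_predicted_re_ts(4.0)
re_12 = lag_predicted_re_ts(12.0)
print(round(re_4, 2), round(re_12, 2))
```

With a fixed time constant the predicted ratio grows as the modulation rate rises from 4 to 12 Hz, which is the kind of rate dependence the text describes as absent from the data.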

The experiments described below are an attempt to specifically
test the spectral models. The presence of phase effects is the critical
test.

To facilitate comparisons to the previously mentioned studies
the data from Experiment I are first presented using the Re-values
obtained. This leads, however, to a graphical presentation that
appears to show a significant phase effect. To correct for this the
data are replotted using a different measure, Qe. Data from all
subsequent experiments are described using the new measure, Qe.

Experiment I

Work by Graham and Nachmias (1971) points to a paradigm that
seems especially suitable to the study of FM width perception. Graham
and Nachmias were testing single versus multiple channel models for
visual spatial frequency and chose a stimulus that had just two Fourier
components. This allowed access to two independent channels simulta-
neously using only three parameters: the amplitude of each component
and the phase relationship between them. An analogous set of stimuli
may be produced in the auditory domain: specifically, an FM signal in
which the frequency modulation waveform has only two spectral components,
a fundamental component and its third harmonic. This choice of
components satisfies two conditions: they are more than an octave apart
and so not in the same channel for FM frequencies, if such channels
exist (1/3 octave channels: Kay and Matthews 1972), and they are
consistent with the previous studies using spectrally complex
waveforms in that triangle, trapezoid, and square waveforms only have
odd harmonics.

These complex modulation waveform FM signals may be compared to
sine waveform modulation FM to measure the Point of Subjective Equality
(PSE) for modulation width.

The Fundamental Extractor model predicts that the two modulation
waveforms will yield equal width sensations whenever the fundamental
component of the complex modulation waveform and the sine modulation
waveform have equal amplitudes (Δf_s = Δf_1). This relationship should
hold for all phase relationships between the complex modulation
components (φ) and for all levels of overall size. The RMS model
predicts equal width sensations whenever the RMS value of the
complex modulation waveform equals the RMS value of the sine modulation
wave. This prediction is also independent of phase and overall width.
The variation of the value of φ thus serves to test the spectral
models. If changes in the measured PSE are found as the value of φ
changes, the proposed spectral models may be rejected. Temporal
models will have to be found to explain the data.

[Figure 2. Complex waveforms.]

Method.
Subjects. Subjects were volunteers selected for their ability to
match pitches correctly and quickly in a Method of Adjustment screening
task. These subjects also had to be available for the extended length
of time necessary to collect psychophysical-type data. These subjects
were paid for their time. The author also scored well on the screening
task and participated as a subject. Three subjects provided the bulk
of the data. Three other subjects were tested using a subset of the par-
ameters. These other subjects served to verify the responses of the
three main subjects.
Procedure. The relative perceived frequency modulation widths produced
by the two component complex modulation waveforms were measured by
finding the PSE for modulation width between the complex waveform FM
and sine waveform FM. The sine modulation waveform has only one
Fourier component and always matched the modulation rate of the
fundamental component of the complex FM modulation waveform.

To measure the PSEs the amplitude of the sine waveform FM (Δf_s)
was kept constant while the amplitude of the complex waveform (Δf_c)
was varied using a simple single up-down staircase procedure. The
subject was asked to choose which of the two FM tones had the wider
modulation for each trial. If the complex FM was judged to be wider,
on the next trial the width of the complex FM (Δf_c) was decreased by
about 3%. If the standard, sine waveform FM was judged to be wider, on
the next trial the width of the complex FM was increased by about 3%.
The starting point of the staircase was chosen arbitrarily and the
staircase was terminated after the subject reversed the direction of
the staircase 26 times (usually about 50 trials). The values of
the first four reversal points were discarded and the arithmetic mean
of the last 22 was used as the estimate of PSE. Each subject
performed a minimum of three staircase runs at each parameter set
and the mean of these multiple PSEs was taken as the subject's final PSE.
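
The staircase rule described above can be sketched as a short simulation. The 3% step, the 26 reversals, and the discarding of the first four reversals follow the text. The simulated listener, its noise level, the starting width, and all names are hypothetical stand-ins for a real subject.

```python
import math
import random

def make_listener(true_pse, sigma=0.05, seed=0):
    """Hypothetical listener: judges the complex FM wider when a noisy
    internal estimate of its width exceeds the listener's true PSE."""
    rng = random.Random(seed)

    def complex_judged_wider(width):
        noisy_width = width * math.exp(rng.gauss(0.0, sigma))
        return noisy_width > true_pse

    return complex_judged_wider

def run_staircase(complex_judged_wider, start_width,
                  step=0.03, n_reversals=26, n_discard=4):
    """Single up-down staircase on the width of the complex FM.

    Returns (PSE estimate, number of trials, list of reversal widths).
    """
    width = start_width
    last_direction = 0          # +1 = width was increased, -1 = decreased
    reversals = []
    trials = 0
    while len(reversals) < n_reversals:
        trials += 1
        # Complex judged wider -> decrease its width; otherwise increase it.
        direction = -1 if complex_judged_wider(width) else 1
        if last_direction != 0 and direction != last_direction:
            reversals.append(width)   # the staircase changed direction here
        last_direction = direction
        width *= 1.0 + step * direction
    kept = reversals[n_discard:]      # drop the first four reversal points
    return sum(kept) / len(kept), trials, reversals

pse_estimate, n_trials, reversal_widths = run_staircase(
    make_listener(true_pse=5.0), start_width=8.0)
print(round(pse_estimate, 2), n_trials)
```

Because each response moves the width toward the point where the two judgments are equally likely, the mean of the later reversals settles near the simulated listener's 50% point.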
The width of the standard FM, Δf_s, and the waveform of the complex
FM as specified by the phase relationship between the two components,
φ, were the variables studied. PSEs were collected in random order
for each of the three values of the excursion of the standard,
Δf_s = ±2.5 Hz, ±5 Hz, and ±10 Hz, completely crossed with eight values
of the phase relationship in the complex FM, φ = 0, π/4, π/2, 3π/4, π,
5π/4, 3π/2, and 7π/4. This generated 24 PSE data points for each of
the three subjects. The three additional subjects were only given two
values of the standard excursion, Δf_s = ±2.5 Hz and ±10 Hz, crossed with
four values of the complex phase relationship, φ = 0, π/2, π, and 3π/2.
Stimuli. The complex modulation waveforms used are illustrated in
Figure 2. The instantaneous frequency produced by these modulation
waveforms is given by

$$f_c(t) = f_0 + \Delta f_1 \left[ \sin(2\pi f_m t + \theta) + \tfrac{1}{3} \sin(6\pi f_m t + \theta + \phi) \right].$$

Angle θ is a common phase angle chosen so that f_c(t = 0) = f_0 (as for
the sine wave).

The amplitude of the third harmonic component of this modulation
waveform was fixed at 1/3 the amplitude of the fundamental component.
This matches the amplitude of the third harmonic of a square wave. By
this choice, the third harmonic component is kept at a reasonable level
relative to the fundamental and maintains a visually noticeable effect
on the oscilloscope trace of the modulation waveform.

The instantaneous frequency produced by the sine modulation waveform
is given by the following

$$f_s(t) = f_0 + \Delta f_s \sin(2\pi f_m t).$$
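
The two instantaneous-frequency expressions can be written out as a short sketch. It assumes the common phase angle is added to both components as in the expression for the complex waveform, and it picks the solution of the zero-crossing condition that lies between -π/2 and π/2. The function names and the example parameter values are illustrative.

```python
import math

def common_phase(phi):
    """Angle theta satisfying sin(theta) + (1/3) sin(theta + phi) = 0.

    Expanding the condition gives
    sin(theta) * (1 + cos(phi)/3) + cos(theta) * sin(phi)/3 = 0,
    which atan2 solves directly (this branch choice is an assumption).
    """
    return math.atan2(-math.sin(phi) / 3.0, 1.0 + math.cos(phi) / 3.0)

def complex_fm_frequency(t, f0, delta_f1, fm, phi):
    """Instantaneous frequency of the two-component complex FM, in Hz."""
    theta = common_phase(phi)
    fundamental = math.sin(2.0 * math.pi * fm * t + theta)
    third = math.sin(6.0 * math.pi * fm * t + theta + phi) / 3.0
    return f0 + delta_f1 * (fundamental + third)

def sine_fm_frequency(t, f0, delta_fs, fm):
    """Instantaneous frequency of the sine waveform FM standard, in Hz."""
    return f0 + delta_fs * math.sin(2.0 * math.pi * fm * t)

phases = [k * math.pi / 4.0 for k in range(8)]   # the eight values of phi
starts = [complex_fm_frequency(0.0, 800.0, 5.0, 4.0, phi) for phi in phases]
print([round(f, 6) for f in starts])
```

Each complex waveform then starts at the carrier frequency, as the sine standard does.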

The modulation rate, f_m, was chosen to be 4 Hz. This rate was
frequently chosen by the researchers mentioned above. Their choice of
4 Hz was probably determined by data of Zwicker (1952) showing subjects
to be most sensitive to FM at that rate.

The carrier frequency was about 800 Hz, but was varied from trial
to trial within the range of 775 to 825 Hz. This served to decrease
any effects due to local variations in sensitivity along the basilar
membrane (Van den Brink, 1971, 1972).

Stimulus Production. The stimuli for all experiments were frequency
modulated sinewave signals. The sinewave carrier signal was produced
by a Wavetek Model 116 Voltage Controlled Oscillator (VCO). The
frequency of the carrier was varied by control voltages from a
microcomputer (Klein, Gable Edmunds, Eicher, & Hartmann, 1978). The
FM waveforms were calculated by the microcomputer and stored in memory.
The memory structure for those stored waveforms was a 256 sample,
16-bit signed integer vector holding values representing the
frequency deviations for a single cycle of the FM waveform. The
modulation rate, f_m, was determined by the rate at which the computer
stepped through the vector, resetting the VCO control voltage according
to the current element of the vector. For example, a 4 Hz FM could be
produced by repeatedly cycling through all 256 elements of the
vector once each 250 msec, presenting each element for 976 microseconds.
An 8 Hz FM would require a 488 microsecond sample time. The duration
of the FM signal was determined by the number of times the "playing"
of the vector was repeated. The amplitude of the signal was controlled
by a rectangular envelope produced by an Exponential Voltage
Controlled Amplifier (EVCA) also controlled by the microcomputer. The
calibration frequency (800 Hz) for the carrier was verified using a
frequency counter and the calibration amplitude (75 dB SPL) was
measured electronically using a Ballantine Model 320A voltmeter
measuring true RMS. Subjects listened to tones with headphones
(Beyer DT-48) while sitting in a soundproof room.
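
The wavetable arithmetic in this description can be checked with a small sketch. The 256-sample length and the 16-bit signed range come from the text. The full-scale mapping, the use of a sine cycle as the stored waveform, and the function names are assumptions, since the original microcomputer code is not reproduced here.

```python
import math

TABLE_LENGTH = 256    # samples per modulation cycle, as stated in the text
FULL_SCALE = 32767    # assumption: peak deviation maps to the largest 16-bit value

def build_sine_table(table_length=TABLE_LENGTH, full_scale=FULL_SCALE):
    """One cycle of a sine modulation waveform as 16-bit signed integers."""
    return [round(full_scale * math.sin(2.0 * math.pi * k / table_length))
            for k in range(table_length)]

def sample_period_us(rate_hz, table_length=TABLE_LENGTH):
    """Time each table element is held, in microseconds, for a given FM rate."""
    return 1e6 / (rate_hz * table_length)

def play(table, n_cycles):
    """Yield control values in order, repeating the table n_cycles times."""
    for _ in range(n_cycles):
        for value in table:
            yield value

table = build_sine_table()
period_4hz = sample_period_us(4.0)   # about 976.6 microseconds
period_8hz = sample_period_us(8.0)   # about 488.3 microseconds
two_second_tone = list(play(table, n_cycles=8))   # 8 cycles at 4 Hz last 2 seconds
print(int(period_4hz), int(period_8hz), len(two_second_tone))
```

The truncated sample times agree with the 976 and 488 microsecond figures quoted above.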

Trial Structure. A trial consisted of two 2-second long tones presented
in a Two-Interval Forced-Choice (2IFC) paradigm at 75 dB SPL and
separated by 250 msec. Both tones had the same carrier frequency and
modulation rate. One tone was always the standard sine waveform FM
tone with a fixed frequency excursion, Δf_s. The standard FM signal
was presented first with probability p = .5. The standard excursion,
Δf_s, remained constant within a single experimental run, but was
varied as a parameter.

A green warning light lasting 250 msec would indicate the start
of a trial. The two stimulus intervals would immediately follow and
a different color light was turned on simultaneously with each
interval to indicate which button to push to choose that interval.

When the stimuli were finished a red light came on to request a

0mm

 

 

 

 

17

N .

 

.udgu sauna. .n my»:

gonm

 

 

 

 

meant.
I. Tull...

1; 2m mmabll muo

moz<mo 30... 15> FIG...—

20

we...
Tlllm NIWWWIII mNIIIIITI w _I...
.1 o
68 wzo._.

 

 

 

 

 

mm

18
response. The next trial began 750 msec after a response was made,

see Figure 3. When the criteria for termination of an experimental run
was met all lights were turned on simultaneously to notify the subject.
Results-Discussion. The results of Experiment I are shown in Figures 4
through 8. The data shown in each graph represent values of Re
collected at each parameter set. Figure 4 shows the group results for
the 3 main subjects. The predictions from each of the 3 models are
also included: the peak detection model, the RMS model, and the
Fundamental Extractor model. The apparent phase dependence
of the RMS and Fundamental Extractor model predictions is discussed
below.

From the data shown there are two main dependencies, an effect
due to the size of the standard, Δf_s, and an effect due to the phase
angle between the components of the complex modulation waveform, φ.
In all instances the peak detection model has obviously failed again.
The other two models do not fail as completely, but still appear
inadequate.

[Figure 4. Re-values. Group data Experiment I. For this and all following figures □ represents Δf_s = 2.5 Hz, △ represents Δf_s = 5 Hz and ○ represents Δf_s = 10 Hz.]

[Figure 5. Re-values. Subject B Experiment I.]

[Figure 6. Re-values. Subject M Experiment I.]

[Figure 7. Re-values. Subject W Experiment I.]

[Figure 8. Re-values. Auxiliary subjects Experiment I. Panels: Subject C, Subject D, Subject E.]

In the discussion above regarding the two models, the RMS and
Fundamental Extractor, it was pointed out that there should be no
effect due to phase. The calculated predictions, however, show changes
with φ that must be explained. The nature of the Re-value is the cause
of this apparent discrepancy. The Re-value is calculated from the peak
excursion of the waveforms. When adding together the two components
of the complex waveform, if the sizes of those components are kept
constant and only the phase between them is changed, the peak of the
resulting waveform will change. This effect can be seen in Figure 2
showing all of the complex waveforms used. The amplitudes of the two
components remained constant while the phase was changed to calculate
and plot the various waveforms. For the predictions of the Fundamental
Extractor model, so long as the fundamental component has an amplitude
equal to the sinewave standard's amplitude the signals should produce
sensations of equal width. The predicted Re-values will vary with the
changes of the peak even while the fundamental remains constant. The
predicted values of the RMS model show the effect based on the same type
of analysis. Because the amplitudes of the two components remain constant
the RMS value for the complex modulation also remains constant.
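
The apparent phase dependence of the predicted Re-values can be reproduced with a brief sketch. It finds the peak of the two-component waveform numerically for each phase and scales it by the fundamental amplitude that each model requires for a match. The treatment of the common phase angle follows the stimulus equation as given in the Method. The function names and the sampling resolution are assumptions, and the numbers are not taken from the plotted curves.

```python
import math

def common_phase(phi):
    """Angle theta with sin(theta) + (1/3) sin(theta + phi) = 0 (assumed branch)."""
    return math.atan2(-math.sin(phi) / 3.0, 1.0 + math.cos(phi) / 3.0)

def peak_per_unit_fundamental(phi, n=20000):
    """Peak of sin(x + theta) + (1/3) sin(3x + theta + phi) over one cycle."""
    theta = common_phase(phi)
    peak = 0.0
    for k in range(n):
        x = 2.0 * math.pi * k / n
        value = math.sin(x + theta) + math.sin(3.0 * x + theta + phi) / 3.0
        peak = max(peak, abs(value))
    return peak

def predicted_re(phi, model):
    """Predicted Re = (peak of complex FM) / (peak of sine FM) at the match point."""
    if model == "fe":
        # Fundamental Extractor: fundamental equals the sine amplitude.
        fundamental_over_sine = 1.0
    elif model == "rms":
        # Equal RMS: delta_f1 * sqrt(1 + 1/9) equals delta_fs.
        fundamental_over_sine = 1.0 / math.sqrt(1.0 + 1.0 / 9.0)
    else:
        raise ValueError("model must be 'fe' or 'rms'")
    return peak_per_unit_fundamental(phi) * fundamental_over_sine

for k in range(8):
    phi = k * math.pi / 4.0
    print(round(phi, 3),
          round(predicted_re(phi, "fe"), 3),
          round(predicted_re(phi, "rms"), 3))
```

Both predictions change with φ only because the peak of the summed waveform changes; the component amplitudes required for a match stay fixed.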

The graphs of the data show that the measured Re-values roughly
follow the curves of both of the models. It is difficult, however, to
determine the extent of the deviations from the predicted values because
of the curve. To better illustrate the actual phase independence and
observe the deviations from the predicted values the predictions were
recalculated and the data replotted in Figures 9 through 13 using the
ratio of the amplitude of the fundamental component of the complex
modulation waveform, Δf_1, to the amplitude of the sine waveform, Δf_s.
Call this Qe where

$$Q_e(C/S) = \frac{\Delta f_1}{\Delta f_s}.$$

The predicted Qe-values for the two models are now constants.
The gross curve in the data has been eliminated. It is easier to
observe deviations from the predicted values. Two main effects are
still evident: the effect due to Δfs, and now an effect in which the
Qe-values for φ between 0 and π were lower than the Qe-values for φ
between π and 2π (i.e., φ = 0), an asymmetry centered at φ = π. This
effect showed up quite strongly in the group data. Data for individual
subjects showed the same effect to varied extents. To test this
observation statistically 95% confidence intervals were calculated

Figure 9. Qe-values. Group data (n = 3), Experiment I and II.
[Upper panel: Qe versus φ; lower panel: percent correct (%C) versus φ.]

Figure 10. Qe-values. Subject B, Experiment I and II.
[Upper panel: Qe versus φ, with the fe and rms predictions; lower panel:
percent correct (%C) versus φ.]

Figure 11. Qe-values. Subject M, Experiment I and II.
[Upper panel: Qe versus φ; lower panel: percent correct (%C) versus φ.]

Figure 12. Qe-values. Subject W, Experiment I and II.
[Upper panel: Qe versus φ, with the fe and rms predictions; lower panel:
percent correct (%C) versus φ.]

Figure 13. Qe-values. Auxiliary subjects, Experiment I.
[Panels: Subject C, Subject D, Subject E; Qe versus φ from 0 to 2π.]

for the difference between the mean of pooled data from φ = 5π/4, 3π/2,
and 7π/4 compared to pooled data from φ = π/4, π/2, and 3π/4 for each
subject. Table 2 shows these confidence intervals. These confidence
intervals are based on the 95th percentile points of the t-distribution.
Thus if a confidence interval does not include 0.0 then a hypothesis
that the observed difference equals 0.0 must be rejected. From these
confidence intervals it can be seen that Subject B only showed a
statistically significant asymmetry for data at Δfs = ±2.5 Hz. Both
Subjects M and W only showed significant asymmetry at Δfs = ±5 Hz.
Other asymmetries seen in the graphs are probably not significant due
to variance in the data. Subjects B and M do show the asymmetry when
data from all values of Δfs are pooled. The data pooled for all
subjects show significant asymmetry at Δfs = ±2.5 and ±5 Hz.
Again, it is the variance of the data at Δfs = ±10 Hz that probably
prevents statistical significance. This asymmetry is an effect that
may be predicted by the temporal interpretation of the Lag-Processor
model discarded by Hartmann (1978). Neither of the other two models
predicts this. It is this effect that provides the motivation for
Experiment IV.
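The decision rule used with these intervals can be illustrated as
follows. This is a sketch only: the text does not give the computational
details, so a pooled-variance two-sample interval is assumed, the
critical value of t is passed in by the caller, and the Qe-values in the
example are invented for illustration and are not data from this study.

```python
import math
import statistics

def pooled_t_interval(group_a, group_b, t_crit):
    """Confidence interval for mean(group_a) - mean(group_b).

    Assumes equal variances (pooled estimate); t_crit is the critical
    value of t for len(group_a) + len(group_b) - 2 degrees of freedom.
    """
    na, nb = len(group_a), len(group_b)
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    half_width = t_crit * math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return diff - half_width, diff + half_width

def excludes_zero(interval):
    """True when the interval does not contain 0.0 (difference significant)."""
    low, high = interval
    return low > 0.0 or high < 0.0

# Hypothetical Qe-values, NOT measurements from the thesis.
qe_late = [0.95, 0.97, 0.93, 0.96, 0.98, 0.94]    # phi = 5pi/4, 3pi/2, 7pi/4
qe_early = [0.90, 0.91, 0.88, 0.92, 0.89, 0.90]   # phi = pi/4, pi/2, 3pi/4

# 2.228 is the 97.5th percentile of t with 10 df (two-sided 95% interval);
# the 95th percentile for 10 df would be 1.812.
interval = pooled_t_interval(qe_late, qe_early, t_crit=2.228)
print(interval, excludes_zero(interval))
```

The helper excludes_zero applies the rule stated above: an interval that
does not contain 0.0 leads to rejection of the hypothesis of no
difference.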

The other major effect that is evident in both the Re and Qe
transformations of the data is the change toward higher values as
Δfs gets smaller. This is clearly visible in the data for Subjects
B and W. Table 3 shows 95% confidence intervals for the differences
between the data at each value of Δfs grouped across values of φ.
Positive differences here indicate an increase in the average
Qe-value as Δfs decreases. Only Subject M showed a nonsignificant
increase, and only between data at Δfs = ±2.5 and ±5 Hz. It should

Table 2. 95% confidence intervals for differences between pooled data
for φ > π and φ < π, by subject and Δfs.
[Table printed sideways in the original; entries illegible in this copy.]

Table 3. 95% confidence intervals for differences between
Qe-values across levels of Δfs.

           δ = Qe(Δfs = ±2.5) - Qe(Δfs = ±5)    δ = Qe(Δfs = ±5) - Qe(Δfs = ±10)
Subject
B                 .020 < δ < .122                     .038 < δ < .141
M                -.935 < δ < .045                     .018 < δ < .120
W                 .052 < δ < .107                     .019 < δ < .054
Group             .028 < δ < .076                     .040 < δ < .088

$$Q_e(\Delta f_s) = \frac{1}{8}\sum_{n=0}^{7} Q_e(n\pi/4,\ \Delta f_s)$$
also be noted that as the Qe-values increase they approach the predicted
values of the two main models. Neither of the two models predicts this
Δfs effect. Clearly simple versions of the RMS and Fundamental
Extractor models are inadequate. The larger the overall width of the
stimuli the less accurate these models become. This effect motivates
Experiments II and III.

At this point it seems wise to stop and summarize the preceding
results before examining their implications for the question of
spectral versus temporal processing.

1.) The fine structure of the modulation waveform as determined
    by φ has a significant effect on the Qe-values. For φ
    between π and 2π the Qe-values are increased.

1a.) The asymmetry appears to be largest at larger values of Δfs,
    but is not significant at Δfs = ±10 Hz due to increased
    variance of the data collected.

1b.) Purely spectral models cannot adequately explain this effect.
    This effect is not large, however, and some spectral model
    may serve to approximate the data.

2a.) The overall size of the waveforms as determined by Δfs has
    a significant effect on the Qe-values. As Δfs decreases the
    Qe-values increase.

2b.) In the limit of small Δfs the RMS and Fundamental Extractor
    models fit the data approximately.

Only a model taking phase into account can predict the detailed
phase-dependent structure of the data. Combination rules for spectral
components alone cannot handle these effects. This provides evidence
for a temporally-based view of this process. The fine structure of the
modulation waveform is processed and affects the perceived width of
the modulation. For a temporal model the rate, the magnitude, and the
form of the change all become possible parameters for determining the
perceived width. All of these parameters are affected by the changing
φ. As φ changes the peak excursion changes as described above, the
rate of change of frequency changes for different segments of the
waveform, and the form changes. All these changes are seen in
Figure 2. This confounding of independent parameters makes it impossible
to specify exactly what is the basis for the phase effect.

The Δfs dependence does not apply as clearly to the temporal
versus spectral processing question. Within the spectral model it is
clear that each component affecting the perceived width must be above
threshold for detection. As Δfs increases the size of the complex
modulation waveform increases to find a new PSE. This then has the
effect of increasing the absolute amplitude of both of the components of
the complex modulation waveform. As these components increase in size
it is quite reasonable that perhaps the third harmonic makes the
transition from below detection level to being well above threshold.
If this is the case, then at Δfs = ±2.5 Hz only the fundamental
component is available to contribute to the width judgement. The
Fundamental Extractor model predictions would match the data regardless
of whether the real process involved more than one harmonic or not.
The fact that the Qe-values do change as Δfs increases is evidence against
the Fundamental Extractor model, but does not eliminate some other
combination rule for the contribution of the modulation waveforms to
the perceived width.

For the other spectral model, RMS, it is clear that as the amplitude
of the third harmonic increases to exceed threshold the "perceived"
RMS value should more closely approach the actual RMS value of the
modulation waveform. The predicted RMS value is based on the actual
components of the modulation waveforms. The data show that the perceived
relative width deviates from the RMS model predictions as Δfs increases.
This indicates only that RMS is not the appropriate combination rule for
the components of the modulation waveforms. It is possible that some
other spectral type model may accurately predict these results.

The Δfs dependence may also be interpreted within the temporal
view of perception. It is reasonable to assume that the possible
temporal parameters alluded to above have some type of threshold below
which they cannot be processed. If this is the case, as Δfs decreases
some of these waveform characteristics may lose some of their ability
to affect the perceived modulation width. At large Δfs many parameters
determine the perceived width and at small Δfs the perceiver must rely
upon only those waveform characteristics still above threshold. If none
of the characteristics are detectable, no modulation would be perceived
at all.

Experiment II

 

In Experiment I the Δfs dependence pointed to some process that
took into account either the detectability of the third harmonic in
the spectral models (without specifying the component combination rule),
or the detectability of the microstructure of the waveforms as necessary
for the temporal models. A natural question to arise is "to what degree
are listeners aware of the factors influencing the difference in
perceived width between the complex and the sine modulation waveforms?"
Can the subjects distinguish between the two modulation waveforms to
the same degree that the perceived relative width differs from some
simple model for width perception? Does the discriminability between
the waveforms go down as Δfs decreases? Is there an asymmetry between
the discriminability of the waveforms for φ on either side of π?

Experiment II measures the discriminability of the waveforms used
in Experiment I. To accurately measure the degree to which the same
factor influencing width comparison affects discrimination it is
necessary to eliminate as many extraneous cues as possible. It is
necessary to equalize the stimuli on as many characteristics as
possible. The stimuli used were the same stimuli as were used for
Experiment I. That means the stimuli are at the same amplitude and the
same center frequency. In order to eliminate perceived width as an
identification cue the widths specified by the PSEs collected in
Experiment I were used. That meant that the only likely cue for
discrimination must be the third harmonic and/or its effect on the
structure of the modulation waveform.

A spectral view of the discrimination process involves simple
detection of the third harmonic component. If it is detected the
waveforms will be discriminated. There should be no phase effects in
the discrimination data. Any phase effect that might be found would
be evidence for rejecting the spectral view.

Method. The task for this experiment was to identify the complex
waveform modulation signal when presented two FM signals. One was always
sine waveform modulation and the other was complex waveform modulation.
The two signals were presented in a 2IFC paradigm where the order of
the two signals was randomized. The trial structure was identical to
that described above for Experiment I. The modulation excursions, Δfs
and Δfc, were chosen according to the group PSE for the three subjects
receiving the full stimulus set in Experiment I. The same three subjects
participated in this experiment. The group PSE was used so that more than
one subject could participate at a time.

Subjects were trained in this task in order to eliminate learning
effects. The data reported are from 100 trials collected after it was
determined that the subjects had reached asymptotic behavior. The relative
widths of the two modulation signals remained constant throughout the
trials according to the group PSE. Only the center frequency changed
from trial to trial as described above for Experiment I. The percent
of correct discriminations for each subject at each value of φ and Δfs
was collected.

Results-Discussion. The results of Experiment II are shown in the
bottom half of Figures 9 through 12. Figure 9b shows the group data.
There was a large effect due to the overall width, Δfs. As Δfs decreased
there was a decrease in the ability of the subjects to discriminate the
two modulation waveforms. The group data at Δfs = ±10 Hz also showed
an effect due to phase. Both subjects B and W showed this effect in
their own data while subject M did not. Subject M was clearly too good
at this task. Presumably, if Δfs had been decreased slightly subject
M would have shown this phase effect. Unlike the data of Experiment I
there is no asymmetry in the group phase effect.

In general, a comparison between the curves for Experiment I and
II shows loose similarities. When pairs of points at a given value of
the phase relationship are relatively close together or even reversed
in order in Experiment I the same points from Experiment II are also
relatively close together. Even some of the idiosyncratic changes in
direction along the curves in Experiment I are paralleled in Experiment
II. If it is the case that the detectability of the third harmonic is
constant across values of φ within a single Δfs level then those parallels
are evidence against the spectral view and in favor of a temporal model.
It does show that the same features influencing perceived width also
influence waveform discrimination.

The phase effect for the two subjects at Δfs = ±10 Hz provides
evidence that temporal aspects of the modulation waveform are being
processed as the subjects perform the discrimination. However, because
the stimuli of Experiment II were equalized for width it is not the case
that the amplitude of the third harmonic remained constant across values
of φ. This might mean that the apparent phase effect is due to changes
in the levels of the 12 Hz component rather than the changes in the
microstructure. Experiment III looks at this possible problem more
closely.

Experiment III

 

In the specification of the modulation waveforms used in Experiment
I it is evident that a single element distinguishes the standard sine
modulation waveform from the complex modulation waveform: the presence
of the third harmonic. Within the spectral view it is the presence or
absence of this additional component that initiates the process to
modify the perceived FM width. In the temporal view it is not the
detection of the third harmonic, but the detection of the changes induced
in the modulation waveform by the third harmonic that is important.
This distinction is quite subtle, but it is this fact that allows the
phase relationship between the two components of the complex modulation
waveform to have an effect in the temporal models, but not in the
spectral models.

Obviously, the detection level of the third harmonic should be
correlated with the detection level of the induced alterations in the
modulation waveform. It is not necessarily the case, however, that
the induced alterations are detectable if and only if the third harmonic
is above threshold. Specifically, the microstructure of the waveform
may affect perception even while the third harmonic is below threshold,
or the third harmonic may be well above threshold but its effect on the
microstructure may be such that it is difficult to distinguish the
presence of the alterations in the waveform. Applied to Experiment II, this
notion of independent thresholds may be easily tested. With specific
values for the detection threshold for the third harmonic component of
the modulation waveform predictions can be made regarding the discrimination
performance of each subject. Experiment III was the measurement of the
detection threshold for 12 Hz sine waveform frequency modulation. The
amplitude of the modulation waveform measured in Hertz at threshold was
the desired result. The threshold for 4 Hz FM was also measured for
completeness.

If the detection level of the third harmonic component of the
modulation waveform cannot accurately predict discrimination performance
spectral type models can be rejected as inadequate for modelling this
task.

Method. The paradigm and trial structure were identical to those described
for Experiment I except for the following modifications. The complex
waveform modulation signal was replaced by a constant frequency tone
having the same frequency as the center frequency of the sine waveform
modulation. The width of the sine waveform modulation varied in a transformed
up-down procedure (Levitt, 1970) that converged on the p = .707 point
of the psychometric function for detection of the FM. If the subject
identified the interval with the FM twice in a row the width of the FM
was decreased by about 1.5% on the next trial. If the subject failed to
identify the FM on any trial, the FM was increased in width by about 1.5%
on the next trial. The mean of the last 22 reversal points again served
as the estimate of the point of convergence for that experimental run.
For the purposes of this short experiment the FM width at which the FM
was detectable at the p = .707 level was considered to be the detection
threshold.

Subjects did a minimum of three experimental runs at both 4 Hz and
12 Hz. The mean of these multiple estimates served as the final estimate
for each subject. The standard error of those means provides a measure
of their variability.
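The tracking rule described above can be sketched as a short simulation.
This is an illustration of the two-down, one-up rule only, not the
software used in the experiment. The multiplicative 1.5% step, the number
of reversals collected per run, the starting width, and the simulated
listener (a Weibull psychometric function with made-up parameters) are
all assumptions.

```python
import math
import random

def run_staircase(p_correct, start_width, step=0.015,
                  n_reversals=30, n_keep=22, rng=None):
    """Two-down, one-up track; returns the mean of the last n_keep reversals.

    p_correct(width) gives the probability of a correct response at a
    given FM width. Two correct responses in a row lower the width by
    `step` (as a proportion); any incorrect response raises it.
    """
    rng = rng or random.Random(0)
    width = start_width
    correct_in_row = 0
    last_direction = 0          # -1 = last change was down, +1 = up, 0 = none yet
    reversals = []
    while len(reversals) < n_reversals:
        if rng.random() < p_correct(width):
            correct_in_row += 1
            if correct_in_row < 2:
                continue        # one correct response alone changes nothing
            correct_in_row = 0
            direction = -1
        else:
            correct_in_row = 0
            direction = +1
        if last_direction != 0 and direction != last_direction:
            reversals.append(width)     # the track changed direction here
        last_direction = direction
        width *= (1.0 + step) if direction > 0 else (1.0 - step)
    return sum(reversals[-n_keep:]) / n_keep

def weibull_2ifc(width, alpha=1.5, beta=3.0):
    """Hypothetical listener: 50% correct at zero width, rising toward 100%."""
    return 0.5 + 0.5 * (1.0 - math.exp(-((width / alpha) ** beta)))

print(run_staircase(weibull_2ifc, start_width=3.0))
```

Because the width only decreases after two successive correct responses,
the track settles where the probability of two correct responses in a row
is one half, that is, where p squared = .5 or p = .707 (Levitt, 1970).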
Results. The FM detection levels for each of the three subjects are
shown in Table 4. The 12 Hz data are also shown plotted as straight
lines in Figure 14 relative to the actual size of the third harmonic
for each parameter pair used in Experiment II. As shown, subjects M
and W had relatively similar thresholds. Both were lower than the
threshold for subject B and between the third harmonic levels for
Δfs = ±2.5 Hz and Δfs = ±5 Hz. Subject W had the lowest threshold and
Subject B the highest, almost equalling the level of the third harmonic
for Δfs = ±10 Hz.

Discussion. With the combined data of Experiments II and III it is
possible to look at the question of whether the waveform microstructure
has effects independent of the threshold of the third harmonic. This is
a direct test of spectral type models. If the level of the third harmonic
component relative to the individual's threshold predicts the
discrimination performance it will not be possible to reject the
spectral type

Figure 14. Amplitude of 12 Hz component relative to thresholds.
[Panels: Subject B, Subject M, Subject W; level versus φ from 0 to 2π.]

Table 4. FM detection levels in Hertz for 4 and 12 Hz modulation rates.

Modulation rate    Subject    Mean     Std. Error

4 Hz               B          1.846    .165
                   M          1.831    .094
                   W           .788    .021
                   MEAN       1.488

12 Hz              B          2.416    .193
                   M          1.244    .184
                   W           .926    .165
                   MEAN       1.529
models. If the discrimination results cannot be predicted spectral
type models can be rejected.

From Figure 14 we see, then, that subject W should be able
to perform the discrimination task with virtually no errors for
Δfs = ±5 and ±10 Hz. He should perform at the chance level for
Δfs = ±2.5 Hz. Subject M should perform with no errors at Δfs = ±10 Hz,
only slightly better than chance at Δfs = ±5 Hz and at chance for
Δfs = ±2.5 Hz. Subject B should perform only slightly better than chance
for Δfs = ±10 Hz and at chance for both Δfs = ±5 Hz and Δfs = ±2.5 Hz.
For subject W (Figure 12) the order of the results matches the
predictions but the fine details are unaccounted for. For Δfs = ±2.5 Hz,
the discrimination data show essentially chance performance. It might be
noted that the extremely small variations in the level of the third
harmonic are paralleled in the discrimination data. This seems to be a
fortuitous effect attributable solely to chance. The level of
performance is so low that most variations must be due totally to random
error. At Δfs = ±5 Hz Subject W did not perform without error as
expected. The variations in performance were large enough to rule out
the possibility of being completely due to random error and were
unmatched by the fine structure of the actual levels of the third
harmonic. At Δfs = ±10 Hz the performance was nearly errorless with a
large decrease as φ approaches π. It might be argued that this is due to
changes in the size of the third harmonic, but at φ = π the third
harmonic is at its largest and certainly this should not predict a
decrease in performance.
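The size of the variation expected from random error alone can be
gauged with the binomial standard deviation. The sketch below assumes
that each plotted point is based on 100 independent trials; the text
reports 100 trials but does not state this breakdown explicitly, so that
figure is used here only for illustration.

```python
import math

def binomial_sd_percent(p, n_trials):
    """Standard deviation of percent correct over n_trials independent trials."""
    return 100.0 * math.sqrt(p * (1.0 - p) / n_trials)

# Expected scatter at chance (p = .5) and near errorless performance (p = .95).
for p in (0.5, 0.95):
    print(f"p = {p:.2f}   SD of percent correct = {binomial_sd_percent(p, 100):.1f}")
```

Under that assumption a listener at chance would be expected to scatter
by about 5 percentage points around 50% from sampling error alone, which
gives a rough yardstick for judging the variations discussed above.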

Subject M performed much as predicted (see Figure 11). At
Δfs = ±2.5 Hz the data are at chance level. At Δfs = ±5 Hz the data are
much more variable and just slightly above chance. At Δfs = ±10 Hz
performance is virtually errorless.

Subject B (Figure 10) also performed nearly as expected. For
Δfs = ±2.5 and ±5 Hz the data are very close to chance level. For
Δfs = ±5 Hz, however, at φ approaching 0 there seemed to be a consistent
increase in performance. This is difficult to explain with the third
harmonic so far below threshold for these stimuli. At Δfs = ±10 Hz there
is again shown an unexplainable drop in performance when the third
harmonic is at its highest amplitude.

Clearly, at least for subjects B and W there is more being processed
than the amplitude of the third harmonic. The temporal structure of the
modulation waveform must be affecting the discrimination process. The
spectral-type models can be rejected for this task.

Experiment IV

 

As described in Experiments I, II, and III there is a significant
dependence of modulation perception on the phase relationship between
the two components of the complex modulation waveform. Experiment I
showed an asymmetric phase effect about φ = π. Experiments II and III
showed a phase effect that is symmetric. These effects cannot be
explained by spectral-type models. Experiment IV is an attempt to test
a specific temporal model, the 1-pole Lag-Processor model.

An informal visual analysis of the waveforms (Figure 2) might lead
one to guess that if a phase effect is going to be found it would be
symmetric about either φ = π or φ = 0, the "extremes" of the waveforms.
The waveforms at intermediate values of φ seem to form symmetric
progressions from one extreme to the other. As a consequence of changing
φ the relationship between the peaks of the third harmonic component
and the peaks of the fundamental component of the complex modulation
changes. The variability in the location of the irregularity in the
summed modulation waveform is due to that changing relationship. As φ
approaches π this irregularity seems to move further away from the
"global maximum", occurring progressively earlier. For φ greater than π,
the irregularity is on the trailing side of the global maximum and
seems to move closer to the peak as φ increases. The actual value for
the global maximum also changes with φ, but this change is symmetric
about φ = π and would not be expected to have an asymmetric effect.

The Lag-Processor model, which Hartmann (1978) discarded, does
predict an asymmetric phase effect and so provides a place to begin
further investigation. By this model, the ear takes a weighted average
of all frequency deviations occurring during a moving "time window" of
constant "length". The frequency deviation occurring at the leading
edge of that window (i.e., the present instant) is given a maximum
weighting. All deviations occurring earlier in time are given
progressively less weight, an exponentially decaying function of time.
All deviations occurring at the same relative time prior to the leading
edge of the moving window are given the same weight. The sequence with
the largest maximum in its moving average would be perceived as having
the widest frequency deviation.

This model predicts that a sequence of frequency deviations with a
small peak preceding a large peak will yield a higher maximum in its
moving average than a sequence of frequency deviations where the order
of those peaks is reversed. The result depends, however, on the length
of the time window. When the pattern of weights decreases to zero too
quickly only one of the peaks may be captured within the window. In
this case, since the size of the large peak is the same for both
sequences, the maximum moving average would also be the same. As the
length of the time constant increases to include more of the second
peak the difference between maximum moving averages will increase. The
difference will increase to some maximum and then begin to decrease as
the time window continues to lengthen. The difference will decrease as
the two peaks become more similar due to the time constant increase.
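The moving average described above can be sketched as a first-order
recursive filter, which weights past samples by a decaying exponential.
The sketch is illustrative only. It assumes a modulation waveform of the
form sin(x) + a3 sin(3x + φ) with a hypothetical relative third-harmonic
amplitude a3 = 0.5, it averages the signed frequency deviation, and the
function names and step size are invented for the example.

```python
import math

def modulation(t, fm, phi, a3=0.5):
    """Assumed waveform: unit fundamental plus third harmonic at phase phi."""
    x = 2.0 * math.pi * fm * t
    return math.sin(x) + a3 * math.sin(3.0 * x + phi)

def max_lagged_output(fm, phi, tau, dt=1e-4, n_cycles=6):
    """Largest value of the exponentially weighted running average.

    The average is run for n_cycles modulation cycles so that the start-up
    transient dies away; the maximum is taken over the final cycle only.
    """
    alpha = 1.0 - math.exp(-dt / tau)     # weight given to the newest sample
    period = 1.0 / fm
    n_total = int(round(n_cycles * period / dt))
    n_last = int(round(period / dt))
    y = 0.0
    best = -math.inf
    for k in range(n_total):
        y += alpha * (modulation(k * dt, fm, phi) - y)
        if k >= n_total - n_last:
            best = max(best, y)
    return best

# Compare the mirror-image pair phi = pi/4 and phi = 7pi/4 at three rates.
for fm in (2.0, 4.0, 8.0):
    early = max_lagged_output(fm, math.pi / 4.0, tau=0.040)
    late = max_lagged_output(fm, 7.0 * math.pi / 4.0, tau=0.040)
    print(f"fm = {fm:g} Hz   max(phi = pi/4) = {early:.3f}   "
          f"max(phi = 7pi/4) = {late:.3f}")
```

With a very short time constant the two maxima coincide, since the
waveform peaks themselves are equal for the mirror-image pair; with a
longer time constant they separate. How large the separation is at each
rate depends on the assumed a3 and waveform convention.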
Figure 15 shows the relationship between an exponentially decaying
time window and the complex waveform for φ = π/4 and φ = 7π/4 for
fm = 2 Hz, 4 Hz, and 8 Hz. The time constant is based on data collected
by Kay and Matthews (1972).

Kay and Matthews (1972) collected data showing thresholds for
detection of FM at various modulation rates. The response curve
matched what might be expected if FM was processed through some type of
low-pass filter. The thresholds for slow modulation rates were quite
low. Those thresholds increased as the modulation rate increased.
Those data can then be used to determine the cut-off frequency of the
low-pass filter. From that cut-off frequency the time constant of the
1-pole Lag-Processor can be determined. We estimate that time constant
to be 40 msec.
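For reference, the usual relation between the time constant and the
cut-off frequency of a first-order low-pass filter is given below. The
cut-off value shown is only what a 40 msec time constant implies under
that relation; it is not a figure quoted from Kay and Matthews (1972).

```latex
\tau = \frac{1}{2\pi f_c}
\qquad\Longrightarrow\qquad
f_c = \frac{1}{2\pi\tau} = \frac{1}{2\pi\,(0.040\ \mathrm{s})} \approx 4.0\ \mathrm{Hz}
```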

The time window in the figures is drawn to give maximum weight to
the leading peak. For φ = π/4 this is near the point at which the maximum
average is achieved. Because of the increasing time separation between
the peaks as fm goes from 8 Hz to 2 Hz, the preceding peak is given
progressively less weight. By informal visual analysis of these graphs
it appears that the largest differences between the maximum moving averages
will be produced by fm = 8 Hz. At fm = 2 Hz the computed average is
obtained almost entirely from a single peak. The mechanism will obtain
its maximum output when the leading edge is at or near the maximum
peak for both waveforms. This should yield a relatively small

Figure 15. Exponential windows superimposed on modulation waveforms.
difference in perceived width between the otherwise symmetric
waveform pair.

The Qe-value provides a measure of relative width. The Qe-value is
computed as a ratio of actual frequency deviations at PSE for width. A
smaller Qe-value, therefore, indicates that a particular modulation
waveform is perceived as relatively wider than a sine-based FM with
an equal actual frequency excursion. Since all complex FM waveforms
were compared to equal-width-sine-waveform modulation, Qe-values may be
compared across phase angles. The asymmetry of Experiment I can,
therefore, be roughly explained by this application of the Lag-Processor
model.

This relationship between the parameters of the peaks and the length
of the time constant may be investigated by manipulating any of the
following variables: the length of the time constant, the time
separation between the peaks, the size of the peaks, the order of the
peaks, and the pattern of the weights within the time window. Changing
any one or any combination of these variables will affect the size of
the moving average derived from the time window. In the context of
the human auditory system, however, we assume that the time constant
and the pattern of weights within the time window are fixed.

Experiment I varied the order and relative sizes of the peaks of
the waveform. To further test this model Experiment IV varies the time
separation between the peaks of the waveform relative to the fixed
length of the time window of the auditory system postulated by the
model. This is accomplished by varying the modulation rate, fm.

This provides a test of a temporal-type model. If the time constant
derived from the Kay and Matthews (1972) data is correct for these
subjects, the Lag-Processor model predicts that the differences
between the Qe-values obtained at each waveform group should be relatively
larger for fm = 8 Hz than for fm = 2 Hz. Qe-values for φ > π should
continue to be larger than for φ < π.

Because of the decaying weight pattern it is the case that the
effect of the smaller peak on the moving average will decrease rapidly
as the small peak decreases in size. This leads to the additional
prediction that the difference between the Qe-values at φ = π/4 and
φ = 7π/4 should be larger than for φ = 3π/4 and 5π/4, at fm = 8 Hz.
This effect is not expected for fm = 2 Hz because the time constant
should be too short relative to the structure of the modulation
waveforms. This is the same reason for the predicted differences between
8 and 2 Hz FM above.

Method. The method for this experiment is identical to Experiment I
with the following parameter changes:
1.) Two values of fm were used, fm = 2 Hz and 8 Hz.
2.) Only one value of Δfs was used, Δfs = ±5 Hz.
3.) Only 6 different phase angles were used:
    φ = π/4, π/2, 3π/4, 5π/4, 3π/2, and 7π/4.
4.) Only subjects B and M participated.

Results - Discussion. The measured Qe-values are shown in Figure 16.
It is quite obvious that the expected asymmetries did not occur. Table 5
shows the 95% confidence intervals for the difference between the
data at large values of φ versus small φ values. The Δfs = ±5 Hz data
of Experiment I are also included for comparison. Neither subject
showed a significant asymmetry for fm = 8 Hz, while Subject B showed
an asymmetry for fm = 2 Hz. The predictions were not matched by the

Figure 16. Qe-values. Experiment IV data.
[Panels: Subject B, Subject M; Qe versus φ at π/4, π/2, 3π/4, 5π/4, 3π/2
and 7π/4, with separate symbols for fm = 2 Hz and fm = 8 Hz.]

Table 5. 95% confidence intervals for differences in Qe-values
for φ < π versus φ > π.

                          Modulation rate:
              2 Hz                 4 Hz                 8 Hz
Subject
B        .021 < δ < .115     -.010 < δ < .060     -.071 < δ < .073
M       -.035 < δ < .139      .037 < δ < .113     -.022 < δ < .062

δ = Qe(φ > π) - Qe(φ < π)
data. It must be noted, however, that those predictions depended
heavily upon the time constant obtained from the Kay and Matthews (1972)
data. It might be the case that the individual variations in that time
constant are so great as to yield the asymmetries that were seen. For
Subject B, then, the time constant might be very long. This would shift
the asymmetry effect down to a much lower modulation rate. For the
higher modulation rates no asymmetries would appear because the entire
modulation waveform may be too close to the leading edge of the time
window. For Subject M the same might also be true, but with the effect
centered around a slightly shorter time constant. If this interpretation
is correct this procedure provides a way of discovering an individual's
time constant. This conclusion must be verified by some other method of
estimating individual time constants. If this cannot be verified these
data serve as evidence against this specific temporal model. Some
alternative must be devised.

The original time constant was generated from FM detection
threshold data. A small portion of the necessary data has already
been collected for these subjects in Experiment III. With only 2
modulation rate detection levels collected for each of the two subjects
no quantitative measurements can be given. It is possible to work
"backwards" from the data of Experiment IV and predict that Subject B
should have a higher threshold for fm = 4 Hz than Subject M and a much
higher threshold at fm = 12 Hz. This is verified by the data. Subject M
should not have a lower threshold at fm = 12 Hz than at fm = 4 Hz, however.

The second prediction must now be modified in light of the above
discussion. If the time constant variations cannot be further verified
the prediction of smaller differences at φ = 3π/4 and 5π/4 has no basis.
If the model were found to hold up these predictions would also be
adjusted by the individual subjects' time constants. For Subjects B and M
this means that more than one modulation cycle will be captured within
the window for fm = 8 Hz. Without a specific measure of the time constant
it is impossible to predict the differences across phase angles.

It might be noted from Figure 16 that at fm = 2 Hz for Subject B
the difference between Qe-values at φ = π/4 and 7π/4 is small compared
to the difference at φ = 3π/4 and 5π/4. At fm = 8 Hz, Qe for φ = 7π/4
may be larger than at φ = π/4, but Qe at φ = 5π/4 is actually considerably
less than for φ = 3π/4. These differences are not predicted at all by the
original time constant. For Subject M the differences between Qe-values
at φ = π/4 and 7π/4 are quite small, while at φ = 3π/4 and 5π/4 the
differences are relatively large. This is the case for both modulation
rates. This also does not agree with the predictions based on the
original time constant.

The 1-pole Lag-Processor model appears to be inadequate to explain
all of the structure in the data. More detailed measurements of
individual time constants may provide better parameters to help this
model, but those adjustments will probably not be enough.
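
The way a lag with a time constant can produce a phase asymmetry may be
illustrated with a minimal sketch. It rests on several assumptions that
are not taken from the thesis: the Lag-Processor is treated as a
first-order low-pass filter, perceived width is taken to be the
peak-to-peak excursion of the filter output, and the third-harmonic
amplitude a3 and the time constants tried are purely illustrative values.
The function names are hypothetical.

```python
import math


def lag_response(freq_hz, tau_s):
    """Steady-state gain and phase lag (radians) of a first-order low-pass filter."""
    w_tau = 2.0 * math.pi * freq_hz * tau_s
    gain = 1.0 / math.sqrt(1.0 + w_tau * w_tau)
    lag = math.atan(w_tau)
    return gain, lag


def lagged_width(fm_hz, phi, tau_s, a3=0.3, n=4000):
    """Peak-to-peak excursion of the filtered waveform over one fundamental cycle.

    The input is sin(theta) + a3 * sin(3 * theta + phi); a3 is an assumed
    illustrative amplitude, not a value from the experiments.
    """
    g1, lag1 = lag_response(fm_hz, tau_s)
    g3, lag3 = lag_response(3.0 * fm_hz, tau_s)
    lo, hi = math.inf, -math.inf
    for i in range(n):
        theta = 2.0 * math.pi * i / n
        y = g1 * math.sin(theta - lag1) + a3 * g3 * math.sin(3.0 * theta + phi - lag3)
        lo = min(lo, y)
        hi = max(hi, y)
    return hi - lo


def phase_asymmetry(fm_hz, phi, tau_s):
    """Width at phase 2*pi - phi minus width at phase phi (phi below pi)."""
    return lagged_width(fm_hz, 2.0 * math.pi - phi, tau_s) - lagged_width(fm_hz, phi, tau_s)


if __name__ == "__main__":
    # Assumed time constants in seconds; 0.0 means no lag at all.
    for tau in (0.0, 0.05, 0.2):
        for fm in (2.0, 8.0):
            d = phase_asymmetry(fm, math.pi / 4.0, tau)
            print(f"tau={tau:.2f} s  fm={fm:.0f} Hz  asymmetry={d:+.4f}")
```

With no lag the two mirror-image phases give the same excursion, so any
asymmetry in this sketch comes entirely from the different phase lags
that the filter imposes on the fundamental and the third harmonic.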

General Discussion and Conclusions. This series of experiments has
been an attempt to discover the character of the processing underlying
our perception of frequency modulated auditory signals. Given the early
findings that the human auditory system does not follow the frequency
deviations strictly, some process other than simple frequency following
must be utilized.

There are two approaches to modelling this perceptual process. The
first is labeled the spectral view, in which some combination rule for the
independently detected Fourier components of the modulation waveform
determines the perceived modulation width. This takes place without
processing information regarding the phase relationships between the
components. The other view is a catch-all category including all models
that do not rely upon the Fourier analysis of the waveforms. This
group of models makes use of the actual time-varying microstructure of
the modulation waveforms and is called the temporal view.
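
The distinction between the two views can be made concrete with a short
sketch. The combination rule used here (root-sum-square of the detected
harmonic amplitudes) and the amplitude a3 are assumptions chosen only
for illustration; the thesis does not commit to a particular rule, and
the function names are hypothetical. The point is that any rule built
from component magnitudes alone returns the same value at every phase,
whereas a measure taken on the waveform itself does not.

```python
import cmath
import math


def waveform(phi, a3=0.3, n=4096):
    """One cycle of sin(theta) + a3 * sin(3 * theta + phi); a3 is an assumed value."""
    return [
        math.sin(2.0 * math.pi * i / n) + a3 * math.sin(6.0 * math.pi * i / n + phi)
        for i in range(n)
    ]


def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic from a discrete Fourier projection."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
    return 2.0 * abs(acc) / n


def spectral_width(samples):
    """Hypothetical phase-blind rule: root-sum-square of harmonics 1 and 3."""
    return math.hypot(harmonic_amplitude(samples, 1), harmonic_amplitude(samples, 3))


def temporal_width(samples):
    """Peak-to-peak excursion of the waveform itself."""
    return max(samples) - min(samples)


if __name__ == "__main__":
    for phi in (0.0, math.pi / 2.0, math.pi):
        w = waveform(phi)
        print(f"phi={phi:.3f}  spectral={spectral_width(w):.4f}  temporal={temporal_width(w):.4f}")
```

A phase effect in the width data therefore counts against every model of
the first kind, whatever its combination rule, which is the logic used
in rejecting the spectral view.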

Recent work has outlined a technique for obtaining a usable measure
of perceived modulation width. Use of this method has been accompanied
by tests, primarily of spectral models, for explaining the acquired data.
A major problem has been the use of spectrally complex modulation waveforms.
The presence of many spectral components makes interpretation of results
very difficult. This study has avoided that problem by using only a
spectrally simple modulation waveform.

Experiment I was a measurement of the effect of phase angle and
overall amplitude on the perceived width of complex waveform FM. A
significant asymmetric phase effect was found, providing evidence against
the spectral-type model. This finding was further investigated in
Experiment IV. The second finding was a dependence of relative width
on overall excursion of the FM waveform. This could be interpreted
both spectrally and temporally. Experiment II was designed to investigate
this effect further.

Experiment II measured the discriminability of the various complex
FM waveforms from sine-waveform FM. The waveforms used covered the same
parameters as were used in Experiment I. It was found that discrimination
performance improved as overall modulation excursion increased, but that
it varied with the phase relationship between the complex waveform
components. These effects indicated that the same process mediating
FM width perception also played a part in FM waveform discrimination.
It was unclear, however, whether or not the phase effect was due
to the variations in the size of the third harmonic necessary to equalize
the FM signals for equal width. It was determined that the relevant
factor would be the deviation of the third harmonic amplitude relative
to the subjects' threshold for 12 Hz FM detection.

Experiment III was the measurement of the individual subjects' FM
threshold for the third harmonic modulation rate. Application of these
data to the data of Experiment II showed that the deviations from
threshold of the third harmonic component explained the large, overall
size effect, but were diametrically opposed to the changes in
discrimination performance associated with phase angle. This led to the
further rejection of the spectral-type model of FM width perception and
waveform identification.

Experiment IV was a test of a 1-pole Lag-Processor temporal model
for explaining the asymmetry of the phase effect in Experiment I.
Using a time constant drawn from data of Kay and Matthews (1972) as a
parameter, the model failed to predict asymmetries in width perception
for two values of modulation rate. Alterations of the time constant
parameter would correctly explain the asymmetries, but could not deal
with the fine structure of the data. This particular temporal model was
rejected.

In summary, spectral-type models of FM width perception and
waveform discrimination were completely rejected. The Lag-Processor, a
specific temporal-type model, was found inadequate to explain the
asymmetric phase effect in the width perception data.

LIST OF REFERENCES

Blakemore, C. & Campbell, F.W. On the existence of neurones in the
human visual system selectively sensitive to the orientation and
size of retinal images. Journal of Physiology, 1969, 203, 237-260.

 

Van Den Brink, G. Experiments on binaural diplacusis and tone perception.
In R. Plomp & G. Smoorenburg (Eds.), Frequency analysis and
periodicity detection in hearing. Leiden: A. W. Sijthoff, 1971.

 

Van Den Brink, G. The influence of fatigue upon the pitch of pure tones
and complex sounds. Presented at Symposium on Hearing Theory,
Eindhoven, Holland, 1972.

Deutsch, D. Octave generalization and tone recognition. Perception and
Psychophysics, 1972, 11, 411-412.

 

Divenyi, P.L. & Hirsh, I.J. Pitch changes in trills and vibrato.
Journal of the Acoustical Society of America, 1972, 51, A138.

Dowling, W.J. The perception of interleaved melodies. Cognitive
Psychology, 1973, 5, 322-337.

 

Graham, N. & Nachmias, J. Detection of grating patterns containing two
spatial frequencies: a comparison of single-channel and multiple-
channels models. Vision Research, 1971, 11, 251-259.

 

Hartmann, W.M. Five experiments on frequency modulation width perception.
Journal of the Acoustical Society of America, 1977, 61, S50.

Hartmann, W.M. Perception of frequency modulation width. Unpublished
manuscript, Michigan State University, 1978.

Hartmann, W.M. & Long, K.A. Time dependence of pitch perception-
vibrato experiment. Journal of the Acoustical Society of America,
1976,.91, SSO.

Kay, R.H. & Matthews, D.R. On the existence in human auditory pathways
of channels selectively tuned to the modulation present in
frequency-modulated tones. Journal of Physiology, 1972, 225,
657-677.

Klein, M.A., Gable, G.A., Edmends, D.L., Eicher, D.A., & Hartmann, W.M.
Microcomputer control for psychoacoustic experiments. Journal of
the Acoustical Society of America, 1978, ﬁg, 863.


Kock, W.E. Certain subjective phenomena accompanying a frequency
vibrato. Journal of the Acoustical Society of America, 1936,
8, 23-25.

Lewis, D., Cowan, M., & Fairbanks, G. Pitch and frequency modulation.
Journal of Experimental Psychology, 1940,.31, 23-36.

Nabelek, I.V., Nabelek, A.D., & Hirsh, I.J. Pitch of sound bursts with
continuous or discontinuous change of frequency. Journal of the
Acoustical Society of America, 1973, 53, 1305-1312.

 

Seashore, C.E. (Ed.) The vibrato. University of Iowa Studies in the
Psychology of Music (Vol. 1). Iowa City: The University Press,
1932.

Seashore, C.E. The psychology of the vibrato. University of Iowa
Studies in the Psychology of Music (Vol. 3). Iowa City:
The University Press, 1936.

Seashore, H.G. The hearing of pitch and intensity in vibrato.
In C.E. Seashore (Ed.) University of Iowa Studies in the Psychology
of Music (Vol. 1). Iowa City: The University Press, 1932.

Sundberg, J. Vibrato and vowel identification. Archives of Acoustics,
1977, 2, 257-266.

 

Sundberg, J. Effects of the vibrato and the 'singing formant' on pitch.
Proceedings of Musicologica Slovaca, VI, Bratislava, 1978.

Zwicker, E. Die grenzen der horbarkeit der amplitudenmodulation und der
frequenzmodulation eines tones. Acustica, 1952, 2, AB125-AB133.

Zwicker, E. Temporal effects in psychoacoustical excitation. In
A.R. Moller (Ed.) Basic Mechanisms in Hearing. New York:
Academic Press, 1975.