Octaves' Perception as a Window into Hearing Physiology
PREFACE
As a practicing ENT physician,
I have yet to see anyone using a hearing aid who is as satisfied with the
device as eyeglass users are with theirs. I believe the reason for
this is that our knowledge of hearing physiology still has some
shortcomings, and that is why I undertook this study. I have tried to make this
writing as simple as possible for non-professionals to understand,
and for that reason I have included some basic explanations which might sound boring
to the professionals. I have also enclosed the texts of my references for
anyone who might want to study them.
Farhang Bakhtiar M.D., Nov. 2006
Sounds are produced by the vibration
of objects, and each simple sound has a frequency which determines its pitch.
Compound sounds are made up of the superimposition of several simple sounds,
each with its own specific frequency, and what the ear hears is the mixture
of all of them. When these frequencies are combined haphazardly they
are called noises, and are perceived by the ear as such. Musical tones, on the
other hand, are made up of combinations of frequencies which have integer
ratios to the lowest frequency among them. The one with the
lowest frequency is called the Fundamental Frequency, and those with higher
frequencies are called the Overtones or Harmonics. The Fundamental Frequency is what
is registered as the Pitch of a compound sound in our auditory centers. An
octave of a sound is another sound whose frequency is twice that of the
first one. Musical tones, seemingly in all cultures and throughout
history, are expressed by notes which occupy different positions
along one continuum of frequencies, and the same order is then repeated
in the first octave and in the octaves after that. This is because our ear
perceives an octave of a sound as somehow the same as the original sound,
and there are indications that this mechanism applies to some species
other than humans as well (reference 1). If a complex sound
with its Fundamental Frequency filtered out (removed) is presented to the human
ear, the ear still perceives that frequency as if it were actually
present. This phenomenon is so well known that the loudspeaker industry
exploits it to give the impression of a low frequency sound when it is
technically hard to actually produce that frequency (i.e. producing a low
frequency tone with a small loudspeaker).
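The missing fundamental can be checked numerically. The following is only a minimal sketch, and its numbers are assumptions chosen for illustration: a tone with a 200/second fundamental from which the fundamental has been removed, leaving the 400, 600 and 800/second harmonics at equal strength. The function names are hypothetical. It tests whether the mixture repeats itself exactly after 1/200 of a second (the period of the absent fundamental) and after 1/400 of a second (the period of the lowest component actually present).

```python
import math

def complex_tone(t, freqs):
    """Sum of equal-amplitude sine waves at the given frequencies (per second)."""
    return sum(math.sin(2 * math.pi * f * t) for f in freqs)

def repeats_with_period(freqs, period, samples=480, step=1 / 48000, tol=1e-9):
    """True if the waveform has the same value at t and t + period for every sampled t."""
    return all(
        abs(complex_tone(k * step, freqs) - complex_tone(k * step + period, freqs)) < tol
        for k in range(samples)
    )

# Assumed example: harmonics 2, 3 and 4 of a 200/second fundamental,
# with the fundamental itself removed.
harmonics = [400, 600, 800]

print(repeats_with_period(harmonics, 1 / 200))  # True: period of the missing fundamental
print(repeats_with_period(harmonics, 1 / 400))  # False: period of the lowest component present
```

The waveform keeps the rhythm of the 200/second fundamental even though no energy is present at that frequency.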
Before getting into my hypothesis,
a brief review of the anatomy and physiology of the ear is in order:
Basically our ear is made up of 3 parts
(references 2 and 13):
1- The OUTER EAR
consists of the auricle and the ear canal. The auricle is responsible for
collecting the sound vibrations and directing them into the ear canal.
The ear canal, at the end of which the tympanic membrane is situated, transfers
those vibrations to the middle ear.
2- The MIDDLE EAR starts from the tympanic
membrane and transmits the received vibrations to the third part, i.e.
the inner ear, through three tiny bones which together function as a
lever in order to amplify those vibrations.
3- The INNER
EAR is broadly made up of two parts, one related to the body's balance, which
does not concern us in this discussion, and the other for hearing. The
sensory part of the latter is the Organ of Corti. This organ (references 3 and 4), to which the auditory
nerve is connected, is situated in the Ductus Cochlearis (Cochlear Partition), which
has the shape of a three sided prism and has a membrane as its base side
called the Basilar Membrane. The Ductus Cochlearis is a part of the whole Cochlea,
and with the rest of that organ it coils like a snail shell around
a central axis called the Modiolus, about two and a half times. The Organ of Corti
has two sets of sensory cells called the Inner Hair Cells and the Outer
Hair Cells, situated on the inner and outer edges of the Basilar Membrane
respectively. It was Hermann von Helmholtz (1821-1894) who propounded the
idea that the 24'000 or so fibers of the Basilar Membrane are arranged like
the strings of a harp or a piano: those at the base, or the
beginning of it (starting from the middle ear), are shorter and more
taut and resonate with the higher frequencies, while those situated
farther away are longer and resonate with the lower frequencies, in
accordance with the laws of physics. Later on Georg von Békésy (1899-1972) expanded
on Helmholtz's theory, proposed the Traveling Wave theory (references 4 and
14), and carried
out intricate experiments on cadaver ears. According to this theory,
any sound sets the whole of the Basilar Membrane into vibration, but the
area which corresponds to the frequency of that sound vibrates at the maximum
amplitude. The hair cells, which are the sensory cells, receive these vibrations
and transform them into electrical signals that are then picked up by
the auditory nerve fibers. The basic mechanism of the Békésy and Helmholtz theories
is called the Place Theory, meaning that the perception of pitch depends
on where the stimulus is coming from, much as we can tell that it
is our toe which is being stepped upon without even looking at it. The Tuning
Curves of the auditory nerve fibers, which represent the individual fibers
of the Basilar Membrane (reference 4), are the
evidence for this theory. However, the responses of those fibers are not that
specific: while each fiber is most sensitive to stimulation
at a specific frequency, as the intensity of the stimulus increases
it responds to other frequencies too. The frequency to which a
fiber is most sensitive is called the Characteristic Frequency
(CF), or Best Frequency (BF). As we see, then, the Place Theory is not quite
sufficient to explain the perception of pitch by the brain.
The other theories can all
be grouped under the Temporal Theory, which holds that the pitch the
brain perceives depends on the frequency of the sound itself (that is, on how
many cycles of the stimulus arrive in a unit of time). Clinical observations (reference 5) and laboratory
research support this theory as well: the individual auditory nerve
fibers respond to sinusoidal vibrations by increasing their rate of
firing, and for the lower frequencies, i.e. up to 600/second or
so, this firing is on a one to one basis, which is called Phase
Locking. For higher frequencies more and more fibers come into play,
and this is called the Volley Theory. However, this process can only
work for frequencies of up to 4000/second (reference 4 again),
and does not explain the perception of all the vibrations to which our ears are
sensitive, which extend up to 20'000/second. The cells in the auditory receiving
centers not only change their firing rates with the frequency of the sound
(reference 6), as do
the auditory nerve fibers, but are also sensitive to changes in the
amplitudes of the vibrations (amplitude modulations, or AM), and change
their firing rates accordingly (reference 7). Furthermore,
if the order of this amplitude modulation is changed, those
cells again detect it and change their firing rates accordingly (references 8 and 11). It is conceivable
that the brain uses both place and time representations (reference 12), possibly
combined with some other as yet unknown process (or processes), to code the
received stimuli, not necessarily on a one to one basis but as a set
of comparative values, in order to differentiate the frequencies and the
other qualities of sounds.
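The volley idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not a model of real nerve fibers: it assumes that a single fiber cannot fire more than about 600 times per second (the one to one limit mentioned above) and that the fibers take turns in a perfectly regular rotation. The 2400/second tone and the function name are hypothetical choices. Time is counted in whole cycles of the tone to keep the arithmetic exact.

```python
import math

def volley(tone_freq, max_fiber_rate, n_cycles):
    """Share the cycles of a tone among the fewest fibers that keep each fiber
    at or below max_fiber_rate, firing in strict rotation (an assumption)."""
    n_fibers = math.ceil(tone_freq / max_fiber_rate)
    # Fiber i fires on cycles i, i + n_fibers, i + 2 * n_fibers, ...
    trains = [list(range(i, n_cycles, n_fibers)) for i in range(n_fibers)]
    # Pooling the trains gives the combined "volley" seen by higher centers.
    pooled = sorted(spike for train in trains for spike in train)
    return n_fibers, trains, pooled

n_fibers, trains, pooled = volley(tone_freq=2400, max_fiber_rate=600, n_cycles=24)

print(n_fibers)                   # 4 fibers are needed
print(trains[0])                  # [0, 4, 8, 12, 16, 20]
print(pooled == list(range(24)))  # True: the pooled train marks every cycle
```

Each fiber fires on only every fourth cycle, that is 600 times per second, yet the pooled train carries one spike per cycle of the 2400/second tone.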
Now we discuss how the brain
chooses the lowest frequency (Fundamental Frequency) as the pitch of the
received compound sound, and why the octaves of a sound sound more alike
than the other harmonic relationships such as the 2/3rds, the 3/4ths and so forth.
What I am proposing here is based
on an anatomic fact of the Organ of Corti, namely the peculiar
innervation of its hair cells: the auditory nerve has about
30'000-40'000 fibers; the Inner Hair Cells, which are arranged in a single
row, are about 3'500 in number; and the Outer Hair Cells, which are arranged
in rows of 3 (and occasionally 4), are about 12'000 in number. The peculiar fact is that the Inner Hair
Cells, although much less numerous than the outer ones, receive about 95%
of the auditory nerve fibers. This is because each Inner
Hair Cell receives an average of 20 nerve fibers, while
every 20 rows of the Outer
Hair Cells receive only one fiber from the auditory nerve.
What I conclude from
this arrangement is this: each Inner Hair Cell is situated on the inner
end of a fiber of the Basilar Membrane, on the outer end of which a row
of Outer Hair Cells is also located, so both types of cells receive the
same vibrations, and the frequency received by each Inner Hair Cell is connected
through this arrangement to 20 other frequencies up and down the Basilar Membrane.
On the other hand, since
every 20 Outer Hair Cells are supplied by only one nerve filament, which
ends on the same exact point of a nerve cell of the Spiral Ganglion of Corti
inside the inner ear, some kind of conditioning (a conditioned
reflex) takes place there, and the received information is then pooled,
sorted out and categorized in higher centers. The Fundamental Frequency
mentioned above, being the common denominator and having the most frequently repeated inter-spike
interval (the interval between two successive firings of the nerve),
as shown eloquently by Cariani (reference 9; also see
reference 15),
will be picked up and registered by the auditory centers as the PITCH of that
complex sound. After all this, we see that the perception of pitch does not
necessarily require the presence of actual energy at the Fundamental Frequency
(which in many cases probably does not even exist) but rather the ability of the brain
to detect the periodicity of the incoming compound stimulus and register
it as the pitch of that stimulus. Now we come to what is so special about
the harmonics, and especially the octaves: the harmonics of a sound
are the integer multiples of it, and as such they have a regular and constant
phase relationship to that sound, i.e. their waves never drift out of step with
each other and their peaks keep the same fixed intervals. Because
of this they will have a reinforcing conditioning effect on each other each
time they are repeated. The octaves have even more of such an effect because
not only are their phases constant and regular with respect to the reference sound,
but they also have an exact one to one relationship with it.
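The idea of the most repeated inter-spike interval can be illustrated with a toy calculation. This is a sketch under simplifying assumptions, and not Cariani's actual method or data: three hypothetical fibers are assumed to fire exactly once per cycle of the 400, 600 and 800/second harmonics of a 200/second fundamental that is itself absent, and the intervals between all pairs of spikes within each fiber (not only successive ones) are pooled. Time is counted in ticks of 1/2400 second so that every spike falls on a whole number, and only intervals of up to 15 ticks are considered.

```python
from collections import Counter

TICKS_PER_SECOND = 2400  # assumed time unit: 1 tick = 1/2400 second

def spike_train(freq, n_ticks):
    """Tick times of a hypothetical fiber firing once per cycle of freq."""
    period = TICKS_PER_SECOND // freq
    return list(range(0, n_ticks, period))

def pooled_intervals(trains, max_interval):
    """Count intervals between all spike pairs within each fiber, pooled over fibers."""
    counts = Counter()
    for train in trains:
        for i, first in enumerate(train):
            for later in train[i + 1:]:
                if later - first > max_interval:
                    break  # spike times are ascending, so later pairs are longer still
                counts[later - first] += 1
    return counts

# Harmonics 2, 3 and 4 of a 200/second fundamental; no fiber fires at 200/second.
trains = [spike_train(f, n_ticks=120) for f in (400, 600, 800)]
counts = pooled_intervals(trains, max_interval=15)

best_interval, _ = counts.most_common(1)[0]
print(best_interval)                      # 12 ticks, i.e. 1/200 second
print(TICKS_PER_SECOND // best_interval)  # 200 per second
```

The interval of 12 ticks is the only one contributed by all three fibers, so it is the most repeated one, and it corresponds to the 200/second fundamental even though no fiber in the sketch fires at that rate.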
To better explain this matter
I will make a visual comparison. Suppose an observer sitting in a moving train
looks out at the electric poles passing by on one side of the track. Brown
poles are erected at distances of 100 meters from each other. On
the same side, green poles
are erected starting from the same point but 50 meters
apart. Again on the same side, and starting from the same point, red
poles are erected in a 4/3 relationship to the brown ones numberwise
(each 75 meters apart). The green poles,
which always fall at a perfectly constant and regular 50 meters
from the brown ones (except when they overlap), will have a more
reinforcing rhythmic effect than the red ones, whose distances to the
brown ones keep changing between 25 meters and 50 meters (again except when
they overlap). To further explain the matter I also refer the reader
to the Harmony Wheel of Boomsliter and Creel (page 46 of reference 10).
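The pole comparison can be worked out numerically. The sketch below uses the distances given above (brown poles every 100 meters, green every 50, red every 75, all starting from the same point); the 600 meter stretch of track and the function name are assumptions made for illustration. For each green or red pole that does not coincide with a brown one, it computes the distance to the nearest brown pole.

```python
def gaps_to_nearest_brown(spacing, track_length=600, brown_spacing=100):
    """Distances from each pole of the given spacing to the nearest brown pole,
    skipping poles that stand on the same spot as a brown one."""
    brown = range(0, track_length + 1, brown_spacing)
    gaps = []
    for position in range(0, track_length + 1, spacing):
        nearest = min(abs(position - b) for b in brown)
        if nearest != 0:  # overlap with a brown pole is excluded
            gaps.append(nearest)
    return gaps

print(gaps_to_nearest_brown(50))  # green poles: [50, 50, 50, 50, 50, 50]
print(gaps_to_nearest_brown(75))  # red poles:   [25, 50, 25, 25, 50, 25]
```

The green poles always sit the same 50 meters from a brown pole, while the red poles alternate between 25 and 50 meters.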
Now about the pathology of
hearing loss: what I believe happens in
old age or disease, in addition to the decreased general sensitivity which
could be helped by increasing the sound intensity, is the following:
1- Due to degeneration
of the across and the up and down connecting fibers between the Outer and
Inner Hair Cells, the communication between these two sets of sensory cells is
lost, and the brain, deprived of its tools for collecting, comparing
and sorting out the received data, will not be able to detect the Fundamental Frequency
of the incoming sound in order to register its pitch and identify it as
such in its memory bank. Even a perfectly rhythmic sound would then be
perceived as noise, hence the decreased Signal to Noise Ratio (S/N
ratio).
2- The Outer Hair Cells, also due
to degeneration and the loss of their ability to contract or lengthen (reference 13), will
not be able to regulate the amplitudes of the vibrations of the Basilar
Membrane, and this by itself will cause further impairment of pitch identification.
This is because frequency and intensity are not as independent of
each other as we would like to think. By studying the pattern of the Basilar
Membrane vibrations in reference 14 we see that when
the intensity of the sound is increased, the point of maximum vibration, i.e.
the point of the Best Frequency, moves toward the higher frequency region,
i.e. toward the base of the Basilar Membrane. On the other hand, in Transfer
Action of the Middle Ear (reference 16), we see
that the gain in sound intensity from the lever action of the
3 little bones also depends on the frequency of the sound and has a
cut off point, as we shall see. The reason for this interdependence of
frequency and intensity in both conditions can be conceptualized as follows:
each single vibration carries its own minute amount of energy, and
the more vibrations land on the Basilar Membrane, whether through the middle
ear or directly as in reference 14, the more
energy piles up, and hence the greater the intensity. It is also worth
mentioning that in Transfer Action of the Middle Ear the cut off point mentioned
above, depending on the investigators and the methods and subjects
they have used, is reported at frequencies from 1000 to 3000 per second.
What this cut off point means is that above that specific frequency there
will not be any increase in power by the lever action of the ossicles. The
reason for this is that the movements of the 3 little ossicles in the middle
ear, and especially the last one (the stapes), are efficient only up to that point;
beyond it, their vibrations would be so fast that they would
not be able to transfer any real lever action to the fluid chamber of
the inner ear. This is like someone trying to scoop soup from a pot
into a plate: if the speed is increased beyond a certain range, the
scoop will not have enough time to fill itself up first. However, in the
case of sound intensities increased by artificial means, such as industrial
noise, there is no such built in control, and we can easily destroy the
Basilar Membrane and its cells. The maximum impact of this
man made damage seems to be around the 4000/second frequency region, since this
is the area of the audiogram which usually shows the first dip, and that dip will
keep on deepening and widening if the assaults continue.
As for what can be done to help the brain overcome
the aforementioned disabilities, I am not sure myself, but trying to find
the problem is always the first step.
For the first mentioned disability,
introducing the Fundamental
Frequency artificially and separately along with the hearing aid, as a reinforcement even if actual
energy already exists at that frequency, might
not restore the brain's decreased ability to organize the received data around
it, but carrying out such a study surely won't hurt. For the second mentioned
disability, the possibility of making the gain controls of hearing aids
more specifically frequency dependent could be investigated.
Finally, a word about
the way we test people's hearing: why is it that always and universally only
one tone and its octaves (125/second, 250/second, 500/second and so on)
are used for this purpose? It is conceivable that with this method we
might be losing some of the information we need for evaluating one's hearing,
and I believe a more comprehensive tone system could be more helpful.
Farhang Bakhtiar M.D.