www.fbakhtiar.com





Octave Perception as a Window into Hearing Physiology



As a practicing ENT physician, I have yet to see anyone using a hearing aid who is as satisfied with the device as eyeglass wearers are with theirs. I believe the reason is that our knowledge of hearing physiology still has some shortcomings, and that is why I undertook this study. I have tried to make this writing as simple as possible for non-professionals to understand, and to that end I have included some basic explanations that might sound boring to professionals. I have also enclosed the texts of my references for anyone who might want to study them. Farhang Bakhtiar, M.D., Nov. 2006

Sounds are produced by the vibration of objects, and each simple sound has a frequency that determines its pitch. Compound sounds are made up of a superimposition of several simple sounds, each with its own specific frequency, and what the ear hears is the mixture of all of them. When these frequencies occur in haphazard combinations they are called noises, and the ear perceives them as such. Musical tones, on the other hand, are made up of combinations of frequencies that are integer multiples of the lowest frequency among them. The component with the lowest frequency is called the Fundamental Frequency, and those with higher frequencies are called the Overtones or Harmonics. The Fundamental Frequency is what is registered in our auditory centers as the Pitch of a compound sound. An octave of a sound is another sound whose frequency is twice that of the first one. Musical tones, seemingly in all cultures and throughout history, are expressed by notes that occupy different positions along one continuum of frequencies, and the same order is then repeated in the first octave and in the octaves after that. This is because our ear perceives the octave of a sound as somehow the same as the original, and there are indications that this mechanism applies to some species other than humans as well (reference 1). If a complex sound whose Fundamental Frequency has been filtered out (removed) is presented to the human ear, the ear still perceives that frequency as if it were actually present. This phenomenon is so well known that the loudspeaker industry exploits it to give the impression of a low-frequency sound when that frequency is technically hard to produce (i.e. producing a low-frequency tone with a small loudspeaker).
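The missing-fundamental effect described above can be illustrated with a short Python sketch. It is only a toy model under stated assumptions: the 200/second fundamental, the choice of the 2nd, 3rd and 4th harmonics, the 8,000 samples/second rate and the function names are all picked for illustration and do not come from the references. The sketch builds a tone with no component at the fundamental and then looks for the delay at which the waveform best matches a shifted copy of itself (autocorrelation), which is one simple way to measure how often a sound repeats.

```python
import math

def autocorrelation(samples, lag):
    """Sum of products of the signal with a copy of itself shifted by `lag` samples."""
    return sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))

def estimate_pitch_hz(samples, sample_rate, min_hz=50, max_hz=1000):
    """Return the repetition rate whose lag gives the strongest autocorrelation."""
    min_lag = sample_rate // max_hz
    max_lag = sample_rate // min_hz
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(samples, lag))
    return sample_rate / best_lag

SAMPLE_RATE = 8000       # samples per second (illustrative assumption)
F0 = 200                 # the absent fundamental, per second (illustrative assumption)
HARMONICS = (2, 3, 4)    # only 400, 600 and 800 per second are actually present

# 800 samples = 0.1 second of a tone made of the harmonics alone.
tone = [
    sum(math.sin(2 * math.pi * k * F0 * n / SAMPLE_RATE) for k in HARMONICS)
    for n in range(800)
]

estimated_pitch = estimate_pitch_hz(tone, SAMPLE_RATE)
print(f"Components present: {[k * F0 for k in HARMONICS]} per second")
print(f"Estimated repetition rate: {estimated_pitch:.1f} per second")
```

Although only 400, 600 and 800/second are present, the waveform as a whole repeats every 1/200 of a second, so the estimate comes out at 200/second. This does not show how the ear performs the task; it only shows that the periodicity of the fundamental survives its removal.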

Before getting into my hypothesis, a brief review of the anatomy and physiology of the ear is in order:

Basically, our ear is made up of three parts (references 2 and 13):

1- The OUTER EAR consists of the auricle and the ear canal. The auricle collects the sound vibrations and directs them into the ear canal. The ear canal, at the end of which the tympanic membrane is situated, carries those vibrations to the middle ear.

2- The MIDDLE EAR starts at the tympanic membrane and transmits the received vibrations to the third part, i.e. the inner ear, through three tiny bones that together function as a lever to amplify those vibrations.

3- The INNER EAR is grossly made up of two parts: one related to the body's balance, which does not concern us in this discussion, and the other for hearing. The sensory part of the hearing section is the Organ of Corti. This organ (references 3 and 4), to which the auditory nerve is connected, is situated in the Ductus Cochlearis (Cochlear Partition), which has the shape of a three-sided prism whose base side is a membrane called the Basilar Membrane. The Ductus Cochlearis is part of the Cochlea and, together with the rest of that organ, coils like a snail shell about two and a half times around a central axis called the Modiolus. The Organ of Corti has two sets of sensory cells, called the Inner Hair Cells and the Outer Hair Cells, situated on the inner and outer edges of the Basilar Membrane respectively.

It was Hermann von Helmholtz (1821-1894) who propounded the idea that the 24,000 or so fibers of the Basilar Membrane are arranged like the strings of a harp or a piano: those at the base, or beginning, of the membrane (starting from the middle ear) are shorter and more taut and resonate with the higher frequencies, while those situated farther away are longer and resonate with the lower frequencies, in keeping with the laws of physics.
Later on, Georg von Békésy (1899-1972) expanded on Helmholtz's theory, proposed the Traveling Wave theory (references 4 and 14), and carried out intricate experiments on cadaver ears. According to this theory, any sound sets the whole of the Basilar Membrane into vibration, but the area that corresponds to the frequency of that sound vibrates at the maximum amplitude. The hair cells, which are the sensory cells, receive these vibrations and transform them into electrical signals that are then picked up by the auditory nerve fibers. The basic mechanism shared by the Békésy and Helmholtz theories is called the Place Theory, meaning that the perception of pitch depends on where the stimulus is coming from, much as we can tell that it is our toe being stepped on without even looking at it. The Tuning Curves of the auditory nerve fibers, which represent the individual fibers of the Basilar Membrane (reference 4), are the evidence for this theory. However, the responses of those fibers are not that specific: while each fiber is most sensitive to stimulation at one specific frequency, it also responds to other frequencies as the intensity of the stimulus increases. The frequency to which a fiber is most sensitive is called its Characteristic Frequency (CF) or Best Frequency (BF). As we see, then, the Place Theory is not quite sufficient to explain the perception of pitch by the brain.

The other theories can all be lumped together as the Temporal Theory, which holds that the pitch the brain perceives depends on the frequency of the sound itself (that is, on how rapidly the stimulus repeats in a unit of time). Clinical observations (reference 5) and laboratory research support this theory as well: individual auditory nerve fibers respond to sinusoidal vibrations by increasing their rate of firing, and for the lower frequencies, i.e. up to 600/second or so, this firing is on a one-to-one basis, which is called Phase Locking. For higher frequencies, more and more fibers come into play, and this is called the Volley Theory. However, this process can only work for frequencies of up to 4,000/second (reference 4 again), and so it does not explain the perception of all the vibrations to which our ears are sensitive, which extend up to 20,000/second. The cells in the auditory receiving centers not only change their firing rates with the frequency of the sound (reference 6), as do the auditory nerve fibers, but are also sensitive to changes in the amplitudes of the vibrations (amplitude modulations, or AM) and change their firing rates accordingly (reference 7). Furthermore, if the order of this amplitude modulation is changed, those cells again detect it and change their firing rates accordingly (references 8 and 11). It is conceivable that the brain uses both place and time representations (reference 12), possibly combined with some other as yet unknown process or processes, to code the received stimuli, not necessarily on a one-to-one basis but as a set of comparative values, in order to differentiate the frequencies and the other qualities of sounds.
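The volley idea mentioned above can be pictured with a small Python sketch. It is a deliberately simplified illustration, not a physiological model: the limit of 600 firings per second is the figure quoted in the text, while the 2,400/second tone, the strict turn-taking between fibers and the function name are assumptions made only for this example.

```python
import math

MAX_FIBER_RATE = 600   # firings per second one fiber can follow (figure quoted in the text)

def assign_cycles_to_fibers(tone_hz, n_cycles):
    """Share the cycles of a tone among fibers so that none exceeds MAX_FIBER_RATE.

    Returns one list per fiber holding the cycle numbers on which it fires.
    Strict turn-taking between fibers is a simplifying assumption of this sketch.
    """
    n_fibers = math.ceil(tone_hz / MAX_FIBER_RATE)
    return [list(range(first, n_cycles, n_fibers)) for first in range(n_fibers)]

TONE_HZ = 2400   # illustrative tone, above the rate a single fiber can follow
fibers = assign_cycles_to_fibers(TONE_HZ, n_cycles=12)
pooled = sorted(cycle for fiber in fibers for cycle in fiber)
per_fiber_rate = TONE_HZ / len(fibers)

for number, cycles in enumerate(fibers, start=1):
    print(f"fiber {number} fires on cycles {cycles}")
print(f"each fiber fires {per_fiber_rate:.0f} times per second")
print(f"pooled volley covers cycles {pooled}")
```

In this sketch four fibers, each firing on every fourth cycle, together mark every cycle of the 2,400/second tone, while a 500/second tone would need only one fiber firing on a one-to-one basis.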

Now we discuss how the brain chooses the lowest frequency (the Fundamental Frequency) as the pitch of the received compound sound, and why the octaves of a sound seem more like the sound itself than do the other harmonic ratios, such as 2/3, 3/4 and so forth.

What I am proposing here is based on an anatomic fact about the Organ of Corti: the peculiar innervation of its hair cells. The auditory nerve has about 30,000 to 40,000 fibers; the Inner Hair Cells, which are arranged in a single row, number about 3,500; and the Outer Hair Cells, which are arranged in rows of 3 (and occasionally 4), number about 12,000. The peculiar fact is that the Inner Hair Cells, although far less numerous than the outer ones, receive about 95% of the auditory nerve fibers. This is because each Inner Hair Cell receives an average of 20 nerve fibers, whereas every 20 rows of Outer Hair Cells receive only one fiber from the auditory nerve.

What I conclude from this arrangement is this: each Inner Hair Cell sits on the inner end of a fiber of the Basilar Membrane, on the outer end of which a row of Outer Hair Cells is also located, so both types of cell receive the same vibrations. Through this arrangement, the frequency received by each Inner Hair Cell is connected to 20 other frequencies up and down the Basilar Membrane. And since every 20 Outer Hair Cells are supplied by only one nerve filament, which ends on exactly the same point of a nerve cell of the Spiral Ganglion of Corti inside the inner ear, some kind of conditioning (conditioned reflex) takes place there, after which the received information is pooled, sorted and categorized in the higher centers. The Fundamental Frequency mentioned above, being the common denominator and having the most frequently repeated inter-spike interval (the interval between two successive firings of the nerve), as shown eloquently by Cariani (reference 9; see also reference 15), will be picked up and registered by the auditory centers as the PITCH of that complex sound. After all this, we see that the perception of pitch does not necessarily require actual energy at the Fundamental Frequency (which is probably often not even present) but rather the ability of the brain to detect the periodicity of the incoming compound stimulus and register it as the pitch of that stimulus.

Now we come to what is so special about the harmonics, and especially the octaves: the harmonics of a sound are its integer multiples, and as such they have a regular and constant phase relationship to that sound, i.e. their waves never drift out of step with each other and their peaks keep the same fixed intervals, and because of this they have a reinforcing, conditioning effect on each other each time they are repeated.
The octaves have even more of such an effect, because not only is their phase relationship with the reference sound constant and regular, but they also bear an exact one-to-one relationship to it.
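The inter-spike interval argument above can be sketched in a few lines of Python. This is a toy illustration and not Cariani's actual analysis: it assumes idealized fibers that fire exactly once per cycle of the harmonic they follow, measures time in arbitrary ticks, and uses the 2nd, 3rd and 4th harmonics of a fundamental whose period is 60 ticks. All of these choices, and the names in the code, are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

F0_PERIOD = 60          # period of the absent fundamental, in arbitrary ticks (assumption)
HARMONICS = (2, 3, 4)   # components actually present in the stimulus (assumption)
DURATION = 600          # length of the simulated stimulus, in ticks (assumption)

def spike_times(period, duration):
    """One idealized spike per cycle of the component the fiber is locked to."""
    return list(range(0, duration + 1, period))

pooled_intervals = Counter()
for k in HARMONICS:
    spikes = spike_times(F0_PERIOD // k, DURATION)
    # Count the interval between every pair of spikes of this fiber,
    # not only between neighbouring spikes, and pool across fibers.
    pooled_intervals.update(later - earlier
                            for earlier, later in combinations(spikes, 2))

most_common_interval, count = pooled_intervals.most_common(1)[0]
print(f"Most repeated interval: {most_common_interval} ticks ({count} occurrences)")
```

No simulated fiber fires at the fundamental rate, yet the interval shared by all of them, and therefore the most repeated one in the pooled count, is the 60-tick period of the fundamental.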

To better explain this matter I will make a visual comparison. Suppose an observer sitting in a moving train looks out at the electric poles passing by on one side of the track. Brown poles are erected, for example, 100 meters apart. On the same side, green poles are erected next to them, starting from the same point but 50 meters apart. Red poles are erected on the same side too, again starting from the same point, but in a 4/3 relationship to the brown ones in number (that is, 75 meters apart). The green poles, which occur at a perfectly constant and regular 50 meters from the brown ones (except where they overlap), will have a more reinforcing rhythmic effect than the red ones, whose distances to the brown ones keep changing between 25 meters and 50 meters (again except where they overlap). To explain the matter further, I also refer the reader to the Harmony Wheel of Boomsliter and Creel (page 46 of reference 10).
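The pole comparison above can be checked with a short Python sketch. The spacings of 100, 50 and 75 meters come from the text; the 600-meter stretch of track and the function names are assumptions made for the example.

```python
BROWN_SPACING = 100   # meters between brown poles (from the text)
TRACK_LENGTH = 600    # meters of track to examine (illustrative assumption)

def distance_to_nearest_brown(position):
    """Distance in meters from a pole at `position` to the nearest brown pole."""
    offset = position % BROWN_SPACING
    return min(offset, BROWN_SPACING - offset)

def distances_for_row(spacing):
    """Distances to the nearest brown pole for every pole in a row with this spacing."""
    return [distance_to_nearest_brown(p) for p in range(0, TRACK_LENGTH + 1, spacing)]

green_distances = distances_for_row(50)   # green poles, 50 meters apart
red_distances = distances_for_row(75)     # red poles, 75 meters apart

print("green:", green_distances)
print("red:  ", red_distances)
```

Leaving aside the overlaps (distance 0), the green poles always stand 50 meters from a brown pole, while the red poles alternate between 25 and 50 meters, which is the irregularity the comparison is meant to convey.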

Now, about the pathology of hearing loss: what I believe happens in old age or disease, in addition to a general decrease in sensitivity that could be helped by increasing the sound intensity, is the following:

1- Due to degeneration of the fibers that connect the Outer and Inner Hair Cells across and up and down the Basilar Membrane, communication between these two sets of sensory cells is lost. The brain, deprived of its tools for collecting, comparing and sorting the received data, is then unable to detect the Fundamental Frequency of the incoming sound in order to register its pitch and identify it as such in its memory bank. Even a perfectly rhythmic sound would thus be perceived as noise, hence the decreased Signal to Noise Ratio (S/N ratio).

2- The Outer Hair Cells, also through degeneration, lose their ability to contract or lengthen (reference 13) and can no longer regulate the amplitudes of the vibrations of the Basilar Membrane, and this by itself causes further impairment of pitch identification. This is because frequency and intensity are not as independent of each other as we would like to think. Studying the pattern of Basilar Membrane vibrations in reference 14, we see that as the intensity of the sound increases, the point of maximum vibration, i.e. the point of the Best Frequency, moves toward the higher-frequency region, i.e. toward the base of the Basilar Membrane. Likewise, in Transfer Action of the Middle Ear (reference 16), we see that the gain in sound intensity provided by the lever action of the three little bones also depends on the frequency of the sound and has a cut-off point, as we shall see. The interdependence of frequency and intensity in both cases can be conceptualized as follows: every single vibration carries its own minute amount of energy, and the more vibrations that land on the Basilar Membrane, whether through the middle ear or directly as in reference 14, the more energy piles up, and hence the greater the intensity. It is also worth mentioning that in Transfer Action of the Middle Ear the cut-off point mentioned above is reported, depending on the investigators and on the methods and subjects they used, at frequencies from 1,000 to 3,000 per second. What this cut-off point means is that above that frequency the lever action of the ossicles provides no further increase in power.
The reason for this is that the movements of the three little ossicles in the middle ear, and especially the last one (the stapes), are efficient only up to that point; beyond it, their vibrations would be so fast that they could not actually transfer any real lever action to the fluid chamber of the inner ear. This is like someone trying to ladle soup from a pot into a plate: beyond a certain speed, the ladle does not have enough time to fill up first. However, when sound intensities are increased by artificial means, as with industrial noise, there is no such built-in control, and we can easily destroy the Basilar Membrane and its cells. The maximum impact of this man-made damage seems to be around the 4,000/second frequency region, since this is the area of the audiogram that usually shows the first dip, a dip that keeps deepening and widening if the assaults continue.

As for what can be done to help the brain overcome the aforementioned disabilities, I am not sure myself, but trying to identify the problem is always the first step.

For the first disability, introducing the Fundamental Frequency artificially and separately alongside the hearing aid, as a reinforcement even if actual energy already exists at that frequency, might not help the brain's decreased ability to organize the received data around it, but carrying out such a study surely won't hurt. For the second disability, the possibility of making the gain controls of hearing aids more specifically frequency dependent could be investigated.

Finally, a word about the way we test people's hearing: why is it that always and universally only one tone and its octaves (125/second, 250/second, 500/second and so on) are used for this purpose? It is conceivable that with this method we are losing some of the information we need to evaluate a person's hearing, and I believe a more comprehensive tone system could be more helpful. Farhang Bakhtiar, M.D.