Print This Page

4. The human auditory system

The human auditory system is a real wonder - with two compact auditive sensor organs - the ears - at either side of the head, connected with a string of high speed nerve fibres to the brain stem. The brain stem is the central connecting point of the brain - connecting the human body’s nervous system inputs (such as audio, vision) and outputs (such as muscle control) to the other areas of the brain. The brain stem redirects the audio information from the ears to a section of the brain that is specialised in audio processing: the auditory cortex.

This chapter consists of three parts: the ear anatomy - describing in particular the inner ear structure, the audio universe - presenting the limitations of the human auditory system in three dimensions, and auditory functions - describing how our auditory cortex interprets the audio information coming from the ears. The ear anatomy and the auditory functions are presented using very simplified models - only describing the rough basics of the human auditory system. For more detailed and accurate information, a list of information sources for further reading is suggested in appendix.

4.1 Ear anatomy

The ear can be seen as three parts: the outer ear, the middle ear and the inner ear - each with a dedicated function.

The outer ear consists of the ear shell (pinna) and the auditory canal. Its function is to guide air pressure waves to the middle ear- with the ear shell increasing the sensitivity of the ear to the front side of the head, supporting front/rear localisation of audio signals.

The middle ear consists of the ear drum (tympanic membrane), attached to the inner ear through a delicate bone structure (malleus, incus and stapes). The middle ear bones (ossicles) and the muscles keeping them in place are the smallest in the human body. One of the major functions of the middle ear is to ensure the efficient transfer of sound from the air to the fluids in the inner ear. If the sound were to impinge directly onto the inner ear, most of it would simply be reflected back because acoustical impedance of the air is different from that of the fluids. The middle ear acts as an impedance-matching device that improves sound transmission, reduces the amount of reflect sound and protects the inner ear from excessive sound pressure levels. This protection is actively regulated by the brain using the middle ear’s muscles to tense and un-tense the bone structure with a reaction speed as fast as 10 milliseconds. The middle ear’s connection to the inner ear uses the smallest bone in the human body: the stapes (or stirrup bone), approximately 3 millimetres long, weighing approximately 3 milligrams.

The inner ear consists of the cochlea - basically a rolled-up tube. The middle ear’s stapes connects to the cochlea’s ‘oval window’. The rolled-up tube contains a tuned membrane populated with approximately 15,500 hair cells and is dedicated to hearing. The structure also has a set of semi-circular canals that is dedicated to the sense of balance. Although the semi-circular canal system also uses hair cells to provide the brain with body balance information, it has nothing to do with audio.

For audio professionals, the cochlea is one of the most amazing organs in the human body, as it basically constitutes a very sensitive 3,500-band frequency analyser with digital outputs - more or less equivalent to a high resolution FFT analyzer.

Unrolling and stretching-out the cochlea would give a thin tube of about 3.4 centimetres length, with three cavities (scala vestibuli, and scala timpani, filled with perilymph fluid, and scala media, filled with endolymph fluid), separated by two membranes: the Reissner’s membrane and the basilar membrane. The basilar membrane has a crucial function: it is thin and stiff at the beginning, and wide and sloppy at the end - populated with approximately 3,500 sections of 4 hair cells.

Incoming pressure waves - delivered to the cochlear fluid by the stapes - cause mechanical resonances on different locations along the basilar membrane, tuned to 20 kHz at the beginning near the oval window - where the membrane is narrow and stiff, down to 20 Hz at the end where the membrane is wide and sloppy.

A row of approximately 3,500 inner hair cells (IHC’s) are situated along the basilar membrane, picking up the resonances generated by the incoming waves. The inner hair cells are spread out exponentially over the 3.4 centimetre length of the tube - with many more hair cells at the beginning (high frequencies) than at the end (low frequencies). Each inner hair cell picks up the vibrations of the membrane at a particular point - thus tuned to a particular frequency. The ‘highest’ hair cell is at 20 kHz, the ‘lowest’ at 20 Hz - with a very steep tuning curve at high frequencies, rejecting any frequency above 20 kHz. (more on hair cells on the next page)

Roughly in parallel with the row of 3,500 inner hair cells, three rows of outer hair cells (OHC’s) are situated along the same membrane. The main function of the inner hair cells is to pick up the membrane’s vibrations (like a microphone). The main function of the outer hair cells is to feed back mechanical energy to the membrane in order to amplify the resonance peaks, actively increasing the system’s sensitivity by up to 60dB(*4C).

Hair cells are connected to the brain’s central connection point - the brain stem - with a nerve string containing approximately 30,000 neurons (axons)(*4D). Neurons that transport information from a hair cell to the brain stem are called afferent neurons (or sensory neurons). Neurons that transport information from the brain stem back to (outer) hair cells are called efferent neurons (or motor neurons). Afferent and efferent connections to outer hair cells use a one-to-many topology, connecting many hair cells to the brain stem with one neuron. Afferent connections to inner hair cells (the ‘microphones’) use a many-to-one topology for hair cells tuned to high frequencies, connecting one hair cell to the brain stem with many neurons.

Zooming in to the hair cell brings us deeper in the field of neurosciences. Inner hair cells have approximately 40 hairs (stereocilia) arranged in a U shape, while outer hair cells have approximately 150 hairs arranged in a V or W shape(*4E) . The hair cells hairs float free in the cochlea liquid (endolymph) just below the tectorial membrane hovering in the liquid above them, with the tips of the largest outer hair cells’ hairs just touching it.

When the basilar membrane below the hair cell moves, the hairs brush against the tectorial membrane and bend - causing a bio-electrical process in the hair cell. This process causes neural nerve impulses (action potentials, a voltage peak of approximately 150 millivolts) to be emitted through the afferent neuron(s) that connect the hair cell to the brain stem. The nerve impulse density (the amount of nerve impulses per second) depends on how much the hairs are agitated. When there’s no vibration on a particular position on the basilar membrane, the corresponding hair cell transmits a certain amount of nerve impulses per second. When the vibration of the basilar membrane causes the hairs to bend back (excitation) and forth (inhibition), the density of nerve impulses will increase and decrease depending on the amplitude of the vibration.

The maximum firing rate that can be transported by the neurons attached to the hair cell is reported to be up to 600 nerve impulses (spikes) per second. The nerve string from the cochlea to the brain contains approximately 30,000 auditory nerve fibres - of which more than 95% are thought to be afferent - carrying information from hair cells to the brain stem(*4F). The brainstem thus receives up to 18 million nerve impulses per second, with the spike density carrying information about the amplitude of each individual hair cell’s frequency band, and the spike timing pattern carrying information about time and phase coherency. As the human auditory system has two ears, it is the brain’s task to interpret two of these information streams in real time.

Return to Top