Head-Related Transfer Function (HRTF) Research at the FIU DSP Laboratory
by Navarun Gupta and Carlos Ordonez
Human beings perceive sound in three dimensions. Sound localization depends on how the
sound waves from a single source differ from each other as they reach the left and
right ears. The head, torso, shoulders and outer ears modify the sound arriving at a
person's ears. This modification can be described by a complex response function - the
Head-Related Transfer Function (HRTF).
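One of the simplest interaural cues captured by the HRTF is the interaural time difference (ITD). As a minimal illustration (not part of the lab's MATLAB toolchain), the sketch below estimates the ITD of a simulated click by cross-correlating the two ear signals with NumPy; the signals and the 0.5 ms delay are invented for the example:

```python
import numpy as np

fs = 44100  # sample rate (Hz)

# Hypothetical example: a click that reaches the right ear 0.5 ms after the left
delay_samples = int(0.0005 * fs)  # about 22 samples
left = np.zeros(1024)
left[100] = 1.0
right = np.zeros(1024)
right[100 + delay_samples] = 1.0

# Estimate the interaural time difference via cross-correlation:
# the lag that maximizes the correlation is the delay between the ears.
corr = np.correlate(right, left, mode="full")
lag = np.argmax(corr) - (len(left) - 1)  # index len(left)-1 is lag 0
itd_ms = 1000.0 * lag / fs
print(round(itd_ms, 2))  # positive lag: the sound reached the left ear first
```

Real HRTF-based systems recover this cue implicitly from the measured impulse responses rather than computing it explicitly.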
HRTFs can be used to generate binaural sound. Theoretically, HRTFs contain all the
information about the sound source's location (its direction and distance from the
listener). If properly measured and implemented, HRTFs can generate a "virtual
acoustic environment".
The study of HRTFs is a rapidly growing area with potential uses in virtual environments,
auditory displays, the entertainment industry, human-computer interfaces for the visually
impaired, aircraft warning systems, and many others.
Problems with HRTFs:
Measuring HRTFs can be expensive. A typical setup requires an anechoic chamber and
high-quality audio equipment such as speakers and headphones. To take this technology to
the masses, generic HRTFs have been used, but they do not work as well as individualized
HRTFs. Once measured, HRTFs are convolved with the sound to give it a direction. Depending
on the size of these functions, the cost of the computing equipment can rise significantly.
There is much to be learned about HRTFs. Even the most carefully taken measurements suffer
from the "cone of confusion" and "inside the head" effects. Range cues are poorly
understood. It is possible to add a room transfer function to give the effect of distance,
but such filters are not flexible; for example, one cannot obtain a "whisper" effect using
the room transfer function.
Sound Localization Research at the FIU DSP Lab:
The DSP Lab at FIU has an AUSIM HeadZap system available to measure individual HRTFs. This
system measures 128-point impulse responses of sounds generated from Golay codes. We
measure 72 pairs of HRTFs for every individual (12 azimuths × 6 elevations), which can be
analyzed to find out how the HRTF changes from person to person. So far, data from 40
individuals are being analyzed, and we have found some interesting patterns (see the
papers below).
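The Golay-code measurement idea can be sketched as follows: a complementary Golay pair has autocorrelations that sum to a perfect impulse, so correlating each recorded response with its excitation code and summing the results recovers the system's impulse response with the correlation sidelobes cancelled. This is a NumPy illustration with an invented 128-point "system" response, not the HeadZap's actual processing:

```python
import numpy as np

def golay_pair(n):
    """Generate a complementary Golay pair of length 2**n (recursive construction)."""
    a, b = np.array([1.0]), np.array([1.0])
    for _ in range(n):
        a, b = np.concatenate([a, b]), np.concatenate([a, -b])
    return a, b

# Hypothetical system: a 128-point impulse response we pretend is unknown
h_true = np.zeros(128)
h_true[[0, 10, 30]] = [1.0, 0.5, -0.25]

a, b = golay_pair(10)  # a pair of 1024-sample excitation signals
N = len(a)

# "Play" each code through the system and "record" the result
ya = np.convolve(a, h_true)
yb = np.convolve(b, h_true)

# Cross-correlate each recording with its own code and sum: the complementary
# property cancels the sidelobes, leaving 2N times the impulse response.
h_est = (np.correlate(ya, a, mode="full") +
         np.correlate(yb, b, mode="full"))[N - 1:] / (2 * N)

print(np.allclose(h_est[:128], h_true))  # True
```

The advantage over a single pseudorandom excitation is that the sidelobe cancellation is exact, which matters when estimating short, low-level impulse responses.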
Using MATLAB, the HRTFs are convolved with a monaural sound to produce binaural signals.
To measure the accuracy of localization, a GUI is used to test about 20 individuals. More
testing may reveal how ear shape affects localization ability.
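The convolution step itself is straightforward; a minimal Python/NumPy equivalent of that MATLAB processing might look like the sketch below, where the two head-related impulse responses (HRIRs) are invented placeholders rather than measured data:

```python
import numpy as np

fs = 44100
t = np.arange(fs) / fs
mono = np.sin(2 * np.pi * 440 * t)  # 1 s of a 440 Hz tone as the dry source

# Placeholder 128-point HRIRs; in practice these come from the measured
# HRTF set for the desired azimuth and elevation.
rng = np.random.default_rng(0)
hrir_left = rng.standard_normal(128) * np.exp(-np.arange(128) / 16)
hrir_right = 0.7 * np.roll(hrir_left, 20)  # crude extra delay + attenuation

# Convolve the mono source with each ear's impulse response
left = np.convolve(mono, hrir_left)
right = np.convolve(mono, hrir_right)

# Interleave into a two-channel (binaural) signal for headphone playback
binaural = np.stack([left, right], axis=1)
print(binaural.shape)  # (44227, 2): input length + 127 extra samples per channel
```

With long HRIRs or room filters, FFT-based (overlap-add) convolution replaces direct convolution, which is the computational cost issue noted above.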
Once a pattern is established, the HRTFs can be modified to make them more effective. We
are currently working to identify the pattern of frequency attenuation and enhancement
that appears to depend on the protrusion angle of the pinna.
Analysis of individual HRTFs may also reveal how they change with the shape of the head,
torso and pinna. Using this information, one may be able to individualize generic transfer
functions. We are looking at building such a model starting from a spherical-head model
(the most basic model). We plan to add a pinna reflection model to this, and finally a
pole/zero model to increase the accuracy of the overall model.
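The main cue a spherical-head model provides is the interaural time difference. A common closed form is Woodworth's approximation, ITD = (a/c)(θ + sin θ) for azimuths within ±90° of straight ahead; the sketch below uses a nominal head radius and speed of sound, not the lab's fitted parameters:

```python
import math

def spherical_head_itd(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Woodworth's spherical-head ITD approximation: (a/c) * (theta + sin(theta)).
    Valid for source azimuths within +/-90 degrees of the median plane."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))

# ITD grows from zero at the median plane to its maximum at the side
for az in (0, 30, 60, 90):
    print(az, "deg:", round(1000.0 * spherical_head_itd(az), 3), "ms")
```

For a nominal 8.75 cm head radius this peaks at roughly 0.66 ms at 90° azimuth, which is why such a model makes a reasonable starting point before pinna and torso effects are layered on.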
Papers:
1) Evaluation of digital sound spatialization accuracy over commodity
audio channels in a personal computer
Omar Grafals, Navarun Gupta, Gualberto Cremades, A. B. Barreto, and Malek Adjouadi.
Proceedings of the 1999 Computing Research Conference, University of Puerto Rico -
Mayaguez, December 4, 1999, Mayaguez, PR., pp. 5-8.
2) Decreased 3-D sound spatialization accuracy caused by speech bandwidth limitations
over commodity audio components.
Omar Grafals, Navarun Gupta, Gualberto Cremades, A. B. Barreto, and Malek Adjouadi.
Biomedical Sciences Instrumentation, Vol. 36, April 2000, pp. 245-250.
3) The effect of pinna protrusion angle on localization of virtual sound in the
horizontal plane
Some Results from the work reported in (1) and (2), above:
Testing the accuracy of virtual sound localization within the front hemisphere, at 0°
elevation, using generic HRTFs, we found that the accuracy of localization decreases
significantly outside the range from -45 to +45 degrees in azimuth. This main effect was
observed for the spatialization of both a speech signal and broadband noise.
Some Results from the work reported in (3) above:
The chart shows how a subject's resolution of front/back confusion improves with the
prototype HRTF (with a large protrusion angle).
Some related web links:
http://sound.media.mit.edu/KEMAR.html
http://www.cc.gatech.edu/gvu/multimedia/spatsound/spatsound.html
http://www-engr.sjsu.edu/~duda/Duda.Research.html
http://www.waisman.wisc.edu/hdrl/index.html
http://www.pa.msu.edu/acoustics/