Psychoacoustics
From Wikipedia, the free encyclopedia
Psychoacoustics is the study of subjective human perception of sounds. Alternatively it can be described as the study of the psychological correlates of the physical parameters of acoustics.
Background
Hearing is not a purely mechanical phenomenon of wave propagation, but is also a sensory and perceptual event. When a person hears something, that something arrives at the ear as a mechanical sound wave traveling through the air, but within the ear it is transformed into neural action potentials. These nerve pulses then travel to the brain where they are perceived. Hence, in many problems in acoustics, such as for audio processing, it is advantageous to take into account not just the mechanics of the environment, but also the fact that both the ear and the brain are involved in a person’s listening experience.
The inner ear, for example, does significant signal processing in converting sound waveforms into neural stimuli, so certain differences between waveforms may be imperceptible.[1] MP3 and other audio compression techniques make use of this fact.[2] In addition, the ear has a nonlinear response to sounds of different loudness levels. Telephone networks and audio noise reduction systems exploit this by nonlinearly compressing data samples before transmission and then expanding them for playback, a technique known as companding.[3] Another effect of the ear's nonlinear response is that sounds that are close in frequency produce phantom beat notes, or intermodulation distortion products.[4]
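The compress-then-expand idea used by telephone networks can be sketched with the μ-law curve. This is a minimal illustration only: the choice of μ-law and the value μ = 255 are assumptions taken from common telephony practice, not from the text above, and the function names are hypothetical.

```python
import math

MU = 255.0  # assumed value; commonly used in North American and Japanese telephony


def mu_law_compress(x, mu=MU):
    """Map a sample in [-1, 1] to [-1, 1], boosting quiet samples logarithmically."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress: recover the original sample value."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)
```

A quiet sample such as 0.01 is mapped to roughly 0.23 before transmission, so it occupies far more of the available quantization range than it would on a linear scale; expanding afterwards restores the original value.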
Limits of perception
The human ear can nominally hear sounds in the range 20 Hz to 20,000 Hz (20 kHz). This upper limit tends to decrease with age, and most adults are unable to hear above 16 kHz. The ear itself does not respond to frequencies below 20 Hz, but these can be perceived via the body's sense of touch. Some research has also reported a "hypersonic effect": although sounds above 20 kHz cannot be consciously heard, they may still have an effect on the listener.[5]
The frequency resolution of the ear is about 3.6 Hz within the octave of 1,000–2,000 Hz. That is, changes in pitch larger than 3.6 Hz can be perceived in a clinical setting.[6] However, even smaller pitch differences can be perceived through other means. For example, the interference of two pitches can often be heard as a repetitive variation in the loudness of the tone. This amplitude modulation occurs at a frequency equal to the difference between the frequencies of the two tones and is known as 'beating'.
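The beating of two nearby pitches follows from the identity sin a + sin b = 2 sin((a+b)/2) cos((a−b)/2): the sum of two tones equals a tone at the average frequency whose amplitude rises and falls at the difference frequency. The sketch below checks this numerically; the function names and the 440 Hz and 443 Hz example tones are illustrative assumptions.

```python
import math


def beat_frequency(f1, f2):
    """Rate (in Hz) at which the loudness of two summed tones rises and falls."""
    return abs(f1 - f2)


def two_tone(f1, f2, t):
    """Direct sum of two unit-amplitude sine tones at time t (seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)


def envelope_form(f1, f2, t):
    """Same signal written as a slow envelope times a carrier at the mean frequency."""
    carrier = math.sin(math.pi * (f1 + f2) * t)
    envelope = 2 * math.cos(math.pi * (f1 - f2) * t)
    return envelope * carrier
```

For tones at 440 Hz and 443 Hz the two forms agree sample for sample, and the loudness swells three times per second.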
The semitone scale used in Western musical notation is not a linear frequency scale but logarithmic. Other scales have been derived directly from experiments on human hearing perception, such as the mel scale and Bark scale (these are used in studying perception, but not usually in musical composition), and these are approximately logarithmic in frequency at the high-frequency end, but nearly linear at the low-frequency end.
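The difference between the two kinds of scale can be made concrete with a short sketch. The 440 Hz reference pitch and the particular mel formula (one of several published variants) are assumptions chosen for illustration, not values given in the text.

```python
import math


def semitone_to_hz(n, ref_hz=440.0):
    """Frequency n equal-tempered semitones above (negative: below) ref_hz.

    ref_hz = 440 Hz is an assumed reference pitch.
    """
    return ref_hz * 2.0 ** (n / 12.0)


def hz_to_mel(f_hz):
    """Convert Hz to mels using one common formula; other variants exist."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)
```

Each semitone multiplies the frequency by the same factor (the twelfth root of two), so twelve semitones double it. The mel value, by contrast, grows almost in proportion to frequency for low frequencies and flattens toward a logarithmic curve higher up.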
The intensity range of audible sounds is enormous. Our eardrums are sensitive only to variations in sound pressure, but can detect pressure changes as small as 2×10⁻¹⁰ atm and as great as, or greater than, 1 atm. For this reason, sound pressure level is also measured logarithmically, with all pressures referenced to 1.97385×10⁻¹⁰ atm (20 µPa). The lower limit of audibility is therefore defined as 0 dB, but the upper limit is not as clearly defined. While 1 atm (about 194 dB) is the largest pressure variation an undistorted sound wave can have in Earth's atmosphere, larger sound waves can be present in other atmospheres, or on Earth in the form of shock waves. The upper limit is more a question of the level at which the ear will be physically harmed or hearing may be permanently impaired. This limit also depends on the duration of exposure. The ear can be exposed for short periods to levels in excess of 120 dB without permanent harm, albeit with discomfort and possibly pain, but long-term exposure to sound levels over 80 dB can cause permanent hearing loss.
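The logarithmic measure can be worked through directly. The sketch below assumes the usual definition of sound pressure level, 20·log10(p / p_ref), with the reference pressure expressed in pascals (20 µPa, which is the same quantity as 1.97385×10⁻¹⁰ atm); the names are illustrative.

```python
import math

ATM_PA = 101325.0  # one standard atmosphere in pascals
P_REF_PA = 20e-6   # reference pressure: 20 micropascals


def spl_db(pressure_pa):
    """Sound pressure level in dB relative to the 20 micropascal reference."""
    return 20.0 * math.log10(pressure_pa / P_REF_PA)
```

With these values the reference pressure itself sits at 0 dB, and a pressure variation of one full atmosphere works out to about 194 dB.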
A more rigorous exploration of the lower limits of audibility determines that the minimum threshold at which a sound can be heard is frequency dependent. By measuring this minimum intensity for testing tones of various frequencies, a frequency dependent Absolute Threshold of Hearing (ATH) curve may be derived. Typically, the ear shows a peak of sensitivity (i.e., its lowest ATH) between 1 kHz and 5 kHz, though the threshold changes with age, with older ears showing decreased sensitivity above 2 kHz.
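The shape of such a curve can be sketched with a closed-form approximation that is widely quoted in the audio-coding literature and usually attributed to Terhardt. It is an assumption brought in for illustration, not a formula from this article, and it represents a young listener with good hearing rather than measured data.

```python
import math


def ath_db_spl(f_hz):
    """Approximate absolute threshold of hearing in dB SPL at frequency f_hz.

    Assumed closed-form fit (commonly attributed to Terhardt); illustrative only.
    """
    k = f_hz / 1000.0  # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)
```

Evaluated over the audible range, the approximation is lowest between 3 and 4 kHz and rises steeply toward both very low and very high frequencies, consistent with the sensitivity peak described above.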
The ATH is the lowest of the equal-loudness contours. Equal-loudness contours indicate the sound pressure levels (in dB), over the range of audible frequencies, at which tones are perceived as being equally loud. Equal-loudness contours were first measured by Fletcher and Munson at Bell Labs in 1933 using pure tones reproduced via headphones, and the data they collected are called Fletcher-Munson curves. Because subjective loudness was difficult to measure, the Fletcher-Munson curves were averaged over many subjects.
Robinson and Dadson refined the process in 1956 to obtain a new set of equal-loudness curves for a frontal sound source measured in an anechoic chamber. The Robinson-Dadson curves were standardized as ISO 226 in 1986. In 2003, ISO 226 was revised using equal-loudness data collected from 12 international studies.
Masking effects
In some situations an otherwise clearly audible sound can be masked by another sound. For example, conversation at a bus stop can become impossible while a loud bus is driving past. This phenomenon is called masking. A weaker sound is masked if it is made inaudible in the presence of a louder sound. Masking occurs because a loud sound effectively raises the threshold of hearing around it, making quieter, otherwise perceptible sounds inaudible.
If two sounds occur simultaneously and one is masked by the other, this is referred to as simultaneous masking. Simultaneous masking is also sometimes called frequency masking. The tonality of a sound partially determines its ability to mask other sounds. A sinusoidal masker, for example, requires a higher intensity to mask a noise-like maskee than a loud noise-like masker does to mask a sinusoid. Computer models which calculate the masking caused by sounds must therefore classify their individual spectral peaks according to their tonality.
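A toy version of such a computer model is sketched below. It converts frequency to the Bark scale, spreads a single masker's level across neighbouring Bark bands with a published spreading function (Schroeder's), and subtracts an offset. The Bark formula, the spreading function and especially the 10 dB default offset are assumptions chosen for illustration; real models pick the offset according to whether the masker is tonal or noise-like, and combine many maskers.

```python
import math


def hz_to_bark(f_hz):
    """Approximate critical-band rate (Bark) for a frequency in Hz."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def spreading_db(dz):
    """Schroeder spreading function: attenuation (dB) at a distance of dz Bark."""
    return 15.81 + 7.5 * (dz + 0.474) - 17.5 * math.sqrt(1.0 + (dz + 0.474) ** 2)


def masked_threshold_db(masker_hz, masker_db, probe_hz, offset_db=10.0):
    """Level below which a probe tone is assumed inaudible next to one masker.

    offset_db is a hypothetical placeholder for the tonality-dependent offset.
    """
    dz = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    return masker_db + spreading_db(dz) - offset_db


def is_masked(masker_hz, masker_db, probe_hz, probe_db, offset_db=10.0):
    """True if the probe falls below the masked threshold of the single masker."""
    return probe_db < masked_threshold_db(masker_hz, masker_db, probe_hz, offset_db)
```

Under these assumptions an 80 dB masker at 1 kHz hides a 40 dB tone at 1.1 kHz but not a 40 dB tone at 4 kHz, and the masking reaches further toward higher frequencies than toward lower ones.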
Similarly, a weak sound emitted soon after the end of a louder sound is masked by the louder sound. Even a weak sound just before a louder sound can be masked by the louder sound. These two effects are called forward and backward temporal masking, respectively.
'Phantom' fundamentals
Low pitches can sometimes be heard when there is no apparent source or component of that frequency. This perception is due to the brain interpreting repetition patterns determined by the differences of audible harmonics that are present.[7] A harmonic series of pitches related as 2×f, 3×f, 4×f, 5×f, etc., gives human hearing the psychoacoustic impression that the pitch 1×f is present. This phenomenon is used by some pro audio manufacturers to allow sound systems to seem to produce notes that are lower in pitch than they are capable of reproducing.[8][9]
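For an exact harmonic series, the implied pitch can be found arithmetically: it is the greatest common divisor of the component frequencies, which is also the rate at which the combined waveform repeats. The sketch below is a simplification that assumes whole-number frequencies in Hz; actual pitch perception also tolerates slightly mistuned components, which this does not model. The function name is hypothetical.

```python
from functools import reduce
from math import gcd


def implied_fundamental(harmonics_hz):
    """Greatest common divisor of integer component frequencies, in Hz."""
    return reduce(gcd, harmonics_hz)
```

Components at 400, 600, 800 and 1000 Hz (2×f through 5×f) imply a fundamental of 200 Hz even though no energy is present at that frequency.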
Software
The psychoacoustic model provides for high-quality lossy signal compression by describing which parts of a given digital audio signal can be removed (or aggressively compressed) safely, that is, without significant losses in the (consciously) perceived quality of the sound.
It can explain how a sharp clap of the hands might seem painfully loud in a quiet library, but is hardly noticeable after a car backfires on a busy, urban street. This provides great benefit to the overall compression ratio, and psychoacoustic analysis routinely leads to compressed music files that are 1/10 to 1/12 the size of high quality original masters with very little discernible loss in quality. Such compression is a feature of nearly all modern audio compression formats. Some of these formats include MP3, Ogg Vorbis, AAC, WMA, MPEG-1 Layer II (used for digital audio broadcasting in several countries) and ATRAC, the compression used in MiniDisc and Walkman.
Psychoacoustics is based heavily on human anatomy, especially the ear's limitations in perceiving sound as outlined previously. To summarize, these limitations are:
- High-frequency limit
- Absolute threshold of hearing
- Temporal masking
- Simultaneous masking
Given that the ear will not be at peak perceptive capacity when dealing with these limitations, a compression algorithm can assign a lower priority to sounds outside the range of human hearing. By carefully shifting bits away from the unimportant components and toward the important ones, the algorithm ensures that the sounds a listener is most likely to perceive are of the highest quality.
Music
Psychoacoustics includes topics and studies which are relevant to music psychology. Theorists such as Benjamin Boretz consider some of the results of psychoacoustics to be meaningful only in a musical context.
Applied psychoacoustics
Psychoacoustics is presently applied in many fields: in software development, where developers map proven and experimental mathematical patterns; in digital signal processing, where many audio compression codecs such as MP3 use a psychoacoustic model to increase compression ratios; in the design of high-end audio systems for accurate reproduction of music in theatres and homes; and in defense systems, where scientists have experimented with limited success in creating new acoustic weapons that emit frequencies which may impair, harm, or kill. It is also applied in music, where musicians and artists continue to create new auditory experiences by masking unwanted frequencies of instruments, causing other frequencies to be enhanced. Yet another application is the design of small or lower-quality loudspeakers, which use the phenomenon of missing fundamentals to give the effect of low-frequency bass notes that the system, due to frequency limitations, cannot actually reproduce (see references).
See also
- A-weighting, a commonly used perceptual loudness transfer function
- Audio compression
- Auditory illusions
- Auditory scene analysis incl. 3D-sound perception, localisation
- Bark scale, Equivalent rectangular bandwidth (ERB), Mel scale and other scales
- Perception of non-existent sounds, such as missing fundamental frequency and other auditory illusions. Compare to telephone which transmits 300 Hz to 3400 Hz
- Equal-loudness contour
- Haas effect
- Language processing
- Loudness, that is, perceived volume, Bel, sone
- Mozart effect
- Musical tuning
- Noise health effects
- Psycholinguistics
- Rate-distortion theory
- Sound localization
- Sound of fingernails scraping chalkboard
- Source separation
- Sound masking
- Speech recognition
- Timbre
References
Footnotes
[1] Christopher J. Plack (2005). The Sense of Hearing. Routledge. ISBN 0805848843. http://books.google.com/books?id=DoGzm3soUoMC&pg=PA65&dq=ear+hearing+cochlea++inauthor:plack&lr=&as_brr=3&ei=z0emSN2LJo3sswO7g-2dBQ&sig=ACfU3U1lfPTX-igjhSgGUD6eObrQlcqL7g.
[2] Lars Ahlzen, Clarence Song (2003). The Sound Blaster Live! Book. No Starch Press. ISBN 1886411735. http://books.google.com/books?id=tKO-truWww8C&pg=PA310&dq=mp3++imperceptible+ear&lr=&as_brr=3&ei=gUimSMP9D5fUtAP0yp2eBQ&sig=ACfU3U3eupVEYqdtBT-_7tLrD-572cA7HQ.
[3] Rudolf F. Graf (1999). Modern dictionary of electronics. Newnes. ISBN 0750698667. http://books.google.com/books?id=o2I1JWPpdusC&pg=PA137&dq=compression+expansion+noise-reduction+telephone&lr=&as_brr=3&ei=p0mmSMb5Joa2tgOvzqGeBQ&sig=ACfU3U3vnf20ljMFnFneQlWnYGk8SuxwGQ.
[4] Jack Katz, Robert F. Burkard, and Larry Medwetsky (2002). Handbook of Clinical Audiology. Lippincott Williams & Wilkins. ISBN 0683307657. http://books.google.com/books?id=Aj6nVIegE6AC&pg=PA43&dq=beat+distortion++ear&lr=&as_brr=3&ei=8EumSM3oIIOEswP0-IieBQ&sig=ACfU3U3m4oRu5h6MU3zsvfeZjzabodf_8g.
[5] http://www.cco.caltech.edu/~boyk/spectra/spectra.htm
[6] Olson, Harry F. (1967). Music, Physics and Engineering. Dover Publications. pp. 248–251. ISBN 0486217698. http://books.google.com/books?id=RUDTFBbb7jAC.
[7] Colin Yallop and Janet Fletcher (2007). An Introduction to Phonetics and Phonology. Blackwell Publishing. ISBN 1405130830. http://books.google.com/books?id=dX5P5mxtYYIC&pg=PA233&dq=phantom-fundamental+pitch+perception&lr=&as_brr=0&ei=ESCaR_m9DIfgswPHlMx9&sig=tbYP69o6YD3EPOqE-SOynLfMdhg.
[8] Waves Car Audio. MaxxBass Bass Enhancement Technology.
[9] US patent 5930373. Method and system for enhancing quality of sound signal.
Notations
- E. Larsen and R.M. Aarts (2004), Audio Bandwidth extension. Application of Psychoacoustics, Signal Processing and Loudspeaker Design., J. Wiley.
- E. Larsen and R.M. Aarts (2002), Reproducing low-pitched signals through small loudspeakers, J. Audio Eng. Soc., March, 50 (3), pp. 147-164.
- T. Oohashi, N. Kawai, E. Nishina, M. Honda, R. Yagi, S. Nakamura, M. Morimoto, T. Maekawa, Y. Yonekura, and H. Shibasaki. The role of biological system other than auditory air-conduction in the emergence of the hypersonic effect http://dx.doi.org/10.1016/j.brainres.2005.12.096. Brain Research, 1073:339–347, February 2006.
External links
- The musical ear - Perception of sound
- Applied psychoacoustics in space flight - Simulation of free field hearing by head phones
- GPSYCHO - an open source psycho-acoustic and noise shaping model for ISO based MP3 encoders.
- How audio codecs work - Psychoacoustics