MPEG Audio Compression: Psychoacoustics, Critical Bands, and Masking in ECE160 Lecture 14

An overview of MPEG audio compression, focusing on psychoacoustic concepts such as dynamic range, equal-loudness relations, the threshold of hearing, and frequency masking. The notes also cover critical bands and the Bark unit. The lecture is from ECE160 / CMPS182, a multimedia course offered in Spring 2007.

Uploaded on 09/17/2009

ECE160 / CMPS182
Multimedia
Lecture 14: Spring 2007
MPEG Audio Compression

Psychoacoustics

  • The range of human hearing is about 20 Hz to about 20 kHz
  • The frequency range of the voice is typically only from about 500 Hz to 4 kHz
  • The dynamic range, the ratio of the maximum sound amplitude to the quietest sound that humans can hear, is on the order of 120 dB
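To get a feel for what 120 dB means as a ratio, the sketch below converts between decibels and amplitude ratios. It assumes the usual amplitude convention, dB = 20 log10(ratio); the function names are illustrative and not from the lecture.

```python
import math


def db_to_amplitude_ratio(level_db: float) -> float:
    """Convert a level difference in dB to an amplitude ratio (20*log10 convention)."""
    return 10.0 ** (level_db / 20.0)


def amplitude_ratio_to_db(ratio: float) -> float:
    """Convert an amplitude ratio to a level difference in dB."""
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    ratio = db_to_amplitude_ratio(120.0)
    print(f"120 dB corresponds to an amplitude ratio of about {ratio:.0e}")
```

Under this convention, a 120 dB dynamic range means the loudest sound has roughly a million times the amplitude of the quietest audible one.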

Threshold of Hearing

  • Threshold of human hearing, for pure tones: a tone is audible only if its level is above the dB threshold at its frequency
  • Turning up a tone until it equals or surpasses the threshold curve means that we can then hear it
  • An approximate formula exists for this curve:
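The formula itself is not reproduced in these notes. One widely used approximation, assumed here rather than taken from the slide, is Terhardt's fit, with f in Hz and the result in dB SPL: Threshold(f) = 3.64 (f/1000)^(-0.8) - 6.5 exp(-0.6 (f/1000 - 3.3)^2) + 10^(-3) (f/1000)^4. A minimal sketch, with an illustrative function name:

```python
import math


def threshold_of_hearing_db(f_hz: float) -> float:
    """Approximate threshold in quiet (dB SPL) for a pure tone at f_hz.

    Assumption: Terhardt's approximation; the lecture's own formula
    is not available in these notes.
    """
    if f_hz <= 0:
        raise ValueError("frequency must be positive")
    k = f_hz / 1000.0  # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)


if __name__ == "__main__":
    for f in (100, 1000, 3300, 10000):
        print(f"{f:>6} Hz: {threshold_of_hearing_db(f):6.1f} dB")
```

Under this approximation the ear is most sensitive around 3 to 4 kHz, and the threshold rises steeply toward both ends of the audible range.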

Frequency Masking

  • Lossy audio data compression methods, such as MPEG/Audio encoding, do not encode some sounds which are masked anyway
  • The general situation in regard to masking is as follows:
    1. A lower tone can effectively mask (make us unable to hear) a higher tone
    2. The reverse is not true - a higher tone does not mask a lower tone well
    3. The greater the power in the masking tone, the wider is its influence - the broader the range of frequencies it can mask.
    4. As a consequence, if two tones are widely separated in frequency then little masking occurs
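The four observations above can be illustrated with a toy model. The sketch below is not the MPEG psychoacoustic model: it assumes a triangular masking curve on the critical-band (Bark) scale covered later in these notes, and the offset and slope values are illustrative placeholders chosen only to make the asymmetry visible.

```python
def masked_threshold_db(masker_bark: float, masker_db: float, test_bark: float,
                        offset_db: float = 10.0,
                        lower_slope: float = 27.0,
                        upper_slope: float = 10.0) -> float:
    """Level below which a test tone is assumed inaudible next to a masker.

    All defaults are hypothetical: the threshold falls off steeply
    (lower_slope, dB per Bark) below the masker and gently
    (upper_slope, dB per Bark) above it.
    """
    distance = test_bark - masker_bark
    if distance >= 0:
        drop = upper_slope * distance      # test tone is higher than the masker
    else:
        drop = lower_slope * (-distance)   # test tone is lower than the masker
    return masker_db - offset_db - drop


def is_masked(test_db: float, masker_bark: float, masker_db: float,
              test_bark: float) -> bool:
    """True if the test tone lies below the toy masking threshold."""
    return test_db < masked_threshold_db(masker_bark, masker_db, test_bark)


if __name__ == "__main__":
    # A 60 dB masker at 8 Bark and a 25 dB test tone two Barks away.
    print(is_masked(25.0, 8.0, 60.0, 10.0))  # test tone above the masker
    print(is_masked(25.0, 8.0, 60.0, 6.0))   # test tone below the masker
```

With these placeholder slopes, the tone above the masker is hidden while the tone at the same distance below it is not, and raising the masker level widens the range of test frequencies that fall under the threshold.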

Frequency Masking Curve

Critical Bands and Bandwidth

Bark Unit

  • The Bark unit is defined as the width of one critical band, for any masking frequency
  • The idea of the Bark unit: every critical band width is roughly equal in terms of Barks
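To make the Bark scale concrete, the sketch below maps a frequency in Hz to a critical-band number. It assumes one common analytic approximation (Zwicker and Terhardt's arctangent fit); the slides do not give a formula here, so treat both the formula choice and the function name as assumptions.

```python
import math


def hz_to_bark(f_hz: float) -> float:
    """Approximate critical-band rate (Bark) for a frequency in Hz.

    Assumption: Zwicker-Terhardt approximation,
    z = 13 atan(0.00076 f) + 3.5 atan((f / 7500)^2).
    """
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


if __name__ == "__main__":
    for f in (100, 500, 1000, 4000, 10000, 20000):
        print(f"{f:>6} Hz -> {hz_to_bark(f):5.2f} Bark")
```

Under this approximation the audible range up to 20 kHz spans a little under 25 Barks, in line with the roughly 25 critical bands used in practice.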

Temporal and Frequency Masking

  • For a masking tone that is played for a longer time, it takes longer before a test tone can be heard.

Solid curve: masking tone played for 200 msec; dashed curve: masking tone played for 100 msec.

MPEG Layers

  • MPEG audio offers three compatible layers:
    • A decoder for each succeeding layer can also decode the lower layers
    • Each succeeding layer offers more complexity in the psychoacoustic model and better compression for a given level of audio quality
    • Each succeeding layer's increased compression effectiveness is accompanied by extra delay
  • The objective of MPEG layers: a good tradeoff between quality and bit-rate

  • Layer 1 quality can be quite good, provided a comparatively high bit-rate is available
  • Digital Compact Cassette (DCC) used a Layer 1 variant at 192 kbps per channel; Digital Audio Tape (DAT), by contrast, records uncompressed PCM
  • Layer 2 has more complexity; it was proposed for use in Digital Audio Broadcasting
  • Layer 3 (MP3) is the most complex, and was originally aimed at audio transmission over ISDN lines
  • Most of the complexity increase is at the encoder, not the decoder, which accounts for the popularity of MP3 players

MPEG Audio Strategy

  • Frequency masking: a psychoacoustic model is used to estimate the just noticeable noise level:
    • The encoder balances the masking behavior and the available number of bits by discarding inaudible frequencies
    • Quantization is scaled according to the sound level that is left over, above masking levels
  • May take into account the actual width of the critical bands:
    • For practical purposes, audible frequencies are divided into 25 main critical bands
    • For simplicity, the encoder adopts a uniform width for all frequency analysis filters, using 32 overlapping subbands
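The mismatch between 32 uniform subbands and the non-uniform critical bands can be seen numerically. The sketch below assumes a 48 kHz sampling rate (not stated in the lecture), ignores the overlap between adjacent filters, and reuses an assumed Bark approximation (Zwicker and Terhardt's); all function names are illustrative.

```python
import math


def hz_to_bark(f_hz: float) -> float:
    """Assumed Zwicker-Terhardt approximation of the Bark scale."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def uniform_subbands(sample_rate_hz: float, num_subbands: int = 32):
    """Nominal (lower, upper) edges in Hz of equal-width subbands up to Nyquist.

    Filter overlap is ignored; only the nominal band edges are returned.
    """
    width = (sample_rate_hz / 2.0) / num_subbands
    return [(i * width, (i + 1) * width) for i in range(num_subbands)]


def subband_widths_in_bark(sample_rate_hz: float, num_subbands: int = 32):
    """Width of each uniform subband, expressed in Barks."""
    return [hz_to_bark(hi) - hz_to_bark(lo)
            for lo, hi in uniform_subbands(sample_rate_hz, num_subbands)]


if __name__ == "__main__":
    bands = uniform_subbands(48000.0)
    widths = subband_widths_in_bark(48000.0)
    for index in (0, 1, 15, 31):
        lo, hi = bands[index]
        print(f"subband {index:2d}: {lo:7.0f}-{hi:7.0f} Hz, {widths[index]:.2f} Bark")
```

Under these assumptions each subband is 750 Hz wide. The lowest subband covers several critical bands at once, while the highest subbands each cover only a small fraction of one, which is the price paid for the simpler uniform filter bank.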

MPEG Audio Compression Algorithm