Nyquist Sampling Theorem - Midterm Exam | CS 598 | Exams Computer Science

CS598KN Midterm Solutions

Audio:

1: The Nyquist Sampling Theorem states that “If a signal is sampled at a rate
higher than twice the highest significant signal frequency, then the samples
contain all the information of the original signal.” To fully reconstruct
signals that are audible to the human ear (frequencies of up to 20 kHz), a
sampling rate of at least 2 x 20 kHz = 40 kHz is needed. CD sound is sampled at
44.1 kHz, slightly more than twice that frequency, to leave a margin for
imprecisions such as the non-ideal roll-off of the anti-aliasing filter.
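
As a rough numeric check of the rule above, the sketch below computes the
minimum sampling rate for a given bandwidth and the frequency at which a pure
tone would appear after sampling. The function names are illustrative
assumptions, not part of the exam.

```python
def nyquist_rate(f_max_hz):
    """Rate (Hz) that sampling must exceed for a signal band-limited to f_max_hz."""
    return 2 * f_max_hz


def aliased_frequency(f_hz, fs_hz):
    """Apparent frequency (Hz) of a pure tone at f_hz after sampling at fs_hz."""
    # Sampling folds every frequency into the range [0, fs/2].
    f_mod = f_hz % fs_hz
    return min(f_mod, fs_hz - f_mod)


print(nyquist_rate(20_000))               # 40000
print(aliased_frequency(20_000, 44_100))  # 20000: preserved at the CD rate
print(aliased_frequency(20_000, 30_000))  # 10000: aliased, the rate is too low
```

A 20 kHz tone survives sampling at 44.1 kHz but folds down to 10 kHz when the
sampling rate is only 30 kHz.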

2: Hummed songs are converted into a string over the characters U, D, and S
(Up, Down, and Same), which represents the sequence of relative differences in
pitch. Songs in the audio database are pre-processed in the same way, which
turns the problem of audio matching into one of string matching. The
string-matching algorithm allows up to k mismatches.
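
The idea can be sketched as follows. This is a minimal brute-force version
that reads "k mismatches" as at most k substituted characters; the answer does
not name a specific matching algorithm, so the function names, the sliding
window search, and the example pitch values are all assumptions.

```python
def to_contour(pitches):
    """Map a pitch sequence to a U/D/S string of relative pitch changes."""
    contour = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            contour.append("U")
        elif cur < prev:
            contour.append("D")
        else:
            contour.append("S")
    return "".join(contour)


def find_with_mismatches(query, song, k):
    """Return start offsets where query matches song with at most k mismatches."""
    hits = []
    for start in range(len(song) - len(query) + 1):
        window = song[start:start + len(query)]
        mismatches = sum(a != b for a, b in zip(query, window))
        if mismatches <= k:
            hits.append(start)
    return hits


# Hypothetical pitch values (e.g. MIDI note numbers) for a hummed query and a song.
hummed = to_contour([60, 62, 62, 59, 64])        # "USDU"
song = to_contour([55, 57, 59, 59, 56, 61, 61])  # "UUSDUS"
print(find_with_mismatches(hummed, song, k=1))   # [1]
```

Because only the direction of each pitch change is kept, the query matches
even though it was hummed in a different key from the stored song.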

3: The psycho-acoustic model attempts to account for how humans actually
perceive sound. As such, it translates the physical properties of sound
(frequency, level, and duration) into measures perceived by humans (critical
band rate, loudness, and subjective duration). The critical-band-rate scale
follows a linear frequency scale up to 500 Hz, and a logarithmic frequency
scale above 500 Hz. The masking effect, which can occur in both the frequency
and time domains, is related to the model because it explains how we hear:
certain sounds mask others, and the masked sounds therefore become acoustically
irrelevant.
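
To make the shape of the critical-band-rate scale concrete, the sketch below
evaluates a commonly used analytic approximation of the Bark scale (Zwicker
and Terhardt). The formula is not given in the exam and is brought in here as
an assumption; it only approximates the tabulated critical bands.

```python
import math


def critical_band_rate(f_hz):
    """Approximate critical-band rate in Bark for a frequency in Hz."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


# Print the critical-band rate at a few frequencies across the audible range.
for f in (100, 200, 500, 1000, 4000, 16000):
    print(f, round(critical_band_rate(f), 2))
```

At low frequencies the value grows roughly in proportion to frequency, while
at high frequencies each additional Bark spans a much wider range in Hz.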

4: MPEG audio compression makes heavy use of the psycho-acoustic model to
remove acoustically irrelevant parts of the audio signal, in order to achieve a
high compression rate. In MPEG audio compression, the audio signal is first
divided into 32 frequency sub-bands. The psycho-acoustic model then analyzes
the amount of masking for each sub-band: if the energy in a band is below the
masking threshold, the band is not encoded. Otherwise, it is allocated a number
of bits to represent the coefficient. Bit allocation is determined by the
signal-to-mask ratio (SMR, the ratio of the signal energy to the minimum
masking threshold) in such a way that sub-bands with large SMRs are allocated
more bits, so that their quantization noise stays below the mask, and those
with small SMRs are allocated fewer bits.
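
The per-sub-band decision described above can be sketched as follows. This is
a simplified illustration, not the encoder defined by the MPEG standard: the
function names and the example energies and thresholds are hypothetical, and
only 4 of the 32 sub-bands are shown.

```python
import math


def signal_to_mask_ratio_db(signal_energy, mask_threshold):
    """SMR in dB: sub-band signal energy relative to its minimum masking threshold."""
    return 10.0 * math.log10(signal_energy / mask_threshold)


def audible_subbands(energies, thresholds):
    """Return {sub-band index: SMR in dB} for sub-bands that are not masked."""
    smr = {}
    for band, (energy, threshold) in enumerate(zip(energies, thresholds)):
        if energy < threshold:
            continue  # below the masking threshold: the sub-band is not encoded
        smr[band] = signal_to_mask_ratio_db(energy, threshold)
    return smr


# Hypothetical energies and masking thresholds in linear units.
energies = [100.0, 0.5, 10.0, 2.0]
thresholds = [1.0, 1.0, 1.0, 4.0]
print(audible_subbands(energies, thresholds))
```

Sub-bands 1 and 3 fall below their masking thresholds and are dropped; the SMR
values of the remaining sub-bands are what the bit allocation step works from.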

5: Knowledge of the human auditory system (HAS) is heavily used in MPEG audio
compression and immersive audio systems.

In MPEG audio compression, the psycho-acoustic model is used to identify signal
components that are inaudible due to masking effects, and to allocate bits
judiciously based on the signal-to-mask ratio. The purpose of the
psycho-acoustic model is to achieve a high compression rate while preserving
the audible quality as much as possible.
