# Audio Representation - Multimedia - Lecture Slides, Slides for Multimedia Applications. English and Foreign Languages University

Description
The main points are: Audio Representation, Processing, Fundamentals of Audio Signals, Two Signals, Different Amplitudes, Greater Amplitude Represents, Louder Sound, Different Frequencies, Frequency Represents, Represented

Audio Representation and Processing

Docsity.com

Fundamentals of Audio Signals

• Two signals of different amplitudes
• A greater amplitude represents a louder sound.

Fundamentals of Audio Signals

• Two signals of different frequencies
• A greater frequency represents a higher-pitched sound.

Fundamentals of Audio Signals

• Any sound, no matter how complex, can be represented by a waveform.

• For complex sounds, the waveform is built up by the superposition of less complex waveforms

• The component waveforms can be discovered by applying the Fourier Transform
  – Converts the signal to the frequency domain
  – Inverse Fourier Transform converts back to the time domain
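As a rough illustration of the points above, the sketch below builds a complex waveform by superposing two sine waves and then recovers the component frequencies with a naive discrete Fourier transform. The sample rate, the two frequencies and the helper name `dft_magnitudes` are assumptions chosen for the example; they do not come from the slides.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive O(N^2) discrete Fourier transform; returns the magnitude of each bin."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)))
        for k in range(n)
    ]

RATE = 64  # samples per second (assumed); one second of signal gives 1 Hz bins

# Superposition of a 5 Hz wave and a quieter 12 Hz wave.
signal = [
    math.sin(2 * math.pi * 5 * i / RATE) + 0.5 * math.sin(2 * math.pi * 12 * i / RATE)
    for i in range(RATE)
]

magnitudes = dft_magnitudes(signal)
# For a real-valued signal only the first half of the bins is unique.
half = magnitudes[: RATE // 2]
peaks = sorted(sorted(range(len(half)), key=lambda k: half[k], reverse=True)[:2])
print(peaks)  # the two strongest frequency bins, in Hz
```

In practice a fast Fourier transform would be used instead of this quadratic loop; the naive form is shown only because it maps directly onto the definition.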

Sampling

• Sounds can be thought of as functions of a single variable (t), which must be sampled and quantized
• The sampling rate is given in samples per second (Hz, or kHz for thousands of samples per second)
  – During the sampling process, an analog signal is sampled at discrete intervals
  – At each interval, the signal is momentarily “held” and represents a measurable voltage level
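The sampling step can be sketched as evaluating a function of time at evenly spaced instants. The function names `sample` and `tone`, and the 1 kHz test tone, are assumptions made for this illustration.

```python
import math

def sample(signal, rate_hz, duration_s):
    """Evaluate a continuous function of time at evenly spaced instants."""
    count = round(rate_hz * duration_s)
    return [signal(i / rate_hz) for i in range(count)]

def tone(t):
    # A 1 kHz sine wave standing in for the analog input (assumed).
    return math.sin(2 * math.pi * 1000 * t)

# One millisecond of the tone at 8000 samples per second.
samples = sample(tone, rate_hz=8000, duration_s=0.001)
print(len(samples))
```

Each list element plays the role of the voltage that is "held" at one sampling instant; quantization would then round each value to a fixed number of bits.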

Quantization

• Audio is usually quantized at between 8 and 20 bits
  – Voice data is usually quantized at 8 bits
  – Professional audio uses 16 bits
  – Digital signal processors will often use a 24- or 32-bit structure internally

Quantization

• The accuracy of the digital encoding can be approximated by considering the word length per sample
• This accuracy is known as the signal-to-error ratio (S/E) and is given approximately by:
  – S/E = 6n + 1.8 dB
  – n is the number of bits per sample
  – For example, 16 bits per sample gives S/E ≈ 97.8 dB
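The S/E approximation is easy to tabulate for the word lengths mentioned on the quantization slides. The function name below is an assumption; the formula is the one given above.

```python
def signal_to_error_db(bits):
    """Approximate signal-to-error ratio in dB for a given word length: 6n + 1.8."""
    return 6 * bits + 1.8

# Word lengths mentioned for voice, professional audio and DSP internals.
for bits in (8, 16, 24):
    print(bits, "bits:", round(signal_to_error_db(bits), 1), "dB")
```

Each additional bit adds roughly 6 dB, which is why moving from 8 to 16 bits makes such a large audible difference.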

Quantization

• When a coarse quantization is used, it may be useful to add a high-frequency signal (analog white noise) to the signal before it is quantized
  – This will make the coarse quantization less perceptible when the signal is played back
  – This technique is known as dithering
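A small numerical sketch may help show why dithering works. It assumes a constant input level that falls between two quantization levels and uses uniform pseudo-random noise as a stand-in for the analog white noise; the names `quantize`, `level` and the seed are all illustrative choices.

```python
import random

def quantize(x, step=1.0):
    """Round a value to the nearest multiple of the quantization step."""
    return step * round(x / step)

rng = random.Random(0)  # fixed seed so the run is repeatable
level = 0.3             # a signal value lying between quantization levels 0 and 1
n = 10_000

plain = [quantize(level) for _ in range(n)]
dithered = [quantize(level + rng.uniform(-0.5, 0.5)) for _ in range(n)]

print(sum(plain) / n)      # without dither the level is lost entirely
print(sum(dithered) / n)   # with dither the average stays close to the true level
```

Without noise every sample rounds to the same level, so the information below one quantization step disappears. With noise added first, the output toggles between neighbouring levels in proportion to the input, trading a fixed error for a low-level noise floor.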


Channels

• We may also have audio data coming from more than one channel
• Data from a multichannel source is usually interleaved
• Sampling rates are always measured per channel
  – Stereo data recorded at 8000 samples/second will actually generate 16,000 samples every second
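Interleaving can be sketched as alternating samples from each channel in a single stream. The function names and the placeholder sample values are assumptions for this illustration.

```python
def interleave(left, right):
    """Merge two equal-length channels into one L, R, L, R, ... stream."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    return [s for pair in zip(left, right) for s in pair]

def deinterleave(stream, channels=2):
    """Split an interleaved stream back into one list per channel."""
    return [stream[c::channels] for c in range(channels)]

RATE = 8000            # samples per second, per channel
left = [0] * RATE      # one second of placeholder left-channel data
right = [1] * RATE     # one second of placeholder right-channel data

stream = interleave(left, right)
print(len(stream))     # total samples produced in one second of stereo
```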

Digital Audio Data

• A complete description of digital audio data includes (at least):
  – Sampling rate
  – Number of bits per sample
  – Number of channels (1 for mono, 2 for stereo, etc.)
  – Type of quantization (linear, logarithmic, etc.)
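One way to keep these four parameters together is a small record type. The class name `AudioFormat`, its field names and the `bytes_per_second` helper are hypothetical; the example values are those of CD audio (44.1 kHz, 16-bit, stereo, linear).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AudioFormat:
    sample_rate: int        # samples per second, per channel
    bits_per_sample: int
    channels: int           # 1 for mono, 2 for stereo, ...
    quantization: str       # e.g. "linear" or "logarithmic"

    def bytes_per_second(self) -> int:
        """Raw data rate implied by the format (assumes whole-byte samples)."""
        return self.sample_rate * self.bits_per_sample * self.channels // 8

cd = AudioFormat(sample_rate=44_100, bits_per_sample=16, channels=2, quantization="linear")
print(cd.bytes_per_second())
```

Without all four values a block of raw samples cannot be interpreted reliably, which is why audio file headers typically carry this kind of description.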

Analog to Digital Conversion

Nyquist’s theorem states that if an arbitrary signal has been run through a low-pass filter of bandwidth H, the filtered signal can be completely reconstructed by taking only 2H (exact) samples per second.

• So, a low-pass filter is placed before the sampling circuitry of the analog-to-digital (A/D) converter.


Analog to Digital Conversion

• If frequencies greater than the Nyquist limit enter the digitization process, an unwanted condition called aliasing occurs

• A practical low-pass filter has a gradual high-frequency roll-off rather than a sharp cutoff, so a sampling rate somewhat higher than twice the highest frequency of interest is often used
• A/D conversion may make use of a successive approximation register (SAR)
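Aliasing can be demonstrated numerically: a tone above the Nyquist limit produces exactly the same samples as a lower-frequency tone. The sketch assumes a sampling rate of 8000 samples per second (so a limit of 4000 Hz); the function name `alias_frequency` is illustrative.

```python
import math

def alias_frequency(f_hz, rate_hz):
    """Frequency at which a pure tone of f_hz appears after sampling at rate_hz."""
    folded = f_hz % rate_hz
    return min(folded, rate_hz - folded)

RATE = 8000  # samples per second (assumed); the Nyquist limit is 4000 Hz

print(alias_frequency(3000, RATE))  # below the limit: unchanged
print(alias_frequency(5000, RATE))  # above the limit: folds back down

# The sampled values of the two tones are indistinguishable.
low = [math.cos(2 * math.pi * 3000 * n / RATE) for n in range(16)]
high = [math.cos(2 * math.pi * 5000 * n / RATE) for n in range(16)]
print(all(abs(a - b) < 1e-9 for a, b in zip(low, high)))
```

Once the samples are identical nothing downstream can tell the two tones apart, which is why the low-pass filter has to act before the sampling circuitry.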

Analog to Digital Conversion

• The low-pass filter can cause side effects.
  – One way that these side effects can be overcome is through the use of oversampling, a signal-processing function that raises the sample rate of a digitally encoded signal.
  – Consumer and professional 16-bit D/A converters often use up to 8- and 12-times oversampling, raising the sampling rate of a CD (for example) from 44.1 kHz to 352.8 kHz or 529.2 kHz.
  – By altering the signal’s noise characteristics, it is possible to shift much of the overall bandwidth noise out of the range of human hearing.

Pulse Code Modulation

• The method that has been discussed for storing audio is known as pulse code modulation (PCM).
  – Sample values: 1 5 14 12 5
  – Transmitted code: 0001 0101 1110 1100 0101
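The mapping from sample values to the transmitted code in the figure can be sketched as fixed-width binary encoding. The 4-bit word length is inferred from the figure's values; the function names are assumptions.

```python
def pcm_encode(samples, bits=4):
    """Turn quantized sample values into a string of fixed-width binary code words."""
    limit = 1 << bits
    for s in samples:
        if not 0 <= s < limit:
            raise ValueError(f"sample {s} does not fit in {bits} bits")
    return "".join(format(s, f"0{bits}b") for s in samples)

def pcm_decode(code, bits=4):
    """Recover the sample values from the transmitted bit string."""
    return [int(code[i:i + bits], 2) for i in range(0, len(code), bits)]

code = pcm_encode([1, 5, 14, 12, 5])
print(code)
print(pcm_decode(code))
```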

Pulse Code Modulation

• PCM is common in long-distance telephone lines.
  – The analog signal (voice) is sampled at 8000 samples/second with 7 or 8 bits per sample
  – A T1 carrier handles 24 voice channels multiplexed together
  – The bandwidth of this type of carrier can be calculated as follows:
    • 8 bits x 8000 samples/second x 24 channels = 1.536 Mbps of channel data
    • Adding 1 framing bit per frame (8000 frames/second) gives the 1.544 Mbps line rate
  – Note that one out of 8 bits is for control, not data.
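The T1 arithmetic can be checked directly. Note that the product of 8 bits, 8000 samples/second and 24 channels is 1.536 Mbps; the familiar 1.544 Mbps figure also counts one framing bit per frame, which is standard T1 framing rather than something stated on the slide. The constant names are illustrative.

```python
CHANNELS = 24
BITS_PER_SAMPLE = 8
FRAMES_PER_SECOND = 8000   # one sample from every channel per frame
FRAMING_BITS = 1           # per frame (standard T1 framing, assumed here)

payload_bps = BITS_PER_SAMPLE * CHANNELS * FRAMES_PER_SECOND
line_bps = (BITS_PER_SAMPLE * CHANNELS + FRAMING_BITS) * FRAMES_PER_SECOND

print(payload_bps / 1e6, "Mbps of channel data")
print(line_bps / 1e6, "Mbps line rate")
```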

Pulse Code Modulation

• D/A conversion process
  – Parallelize the serial bit stream
  – Generate an analog voltage analogous to the voltage level at the original time of sampling
  – An output sample-and-hold circuit is used to minimize spurious signal glitches
  – A final low-pass filter is inserted into the path
    • Smooths out the non-linear steps introduced by digital sampling

Pulse Code Modulation

• Other PCM topics:
  – mu-law and A-law companding
  – DPCM (differential PCM)
  – DM (delta modulation)
  – ADPCM (adaptive differential PCM)
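As one concrete example from this list, mu-law companding compresses a sample logarithmically before quantization, so quiet sounds receive more of the available levels. The sketch below uses the continuous mu-law curve with mu = 255; it works on samples normalized to [-1, 1] and does not reproduce the segmented 8-bit encoding used in real telephony codecs. The function names are assumptions.

```python
import math

MU = 255  # value used in North American and Japanese telephony

def mu_law_compress(x, mu=MU):
    """Map a sample in [-1, 1] onto a logarithmic scale in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)

quiet = 0.01
compressed = mu_law_compress(quiet)
print(compressed)                 # a quiet sample is pushed well up the scale
print(mu_law_expand(compressed))  # expanding recovers the original value
```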

Digital Signal Processing

• Processing of a digital signal to achieve special effects may generally be described in terms of some simple functions:
  – Addition
  – Multiplication
  – Delay
  – Resampling

Digital Signal Processing

• Addition of two signals is accomplished by adding the sample values of the signals at each sampling point: h(t) = f(t) + g(t)
  – We can add as many signals as desired together
• Multiplication of a given signal is represented as: g(t) = m*f(t), where m is the multiplication factor.
  – Multiplication is used to increase or decrease the gain (loudness) of a signal. If m > 1, g is louder than f. If m < 1, g is less loud than f
  – Note that when adding signals together or multiplying by a number greater than one, care must be taken that the result does not exceed the largest value the sample size can represent; otherwise the signal is clipped or overflows
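Addition and multiplication can be sketched on lists of integer samples. The example assumes signed 16-bit samples and handles the upper-limit problem by clipping, which is one common choice and not necessarily the one the lecture has in mind; the function names are illustrative.

```python
MAX_16 = 32767
MIN_16 = -32768

def clip(value):
    """Keep a sample inside the signed 16-bit range."""
    return max(MIN_16, min(MAX_16, value))

def add(f, g):
    """h(t) = f(t) + g(t), sample by sample."""
    return [clip(a + b) for a, b in zip(f, g)]

def gain(f, m):
    """g(t) = m * f(t); m > 1 is louder, m < 1 is quieter."""
    return [clip(round(m * a)) for a in f]

f = [1000, 20000, -30000]
g = [500, 20000, -10000]

print(add(f, g))      # the last two sums exceed the range and are clipped
print(gain(f, 0.5))   # halving the gain never overflows
print(gain(f, 2))     # doubling the gain clips the larger samples
```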

Digital Signal Processing

• Delay is an important effect described as follows: g(t) = f(t - d), where d is a delay time
  – Use delay and addition to model echo:
    • f(t) = HELLO
    • g(t) = f(t - d1), where 0 < d1, so g(t) = HELLO arriving d1 later
    • h(t) = f(t - d2), where 0 < d1 < d2, so h(t) = HELLO arriving d2 later
    • F(t) = f(t) + g(t) + h(t) = HELLO HELLO HELLO
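The echo model can be sketched with two helpers: one that shifts a signal later in time by a whole number of samples, and one that adds signals together. The short list `[1, 2, 3]` stands in for the word HELLO; the names `delay` and `mix` and the delay values are assumptions.

```python
def delay(f, d):
    """Shift the signal later by d samples, padding the start with silence."""
    return [0] * d + list(f)

def mix(*signals):
    """Add signals sample by sample, treating shorter ones as padded with silence."""
    length = max(len(s) for s in signals)
    return [sum(s[i] for s in signals if i < len(s)) for i in range(length)]

f = [1, 2, 3]                            # stands in for "HELLO"
echo = mix(f, delay(f, 4), delay(f, 8))  # original plus two delayed copies
print(echo)
```

The output contains the original pattern three times with gaps of silence between the copies, mirroring HELLO HELLO HELLO.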

Digital Signal Processing

• Now consider a more realistic echo effect. We need to make each succeeding echo softer. We can do this with multiplication.
  – g’(t) = m*g(t)
  – h’(t) = n*h(t), 0 < n < m < 1
  – F’(t) = f(t) + g’(t) + h’(t) = HELLO HELLO HELLO (each HELLO softer than the one before)
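A softer-with-each-repeat echo combines delay, multiplication and addition. In the sketch below each echo is described by a (delay, gain) pair; the function name `echo`, the pair representation and the specific delays and gains are assumptions made for the illustration.

```python
def echo(f, taps):
    """Mix f with delayed, scaled copies; taps is a list of (delay_in_samples, gain)."""
    length = len(f) + max((d for d, _ in taps), default=0)
    out = [0.0] * length
    for i, x in enumerate(f):
        out[i] += x                  # the original signal
        for d, g in taps:
            out[i + d] += g * x      # a softer copy arriving d samples later
    return out

f = [1.0, 1.0]
# Gains play the roles of m and n, with 0 < n < m < 1.
result = echo(f, [(3, 0.5), (6, 0.25)])
print(result)
```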

Digital Signal Processing

• When delays of 35-40 ms and greater are used, the listener perceives them as discrete delays
• Reducing the delay to the 15-35 ms range will create delays that are too closely spaced to be perceived as discrete delays
  – When used with instruments, the brain is fooled into thinking that more instruments are playing than there actually are
  – By combining several short-term delay modules that are slightly detuned in time, an effect known as chorusing can be achieved (used by guitarists, e.g.)

Pitch-Related Effects

• DSP functions are available that can alter the speed and pitch of an audio program. These can:
  – Change pitch without changing duration
  – Change duration without changing pitch
  – Change both duration and pitch
• The process for raising and lowering the pitch of a sample is shown on the next slides

Pitch-Related Effects

• Raising the pitch:
  – Start with the original waveform
  – Resample at 1/2 the original sample rate; 1/2 the samples are dropped
  – Now raise the outgoing rate
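Dropping half the samples can be sketched with a list slice. The waveform values and the function name are assumptions; a real resampler would normally low-pass filter the signal before discarding samples to avoid aliasing, which this sketch omits.

```python
def drop_every_other(samples):
    """Keep every second sample, halving the number of samples."""
    return samples[::2]

# Two cycles of a rough wave, 8 samples per cycle.
original = [0, 3, 5, 3, 0, -3, -5, -3] * 2
shorter = drop_every_other(original)
print(shorter)
```

The result still contains two cycles but only 4 samples per cycle. Played at the original output rate, the cycles go by twice as fast, so the pitch rises by an octave and the duration halves.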

Pitch-Related Effects

• Lowering the pitch:
  – Start with the original waveform
  – Apply sample interpolation
  – Drop the sampling rate back down to the original rate
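Sample interpolation can be sketched by inserting a new value between each pair of neighbouring samples. Linear interpolation (the average of the two neighbours) is assumed here as the simplest option; the slides do not say which interpolation method is used, and the function name is illustrative.

```python
def interpolate(samples):
    """Insert the average of each neighbouring pair, nearly doubling the sample count."""
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        out.extend([a, (a + b) / 2])
    out.append(samples[-1])
    return out

original = [0, 4, 0, -4, 0]      # one cycle of a rough wave
stretched = interpolate(original)
print(stretched)
```

The same cycle now spans about twice as many samples. Played at the original rate it takes twice as long, so the pitch drops by an octave.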