
Audio Representation and Processing

Docsity.com

**Fundamentals of Audio Signals**

• Two signals of different amplitudes
• A greater amplitude represents a louder sound.


**Fundamentals of Audio Signals**

• Two signals of different frequencies
• A greater frequency represents a higher pitched sound.


**Fundamentals of Audio Signals**

• Any sound, no matter how complex, can be represented by a waveform.

• For complex sounds, the waveform is built up by the superposition of less complex waveforms

• The component waveforms can be discovered by applying the Fourier Transform
– Converts the signal to the frequency domain
– Inverse Fourier Transform converts back to the time domain
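The decomposition described above can be sketched with a naive discrete Fourier transform. This is only an illustration: the function names `dft` and `idft`, the 64-sample window, and the two component frequencies are assumptions chosen for the example, and a real application would normally use an FFT library.

```python
import cmath
import math

def dft(samples):
    """Naive O(N^2) discrete Fourier transform (time domain -> frequency domain)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Inverse transform (frequency domain -> time domain), returning real samples."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

N = 64  # assumed window length for the example
# A "complex" waveform built by superposing two simple sines
# (3 cycles and 10 cycles per window; both values are arbitrary).
signal = [
    math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 10 * t / N)
    for t in range(N)
]

spectrum = dft(signal)
magnitudes = [abs(c) for c in spectrum[: N // 2]]
# The two strongest frequency bins recover the component waveforms.
peaks = sorted(range(N // 2), key=lambda k: magnitudes[k], reverse=True)[:2]
restored = idft(spectrum)
```

Here `peaks` holds the bin indices of the two component sines, and `restored` matches `signal` up to floating-point rounding.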


Sampling

• Sounds can be thought of as functions of a single variable (*t*) which must be sampled and quantized
• The *sampling rate* is given in samples per second (Hz), often expressed in kHz
– During the sampling process, an analog signal is sampled at discrete intervals
– At each interval, the signal is momentarily “held” and represents a measurable voltage level
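As a rough sketch of the sampling step, the code below evaluates a function of time at evenly spaced instants. The helper name `sample`, the 440 Hz test tone, and the 8000 samples/second rate are assumptions made for the example.

```python
import math

def sample(signal, sampling_rate_hz, duration_s):
    """Evaluate signal(t) at discrete intervals of 1 / sampling_rate_hz seconds."""
    count = round(sampling_rate_hz * duration_s)
    return [signal(i / sampling_rate_hz) for i in range(count)]

def tone(t):
    """Hypothetical analog input: a 440 Hz sine wave as a function of time t."""
    return math.sin(2 * math.pi * 440.0 * t)

# 10 ms of the tone sampled at 8000 samples/second (8 kHz).
samples = sample(tone, 8000, 0.01)
```

Each entry of `samples` plays the role of the "held" value at one sampling instant; quantization would then map it to an integer code.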


Quantization

• Audio is usually quantized at between 8 and 20 bits
– Voice data is usually quantized at 8 bits
– Professional audio uses 16 bits
– Digital signal processors will often use a 24- or 32-bit structure internally
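A minimal sketch of linear quantization at a given word length follows. It assumes samples normalized to the range [-1.0, 1.0] and a symmetric signed code range; both choices, and the names `quantize` and `dequantize`, are illustrative rather than taken from the slides.

```python
def quantize(x, bits):
    """Map x in [-1.0, 1.0] to a signed integer code of the given word length."""
    max_code = 2 ** (bits - 1) - 1      # e.g. 127 for 8 bits, 32767 for 16 bits
    x = max(-1.0, min(1.0, x))          # keep the input inside the valid range
    return round(x * max_code)

def dequantize(code, bits):
    """Map an integer code back to an approximate value in [-1.0, 1.0]."""
    return code / (2 ** (bits - 1) - 1)

voice_code = quantize(0.3, 8)    # coarse: 8-bit voice-style quantization
audio_code = quantize(0.3, 16)   # finer: 16-bit quantization
```

The round trip `dequantize(quantize(x, bits), bits)` differs from `x` by at most half a quantization step, which is why more bits give a more accurate encoding.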


Quantization

• The accuracy of the digital encoding can be approximated by considering the word length per sample

• This accuracy is known as the *signal-to-error ratio* (S/E) and is given by:
– S/E = 6*n* + 1.8 dB
– *n* is the number of bits per sample
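The formula above can be evaluated directly and compared against a measured ratio. The sketch below assumes a full-scale sine test signal and a simple rounding quantizer. Those choices, the test frequency, and the function names are assumptions for the example, so the measured figure should only be expected to land near the approximation.

```python
import math

def signal_to_error_db(bits):
    """Approximate signal-to-error ratio in dB: S/E = 6n + 1.8."""
    return 6 * bits + 1.8

def measured_ser_db(bits, count=20000):
    """Quantize a full-scale sine and measure signal power over error power in dB."""
    max_code = 2 ** (bits - 1) - 1
    signal_power = 0.0
    error_power = 0.0
    for i in range(count):
        x = math.sin(2 * math.pi * 0.0123457 * i)   # arbitrary test frequency
        y = round(x * max_code) / max_code          # quantize, then reconstruct
        signal_power += x * x
        error_power += (x - y) ** 2
    return 10 * math.log10(signal_power / error_power)

for n in (8, 12, 16):
    print(n, signal_to_error_db(n), round(measured_ser_db(n), 1))
```

Each extra bit adds about 6 dB, so moving from 8 to 16 bits improves the ratio by roughly 48 dB.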


Quantization

• When a coarse quantization is used, it may be useful to add a high-frequency signal (analog white noise) to the signal before it is quantized
– This will make the coarse quantization less perceptible when the signal is played back
– This technique is known as *dithering*
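A small numeric sketch can show why dithering helps. It assumes a constant input of 0.3 quantization steps and uniform random noise spanning plus or minus half a step. The noise shape, the seed, and the name `dithered_average` are assumptions for the example, not the analog circuit the slide describes.

```python
import random

def dithered_average(x, count=10000, seed=0):
    """Average of many quantized values when noise is added before rounding."""
    rng = random.Random(seed)                 # fixed seed keeps the run repeatable
    total = 0
    for _ in range(count):
        noise = rng.uniform(-0.5, 0.5)        # dither, in units of one step
        total += round(x + noise)             # coarse quantization to whole steps
    return total / count

level = 0.3                         # input level, in units of one quantization step
plain = round(level)                # without dither the detail is lost entirely
dithered = dithered_average(level)  # with dither the average tracks the input
```

Without dither every sample quantizes to the same code, while with dither the codes vary so that their average stays close to the original level. The error is traded for low-level noise.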


Channels

• We may also have audio data coming from more than one *channel*

• Data from a multichannel source is usually interleaved

• Sampling rates are always measured per channel
– Stereo data recorded at 8000 samples/second will actually generate 16,000 samples every second
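Interleaving can be sketched as follows. The helper names `interleave` and `deinterleave` and the sample values are made up for the illustration.

```python
def interleave(*channels):
    """Merge per-channel sample lists into one stream: L0, R0, L1, R1, ..."""
    return [sample for frame in zip(*channels) for sample in frame]

def deinterleave(data, num_channels):
    """Split an interleaved stream back into one list per channel."""
    return [data[c::num_channels] for c in range(num_channels)]

left = [1, 2, 3]
right = [10, 20, 30]
stream = interleave(left, right)

# The sampling rate is per channel, so the total sample count scales with channels.
sampling_rate = 8000
num_channels = 2
total_samples_per_second = sampling_rate * num_channels
```

For the stereo case above, `stream` alternates left and right samples and `total_samples_per_second` is 16,000.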


Digital Audio Data

• A complete description of digital audio data includes (at least):
– Sampling rate
– Number of bits per sample
– Number of channels (1 for mono, 2 for stereo, etc.)
– Type of quantization (linear, logarithmic, etc.)
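These fields map naturally onto a small record type. The sketch below is hypothetical: the class name `AudioFormat`, its field names, and the two example formats are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AudioFormat:
    sampling_rate: int            # samples per second, per channel
    bits_per_sample: int
    channels: int                 # 1 for mono, 2 for stereo, ...
    quantization: str = "linear"  # or "logarithmic", etc.

    def bytes_per_second(self):
        """Raw data rate implied by the description."""
        return self.sampling_rate * self.channels * self.bits_per_sample // 8

# Example descriptions (values assumed for illustration).
cd_audio = AudioFormat(sampling_rate=44100, bits_per_sample=16, channels=2)
telephone = AudioFormat(8000, 8, 1, quantization="logarithmic")
```

With all four fields known, a reader of the data can compute storage needs and interpret each sample correctly.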


Analog to Digital Conversion

• *Nyquist’s theorem* states that if an arbitrary
signal has been run through a low-pass filter of
bandwidth *H*, the filtered signal can be
completely reconstructed by taking only 2*H*
(exact) samples per second.

• So, a low-pass filter is placed before the
sampling circuitry of the *analog-to-digital
(A/D) converter*.


Analog to Digital Conversion

• If frequencies greater than the Nyquist limit enter the
digitization process, an unwanted condition called
*aliasing* occurs

• The low-pass filter has a gradual high-frequency roll-off rather than a sharp cutoff, so a sampling rate somewhat higher than twice the highest frequency of interest is often used

• A/D conversion may make use of a successive approximation register (SAR)
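The aliasing condition described above can be illustrated numerically. The sketch assumes an 8000 samples/second rate (Nyquist limit 4000 Hz) and a 7000 Hz input tone. The helper `alias_frequency` is a hypothetical name.

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Frequency at which a tone of f_hz appears after sampling at fs_hz."""
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))

fs = 8000  # assumed sampling rate; the Nyquist limit is fs / 2 = 4000 Hz

# A 7000 Hz tone is above the Nyquist limit ...
high = [math.sin(2 * math.pi * 7000 * n / fs) for n in range(16)]
# ... and its samples are indistinguishable from an inverted 1000 Hz tone.
low = [-math.sin(2 * math.pi * 1000 * n / fs) for n in range(16)]

aliased_to = alias_frequency(7000, fs)
```

Because the two sample sequences are identical, nothing after the sampler can tell them apart, which is why the low-pass filter has to come before sampling.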


Analog to Digital Conversion

• The low-pass filter can cause side effects.
– One way that these side effects can be overcome is through the use of *oversampling*, a signal-processing function that raises the sample rate of a digitally encoded signal.

– Consumer and professional 16-bit D/A converters often use up to 8- and 12-times oversampling, raising the sampling rate of a CD (for example) from 44.1 kHz to 352.8 kHz or 529.2 kHz.

– By altering the signal’s noise characteristics, it is possible to shift much of the overall bandwidth noise out of the range of human hearing.


Pulse Code Modulation

• The method that has been discussed for storing
audio is known as *pulse code modulation* (PCM).

Analog input (sampled values): 1, 5, 14, 12, 5

Transmitted code (4 bits per sample): 0001 0101 1110 1100 0101
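The example above can be reproduced with a few lines of code. The function names `pcm_encode` and `pcm_decode` are assumptions, and the 4-bit word length is inferred from the 20 transmitted bits for 5 samples.

```python
def pcm_encode(samples, bits):
    """Encode non-negative integer sample values as fixed-width binary words."""
    return " ".join(format(s, f"0{bits}b") for s in samples)

def pcm_decode(code):
    """Recover the sample values from space-separated binary words."""
    return [int(word, 2) for word in code.split()]

samples = [1, 5, 14, 12, 5]
code = pcm_encode(samples, 4)
print(code)  # 0001 0101 1110 1100 0101
```

Decoding the transmitted words with `pcm_decode(code)` returns the original five sample values.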


Pulse Code Modulation

• PCM is common in long-distance telephone lines.
– The analog signal (voice) is sampled at 8000 samples/second with 7 or 8 bits per sample
– A T1 carrier handles 24 voice channels multiplexed together
– The bandwidth of this type of carrier can be calculated as follows:
• 8 bits x 8000 samples/second x 24 channels = 1.536 Mbps of channel data
• Each frame also carries 1 framing bit: (8 bits x 24 channels + 1) x 8000 frames/second = 1.544 Mbps
– Note that one out of 8 bits is for control, not data.
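The T1 arithmetic can be checked with a short calculation. The sketch relies on the standard T1 frame layout of 24 eight-bit samples plus one framing bit, which is what brings the line rate to 1.544 Mbps. The variable names are illustrative.

```python
CHANNELS = 24
BITS_PER_SAMPLE = 8
FRAMES_PER_SECOND = 8000      # one frame per sampling interval
FRAMING_BITS = 1              # one extra bit per frame for synchronization

# Bits carried for the 24 channels alone.
payload_bps = BITS_PER_SAMPLE * CHANNELS * FRAMES_PER_SECOND

# Full frame: 24 samples of 8 bits plus the framing bit.
frame_bits = BITS_PER_SAMPLE * CHANNELS + FRAMING_BITS
line_rate_bps = frame_bits * FRAMES_PER_SECOND

# With one of every 8 bits used for control, each voice channel carries 7 data bits.
voice_data_bps = 7 * FRAMES_PER_SECOND

print(payload_bps, frame_bits, line_rate_bps, voice_data_bps)
```

The gap between the 1.536 Mbps payload and the 1.544 Mbps line rate is exactly the 8000 framing bits sent each second.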


Pulse Code Modulation

• D/A conversion process
– Parallelize the serial bit stream
– Generate an analog voltage analogous to the voltage level at the original time of sampling
– An output sample-and-hold circuit is used to minimize spurious signal glitches
– A final low-pass filter is inserted into the path
• Smooths out the non-linear steps introduced by digital sampling


Pulse Code Modulation

• Other PCM topics:
– mu-law and A-law companding
– DPCM (differential PCM)
– DM (delta modulation)
– ADPCM (adaptive differential PCM)


Digital Signal Processing

• Processing of a digital signal to achieve special effects may generally be described in terms of some simple functions:
– Addition
– Multiplication
– Delay
– Resampling


Digital Signal Processing

• *Addition* of two signals is accomplished by adding the sample values of the signals at each sampling point: *h(t) = f(t) + g(t)*
– We can add as many signals as desired together

• *Multiplication* of a given signal is represented as: *g(t) = m · f(t)*, where *m* is the multiplication factor.
– Multiplication is used to increase or decrease the gain (loudness) of a signal. If *m > 1*, *g* is louder than *f*. If *m < 1*, *g* is less loud than *f*

– Note that when adding signals together or multiplying by a number greater than one, care must be taken that the result does not exceed the upper limit of the sample size; otherwise the signal overflows or clips
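Addition and multiplication, together with the overflow concern noted above, might be sketched like this. The example assumes 16-bit signed samples and handles the limit by clipping. Both are assumptions for illustration, and the function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767   # assumed 16-bit signed sample range

def clip(value):
    """Hold a value inside the sample range instead of letting it wrap around."""
    return max(INT16_MIN, min(INT16_MAX, value))

def add_signals(*signals):
    """h(t) = f(t) + g(t) + ..., computed at each sampling point."""
    return [clip(sum(frame)) for frame in zip(*signals)]

def gain(signal, m):
    """g(t) = m * f(t); m > 1 is louder, m < 1 is softer."""
    return [clip(round(s * m)) for s in signal]

f = [1000, 30000]
g = [2000, 10000]
mixed = add_signals(f, g)        # second sample would exceed the limit, so it clips
louder = gain([100, -20000], 2)  # second sample clips at the negative limit
softer = gain([100, -20000], 0.5)
```

Clipping distorts the waveform, so in practice the gains are usually chosen to keep the sum inside the range in the first place.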


Digital Signal Processing

• *Delay* is an important effect described as follows: *g(t) = f(t - d)*, where *d* is a delay time
– Use delay and addition to model echo:

• *f(t)* = HELLO
• *g(t)* = *f(t - d1)*, where 0 < *d1*
• *g(t)* = HELLO (starting *d1* later)
• *h(t)* = *f(t - d2)*, where 0 < *d1* < *d2*
• *h(t)* = HELLO (starting *d2* later)
• *F(t)* = *f(t)* + *g(t)* + *h(t)*
• = HELLO HELLO HELLO


Digital Signal Processing

• Now consider a more realistic echo effect. We need to make each succeeding echo softer. We can do this with multiplication.
– *g’(t)* = *m · g(t)*
– *h’(t)* = *n · h(t)*, where 0 < *n* < *m* < 1
– *F’(t)* = *f(t)* + *g’(t)* + *h’(t)* = HELLO HELLO HELLO (each repetition softer than the last)
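The echo built from delay, multiplication, and addition can be sketched on sample lists. Here a delayed copy takes its value from `d` samples earlier, and silence is padded with zeros. The function names, the delays, and the gains are assumptions chosen for the example.

```python
def delay(signal, d, length):
    """Return the signal shifted later by d samples, zero-padded to length."""
    return [signal[t - d] if 0 <= t - d < len(signal) else 0.0 for t in range(length)]

def echo(signal, delays, gains):
    """F'(t) = f(t) + m * (f delayed by d1) + n * (f delayed by d2) + ..."""
    length = len(signal) + max(delays)
    out = delay(signal, 0, length)              # the original, undelayed sound
    for d, m in zip(delays, gains):
        delayed = delay(signal, d, length)
        out = [o + m * x for o, x in zip(out, delayed)]
    return out

f = [1.0, 0.0, 0.0]                             # a single click stands in for HELLO
F = echo(f, delays=[2, 4], gains=[0.5, 0.25])   # 0 < n < m < 1: echoes get softer
print(F)  # [1.0, 0.0, 0.5, 0.0, 0.25, 0.0, 0.0]
```

The click appears three times in the output, each time at a later position and a lower level.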


Digital Signal Processing

• When delays of 35-40 ms and greater are used, the listener perceives them as discrete delays

• Reducing the delay to the 15-35 ms range will create delays that are too closely spaced to be perceived as discrete delays
– When used with instruments, the brain is fooled into thinking that more instruments are playing than there actually are
– By combining several short-term delay modules that are slightly detuned in time, an effect known as *chorusing* can be achieved (used by guitarists, for example)


Pitch-Related Effects

• DSP functions are available that can alter the speed and pitch of an audio program. These can:
– Change pitch without changing duration
– Change duration without changing pitch
– Change both duration and pitch

• The process for raising and lowering the pitch of a sample is shown on the next slides


Pitch-Related Effects

• The original waveform
• Resample at 1/2 the original sample rate: 1/2 the samples are dropped
• Now raise the outgoing rate
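Raising the pitch by dropping samples can be sketched as below. The example assumes a sine with a period of 8 samples and omits any anti-aliasing filter, which a practical resampler would apply before discarding samples. The function name is hypothetical.

```python
import math

def drop_every_other(samples):
    """Resample at 1/2 the original rate by keeping only every second sample."""
    return samples[::2]

# Original waveform: a sine that repeats every 8 samples.
original = [math.sin(2 * math.pi * n / 8) for n in range(32)]
halved = drop_every_other(original)

# Played back at the original rate, the result repeats every 4 samples,
# so it lasts half as long and sounds one octave higher.
```

This changes both duration and pitch together. Changing one without the other needs more elaborate processing than simple resampling.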


Pitch-Related Effects

• The original waveform
• Sample interpolation
• Drop the sampling rate back down to the original rate
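Lowering the pitch by interpolation can be sketched in the same style. The slides do not say which interpolation method is meant, so linear interpolation is used here as an assumption, and the function name is hypothetical.

```python
def interpolate_double(samples):
    """Insert a linearly interpolated sample between each pair of neighbours."""
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)
        out.append((a + b) / 2)   # midpoint between consecutive samples
    out.append(samples[-1])
    return out

original = [0.0, 2.0, 4.0]
stretched = interpolate_double(original)
print(stretched)  # [0.0, 1.0, 2.0, 3.0, 4.0]
```

Played back at the original rate, the stretched waveform takes about twice as long per cycle, so it sounds roughly one octave lower.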
