Automatic Speech Processing Homework 3: Pitch Determination and Cepstral Analysis

Instructions and problems for Homework 3 of the EEL6586: Automatic Speech Processing course. The assignment covers pitch determination using autocorrelation of LPC residuals, cepstral analysis, and computer analysis of speech. Students are required to write a program for automatic pitch analysis and submit plots of pitch vs. time for three recorded sentences.

EEL6586: Automatic Speech Processing HW#3
EEL 6586: HW#3
J.G. Harris, February 11, 2008

Assignment is due Friday, February 22, 2008 in class. Late homework loses $e^{(\#\text{ of days late})} - 1$ percentage points.

PART A: Short Answer (No more than a few sentences each)

A1 Assume that a speech signal was framed with a 25 ms rectangular window. What is the main lobe width in Hz due to the rectangular window? Recall that the main lobe width appears in the Fourier magnitude spectrum of speech as the width of each pitch harmonic. Assume a 20 kHz sampling rate.
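
As a quick sanity check for A1, the sketch below computes the main lobe width under one common definition: the null-to-null width, which is 2/T for a rectangular window of duration T. That definition is an assumption (a 3 dB width would give a different number), and the function and variable names are illustrative only.

```python
def rect_main_lobe_width_hz(window_seconds):
    """Null-to-null main lobe width (Hz) of a rectangular window of duration T.

    The spectrum of a length-T rectangular window has its first nulls at
    +/- 1/T, so the null-to-null width is 2/T.
    """
    return 2.0 / window_seconds


sampling_rate_hz = 20000
window_seconds = 0.025

# Number of samples in the frame: fs * T
num_samples = round(sampling_rate_hz * window_seconds)

# Same width expressed through the sample count: nulls at +/- fs/N
width_hz = rect_main_lobe_width_hz(window_seconds)
width_from_samples_hz = 2.0 * sampling_rate_hz / num_samples

print(num_samples, width_hz, width_from_samples_hz)
```

Both expressions agree, which illustrates that the width in Hz is set by the window duration; the sampling rate only determines how many samples that duration spans.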

A2 A common algorithm for pitch determination is to perform autocorrelation on the LPC residual (error) and look for peaks. Why does this algorithm work better than performing autocorrelation on the original speech signal and looking for peaks?
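
The algorithm described in A2 can be sketched on a toy signal. The example below is not real speech: it assumes an impulse train (period 80 samples at a hypothetical 8 kHz rate) driving a single two-pole resonance, fits an order-2 LPC model with the Levinson-Durbin recursion, and compares the autocorrelation of the signal with that of the residual. All names and parameter values are illustrative assumptions.

```python
import math


def autocorr(x, max_lag):
    """Biased autocorrelation r[k] = sum_n x[n] * x[n + k], k = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return a[0..order] with a[0] = 1 so that A(z) = sum_j a[j] z^-j."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


# Toy "voiced speech": impulse train through a two-pole resonator (assumed values).
fs, period, n_samples = 8000, 80, 800
pole_radius, pole_angle = 0.95, 2 * math.pi * 700 / fs
speech = []
for n in range(n_samples):
    excitation = 1.0 if n % period == 0 else 0.0
    y1 = speech[n - 1] if n >= 1 else 0.0
    y2 = speech[n - 2] if n >= 2 else 0.0
    speech.append(excitation + 2 * pole_radius * math.cos(pole_angle) * y1
                  - pole_radius ** 2 * y2)

# Inverse-filter the signal with the estimated A(z) to get the LPC residual.
lpc = levinson_durbin(autocorr(speech, 2), order=2)
residual = [sum(lpc[j] * speech[n - j] for j in range(len(lpc)) if n - j >= 0)
            for n in range(n_samples)]

min_lag, max_lag = 20, 200
r_speech = autocorr(speech, max_lag)
r_residual = autocorr(residual, max_lag)

# Pitch lag: largest autocorrelation value in the allowed lag range.
lag_residual = max(range(min_lag, max_lag + 1), key=lambda k: r_residual[k])
pitch_hz = fs / lag_residual

# Largest competing value below the true period, relative to r[0].
ripple_speech = max(r_speech[min_lag:period]) / r_speech[0]
ripple_residual = max(r_residual[min_lag:period]) / r_residual[0]

print(lag_residual, pitch_hz, ripple_speech, ripple_residual)
```

In this toy case the residual autocorrelation has much weaker competing peaks than the signal autocorrelation, because the inverse filter removes most of the resonance and leaves something close to the excitation. Real speech is noisier and has several formants, so the margin would differ.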

A3 The following synthetic sound is created:

s(t) = sin(2π(200Hz)t) + 0.5 sin(2π(400Hz)t) + 0.25 sin(2π(500Hz)t)

What is the likely pitch frequency we would perceive in listening to this sound? Explain.
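
One way to explore A3 numerically is to check the period of the waveform itself. The sketch below assumes integer component frequencies, takes their greatest common divisor as the candidate fundamental, and tests whether shifting s(t) by the candidate period leaves it unchanged. The helper names and the 8 kHz evaluation grid are illustrative assumptions, and the sketch says nothing about perception beyond the signal's periodicity.

```python
import math
from functools import reduce

# (frequency in Hz, amplitude) for each sinusoid in s(t)
components = [(200, 1.0), (400, 0.5), (500, 0.25)]


def s(t):
    """Evaluate the synthetic sound at time t (seconds)."""
    return sum(amp * math.sin(2 * math.pi * f * t) for f, amp in components)


# Candidate fundamental: greatest common divisor of the component frequencies.
fundamental_hz = reduce(math.gcd, [f for f, _ in components])


def max_shift_error(shift_seconds, fs=8000, num_points=400):
    """Largest |s(t + shift) - s(t)| over a grid of sample times."""
    return max(abs(s(i / fs + shift_seconds) - s(i / fs)) for i in range(num_points))


err_at_gcd_period = max_shift_error(1.0 / fundamental_hz)  # shift by 10 ms
err_at_200hz_period = max_shift_error(1.0 / 200)           # shift by 5 ms

print(fundamental_hz, err_at_gcd_period, err_at_200hz_period)
```

The waveform repeats after the period of the common divisor but not after the period of the strongest component, even though the signal contains no energy at the common-divisor frequency itself.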

A4 Compute the complex cepstrum of

$H(z) = \dfrac{1}{1 + a z^{-1}}$

Assume |a| < 1.
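
A hand-derived answer to A4 can be checked numerically. The sketch below assumes a real a with |a| < 1, samples log H on the unit circle, and applies an inverse DFT to approximate the complex cepstrum. It then compares the result with the power-series expansion of -log(1 + a z^{-1}), which gives (-a)^n / n for n >= 1. The function names, the value a = 0.5, and the number of frequency points are illustrative assumptions.

```python
import cmath
import math


def numeric_complex_cepstrum(a, num_points=256, num_coeffs=6):
    """Approximate the complex cepstrum of 1 / (1 + a z^-1) for n = 0..num_coeffs-1."""
    # log H on the unit circle. Since |a| < 1, 1 + a e^{-jw} has positive real
    # part, so the principal branch of the logarithm needs no phase unwrapping.
    log_h = [-cmath.log(1 + a * cmath.exp(-1j * 2 * math.pi * k / num_points))
             for k in range(num_points)]
    cepstrum = []
    for n in range(num_coeffs):
        acc = sum(log_h[k] * cmath.exp(1j * 2 * math.pi * k * n / num_points)
                  for k in range(num_points))
        cepstrum.append((acc / num_points).real)
    return cepstrum


def series_complex_cepstrum(a, num_coeffs=6):
    """Coefficients of z^-n in -log(1 + a z^-1): 0 for n = 0, (-a)^n / n for n >= 1."""
    return [0.0] + [(-a) ** n / n for n in range(1, num_coeffs)]


a = 0.5
numeric = numeric_complex_cepstrum(a)
series = series_complex_cepstrum(a)
max_error = max(abs(x - y) for x, y in zip(numeric, series))

print(numeric)
print(series)
print(max_error)
```

The inverse DFT aliases the infinite cepstrum onto num_points samples, so the agreement is approximate; for |a| well below 1 the aliased terms are negligible.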

A5 In class, the cepstrum was defined as the inverse Fourier Transform of the log of the Fourier Transform. However, some people define the cepstrum as the Fourier Transform of the log of the Fourier Transform (without the inverse). Which version do you expect to perform better in actual applications? Explain.
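
One observation that may help with A5: for a real signal, the log-magnitude spectrum is real and even, and for such a sequence the forward and inverse DFT differ only by a scale factor. The sketch below checks this on a small made-up sequence. The values in `log_magnitude` are hypothetical, and the check covers only this real, even case, not a log spectrum that includes phase.

```python
import cmath
import math


def dft(values, inverse=False):
    """Direct O(N^2) DFT; the inverse transform includes the 1/N factor."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(values[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


# Hypothetical log-magnitude spectrum: real and even, X[k] == X[N - k].
log_magnitude = [0.0, -0.5, -1.2, -2.0, -2.5, -2.0, -1.2, -0.5]

forward = dft(log_magnitude)
inverse = dft(log_magnitude, inverse=True)

# Compare the two results after removing the 1/N scale difference.
n = len(log_magnitude)
max_difference = max(abs(f / n - i) for f, i in zip(forward, inverse))

print(max_difference)
```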

PART B: Textbook problems (Use Matlab only to optionally check your work)

B1 Derive an exact value for the height of the first side band of the rectangular window. Make whatever assumptions you feel necessary.
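
A derived value for B1 can be compared against a numerical one. The sketch below assumes a continuous-time rectangular window, whose magnitude spectrum is |sin(x)/x| when normalized to a main-lobe peak of 1. The first side lobe peak is where the derivative of sin(x)/x vanishes between x = pi and x = 3*pi/2, which is located here by bisection. A finite-length discrete window would give a slightly different value, and the function name is illustrative.

```python
import math


def first_sidelobe_of_sinc(tol=1e-12):
    """Locate the first side lobe peak of sin(x)/x and return (x_peak, height)."""
    # d/dx [sin(x)/x] = 0  <=>  sin(x) - x*cos(x) = 0, equivalently tan(x) = x.
    def g(x):
        return math.sin(x) - x * math.cos(x)

    lo, hi = math.pi, 1.5 * math.pi  # g(lo) > 0 and g(hi) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    x_peak = 0.5 * (lo + hi)
    return x_peak, abs(math.sin(x_peak) / x_peak)


x_peak, height = first_sidelobe_of_sinc()
height_db = 20 * math.log10(height)

print(x_peak, height, height_db)
```

Under these assumptions the side lobe comes out at roughly 13 dB below the main lobe peak.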

B2 Compute the real cepstrum of

$H(z) = \dfrac{1}{1 + a z^{-1}}$

Assume |a| < 1.
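
A possible starting point for B2 is sketched below. It assumes a is real, that the real cepstrum c(n) is defined as the inverse transform of log|H|, and that ĥ(n) denotes the coefficients of the power series of log H(z). The symbols c(n) and ĥ(n) are notation introduced here, not taken from the assignment.

```latex
% Power series of log H(z), valid for |z| > |a|:
\log H(z) = -\log\!\left(1 + a z^{-1}\right)
          = \sum_{n=1}^{\infty} \frac{(-a)^{n}}{n}\, z^{-n}

% On the unit circle, log|H| is the real part of log H, so the real
% cepstrum is the even part of the sequence \hat{h}(n):
\log\lvert H(e^{j\omega})\rvert
  = \tfrac{1}{2}\left[\log H(e^{j\omega}) + \overline{\log H(e^{j\omega})}\right]
\quad\Longrightarrow\quad
c(n) = \frac{\hat{h}(n) + \hat{h}(-n)}{2}

% With \hat{h}(n) = (-a)^n / n for n >= 1 and 0 otherwise:
c(n) =
\begin{cases}
  \dfrac{(-a)^{|n|}}{2\,|n|}, & n \neq 0 \\[6pt]
  0, & n = 0
\end{cases}
```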

B3 Compute the complex cepstrum of the following causal filter

$H(z) = \dfrac{1}{1 + \frac{1}{8} z^{-3}}$

B4 The cepstral coefficients of a recorded speech signal x(n) are given by $\hat{x}(n)$. How do these cepstral coefficients change when a pre-emphasis factor of $(1 - 0.96 z^{-1})$ is applied to the speech, creating a modified signal y(n)? Write an equation for $\hat{y}(n)$.
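
The structure of B4 can be sketched as follows, assuming that "cepstral coefficients" means the complex cepstrum and that the pre-emphasis is applied as a filter, so that the z-transforms multiply. If the real cepstrum is intended instead, the correction term would be halved and mirrored onto negative n.

```latex
% Filtering multiplies z-transforms, so the logarithms add:
Y(z) = \left(1 - 0.96\, z^{-1}\right) X(z)
\quad\Longrightarrow\quad
\log Y(z) = \log X(z) + \log\!\left(1 - 0.96\, z^{-1}\right)

% Power series of the pre-emphasis term, valid for |z| > 0.96:
\log\!\left(1 - 0.96\, z^{-1}\right)
  = -\sum_{n=1}^{\infty} \frac{0.96^{\,n}}{n}\, z^{-n}

% Matching coefficients of z^{-n}:
\hat{y}(n) =
\begin{cases}
  \hat{x}(n) - \dfrac{0.96^{\,n}}{n}, & n \geq 1 \\[6pt]
  \hat{x}(n), & n \leq 0
\end{cases}
```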

B5 Euclidean distance in complex cepstral space can be related to an RMS log spectral distance measure. Assuming that

$$\log S(\omega) = \sum_{n=-\infty}^{+\infty} c_n e^{-jn\omega}$$

where S(ω) is the power spectrum (magnitude-squared Fourier transform), prove the following:

$$\sum_{n=-\infty}^{+\infty} (c_n - c'_n)^2 = \frac{1}{2\pi} \int \left| \log S(\omega) - \log S'(\omega) \right|^2 d\omega$$

where S(ω) and S′(ω) are the power spectra for two different signals.
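
The identity in B5 can be checked numerically before attempting the proof. The sketch below assumes the integral runs over one period of length 2π and uses two short, made-up sets of real, symmetric coefficients. The dictionaries `c` and `c_prime` and the grid size are illustrative assumptions; with finitely many coefficients, a uniform Riemann sum over one period evaluates the integral up to rounding error.

```python
import cmath
import math


def log_spectrum(coeffs, omega):
    """Evaluate sum_n c_n * exp(-j n omega) for a dict mapping n -> c_n."""
    return sum(c_n * cmath.exp(-1j * n * omega) for n, c_n in coeffs.items())


# Hypothetical cepstral coefficients; symmetric in n so the log spectrum is real.
c = {0: 1.0, 1: 0.4, -1: 0.4, 2: -0.1, -2: -0.1}
c_prime = {0: 0.7, 1: 0.1, -1: 0.1, 2: 0.2, -2: 0.2}

# Left-hand side: squared Euclidean distance between coefficient sets.
cepstral_distance = sum((c[n] - c_prime[n]) ** 2 for n in c)

# Right-hand side: (1 / 2 pi) times the integral of |log S - log S'|^2 over one period.
num_points = 64
d_omega = 2 * math.pi / num_points
integral = sum(
    abs(log_spectrum(c, k * d_omega) - log_spectrum(c_prime, k * d_omega)) ** 2
    for k in range(num_points)
) * d_omega
spectral_distance = integral / (2 * math.pi)

print(cepstral_distance, spectral_distance)
```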