Speech Enhancement: Concept and Methodology

This document discusses speech enhancement techniques. It describes how speech enhancement aims to recover the original speech signal from a noisy observation. The key methods covered are spectral subtraction and Wiener filtering. Spectral subtraction directly subtracts an estimated noise spectrum from the noisy speech spectrum. Wiener filtering applies an optimal filter to minimize mean-square error based on estimated signal and noise power spectral densities, and can be improved using iterative and constrained techniques that leverage speech production models.

Uploaded by

Hade Karimata
Copyright
© Attribution Non-Commercial (BY-NC)

SPEECH ENHANCEMENT: CONCEPT AND METHODOLOGY

Presented by Dominic K. C. Ho
Demo prepared by Tong Wang
University of Missouri-Columbia

Background

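Problem: recover $s(n)$ from $y(n) = s(n) + d(n)$

[Diagram: the clean speech $s(n)$ is corrupted by additive noise $d(n)$ to give the observation $y(n)$; signal processing of $y(n)$ produces the estimate $\hat{s}(n)$.]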

References:
- J. S. Lim and A. V. Oppenheim, "Enhancement and Bandwidth Compression of Noisy Speech," Proc. IEEE, vol. 67, no. 12, pp. 1586-1604, Dec. 1979.
- J. H. L. Hansen and M. A. Clements, "Constrained Iterative Speech Enhancement with Application to Speech Recognition," IEEE Trans. Signal Processing, vol. 39, no. 4, pp. 795-805, Apr. 1991.

Applications:
- improving speech quality or intelligibility in multimedia and wireless communications
- communications between pilot and air traffic control tower
- speech recognition

Speech Enhancement Depends on:
- good signal processing technique
- human perceptual factors: speech quality and intelligibility depend on the short-term spectral amplitude and are insensitive to the spectral phase

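Processing of Speech Signal:
- speech is approximately stationary over a short period of time (10 ms to 20 ms)
- frame-by-frame processing

[Figure: a window cuts $y(n)$ into consecutive frames $y_m(n)$, $y_{m+1}(n)$ of $N$ samples each; $N$ = frame size, $m$ = frame number.]

$$y_m(n) = s_m(n) + d_m(n), \quad 0 \le n \le N-1$$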
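The framing step can be sketched as follows. This is a minimal illustration, not the presenters' code: it assumes non-overlapping rectangular frames, an 8 kHz sampling rate, and a hypothetical helper name `split_into_frames`; the slides do not state the window shape or overlap.

```python
import numpy as np

def split_into_frames(y, frame_size):
    """Split y(n) into non-overlapping frames y_m(n), 0 <= n <= N-1.

    Assumption: rectangular window, no overlap; trailing samples that do
    not fill a whole frame are dropped.
    """
    y = np.asarray(y, dtype=float)
    num_frames = len(y) // frame_size
    return y[:num_frames * frame_size].reshape(num_frames, frame_size)

fs = 8000                    # assumed sampling rate in Hz
N = fs * 20 // 1000          # 20 ms frame -> 160 samples
y = np.arange(1000.0)        # stand-in for a noisy recording
frames = split_into_frames(y, N)   # row m holds y_m(n)
```

Row `m` of `frames` plays the role of $y_m(n)$; with 1000 samples and $N = 160$, six full frames are kept.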

Methods:
- Spectral Subtraction
- Wiener Filtering
- Iterative Wiener Filtering
- Improved Iterative Wiener Filtering
- Constrained Iterative Wiener Filtering

Spectral Subtraction

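Subtracting the noise power spectrum from the noisy signal power spectrum

Assumption: the noise power spectral density (PSD) $P_d(\omega)$ is known, where $P_d(\omega) = E[\,|D(\omega)|^2\,]$

Concept:

$$y_m(n) = s_m(n) + d_m(n)$$
$$Y_m(\omega) = S_m(\omega) + D_m(\omega)$$
$$|Y_m(\omega)|^2 = |S_m(\omega)|^2 + |D_m(\omega)|^2 + S_m(\omega) D_m(\omega)^* + S_m(\omega)^* D_m(\omega) \approx |S_m(\omega)|^2 + |D_m(\omega)|^2$$
$$|\hat{S}_m(\omega)|^2 \approx |Y_m(\omega)|^2 - P_d(\omega)$$
$$\hat{S}_m(\omega) = |\hat{S}_m(\omega)|\, e^{j \angle Y_m(\omega)}$$
$$\hat{s}_m(n) = F^{-1}\{\hat{S}_m(\omega)\}$$

The cross terms are dropped because speech and noise are assumed uncorrelated, so they average to zero. The phase of the noisy spectrum $Y_m(\omega)$ is reused for the estimate.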
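A minimal NumPy sketch of these steps for a single frame is given below. The function name `spectral_subtract_frame`, the clipping of negative power values to zero, and the synthetic sine-plus-white-noise test signal are assumptions made for illustration; the slides do not say how negative differences are handled.

```python
import numpy as np

def spectral_subtract_frame(y_frame, noise_psd):
    """Enhance one frame given an estimate of P_d(w) on the FFT grid."""
    Y = np.fft.fft(y_frame)
    power = np.abs(Y) ** 2 - noise_psd            # |Y_m|^2 - P_d
    power = np.maximum(power, 0.0)                # assumption: clip negatives to zero
    S_hat = np.sqrt(power) * np.exp(1j * np.angle(Y))   # reuse the noisy phase
    return np.fft.ifft(S_hat).real

rng = np.random.default_rng(0)
N = 160
s = np.sin(2 * np.pi * 10 * np.arange(N) / N)     # stand-in for clean speech
# Estimate P_d by averaging |D_m|^2 over noise-only frames.
noise_frames = 0.3 * rng.standard_normal((200, N))
noise_psd = np.mean(np.abs(np.fft.fft(noise_frames, axis=1)) ** 2, axis=0)
y = s + 0.3 * rng.standard_normal(N)
s_hat = spectral_subtract_frame(y, noise_psd)
```

With a zero noise PSD the function returns the input frame unchanged, which is a quick sanity check that magnitude and phase are recombined correctly.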

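Block Diagram:

[Figure: $y_m(n)$ → FFT → $|Y_m(\omega)|^2$ and phase. A "noise frame?" decision selects the frames used to estimate $P_d(\omega)$, which is subtracted from $|Y_m(\omega)|^2$ to give $|\hat{S}_m(\omega)|^2$; its square root $(\cdot)^{1/2}$ is recombined with the phase and passed through the inverse FFT to give $\hat{s}_m(n)$.]

Demo: Clean Speech | Speech + Noise | Processed by Spectral Subtraction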

Wiener Filtering

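Concept:

[Diagram: $y_m(n)$ → $h(n)$ → $\hat{s}_m(n)$; in the frequency domain, $Y_m(\omega)$ → $H(\omega)$ → $\hat{S}_m(\omega)$.]

$$\hat{S}_m(\omega) = H(\omega)\, Y_m(\omega), \qquad H(\omega) = \frac{P_s(\omega)}{P_y(\omega)}$$

$H(\omega)$ weights the spectrum according to the SNR at different frequencies:

$$H(\omega) = \frac{P_s(\omega)}{P_y(\omega)} = \frac{P_s(\omega)}{P_s(\omega) + P_d(\omega)} \approx \begin{cases} 1, & P_s(\omega) \gg P_d(\omega) \\ 0, & P_s(\omega) \ll P_d(\omega) \end{cases}$$

The Wiener filter minimizes $E[\{s_m(n) - \hat{s}_m(n)\}^2]$; it is optimum in the mean-square error sense.

Approximation:

$$\hat{P}_y(\omega) = \overline{|Y_m(\omega)|^2} \quad \text{averaged over noisy speech segments}$$
$$\hat{P}_d(\omega) = \overline{|Y_m(\omega)|^2} \quad \text{averaged over noise segments}$$
$$\hat{P}_s(\omega) = \hat{P}_y(\omega) - \hat{P}_d(\omega)$$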
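To make the SNR weighting concrete, the gain can be evaluated for a few frequency bins. This is a small sketch under assumptions: the function name `wiener_gain`, the flooring of the estimated signal PSD at zero, and the tiny constant `eps` that guards against division by zero are illustrative choices, not part of the slides.

```python
import numpy as np

def wiener_gain(Py, Pd, eps=1e-12):
    """H = P_s / (P_s + P_d), with P_s estimated as P_y - P_d."""
    Py = np.asarray(Py, dtype=float)
    Pd = np.asarray(Pd, dtype=float)
    Ps = np.maximum(Py - Pd, 0.0)      # assumption: floor the estimate at zero
    return Ps / (Ps + Pd + eps)

# Three bins: high SNR, 0 dB SNR, noise only.
Pd = np.array([1.0, 1.0, 1.0])
Py = np.array([101.0, 2.0, 1.0])
H = wiener_gain(Py, Pd)                # approximately [0.990, 0.5, 0.0]
```

The high-SNR bin is passed almost unchanged, the 0 dB bin is halved, and the noise-only bin is suppressed, which matches the two limiting cases of $H(\omega)$.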

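Demo: Clean Speech | Speech + Noise | Processed by Ideal Wiener Filtering

Block Diagram:

[Figure: a "noise frame?" decision routes $y_m(n)$ to the estimate of $P_d(\omega)$ (noise frames) or of $P_y(\omega)$ (noisy speech frames); the filter $H(\omega) = \dfrac{P_y(\omega) - P_d(\omega)}{P_y(\omega)}$ is applied to $y_m(n)$ to give $\hat{s}_m(n)$.]

Demo: Clean Speech | Speech + Noise | Processed by Wiener Filtering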
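The block diagram can be approximated in a few lines of NumPy. This is only a sketch under several assumptions: the "noise frame?" decision is supplied by the caller as a boolean array `is_noise` (a hypothetical stand-in for a detector the slides do not describe), $P_y$ is taken as the periodogram of each individual frame, and negative differences are floored at zero.

```python
import numpy as np

def wiener_enhance(frames, is_noise, eps=1e-12):
    """Apply H = (P_y - P_d) / P_y frame by frame.

    frames:   (M, N) array, row m is y_m(n)
    is_noise: length-M boolean flags; at least one frame must be flagged
    """
    frames = np.asarray(frames, dtype=float)
    is_noise = np.asarray(is_noise, dtype=bool)
    Y = np.fft.fft(frames, axis=1)
    Pd = np.mean(np.abs(Y[is_noise]) ** 2, axis=0)   # P_d averaged over noise frames
    Py = np.abs(Y) ** 2                              # assumption: per-frame periodogram
    H = np.maximum(Py - Pd, 0.0) / (Py + eps)        # floored at zero
    return np.fft.ifft(H * Y, axis=1).real

rng = np.random.default_rng(1)
M, N = 40, 160
noise = 0.3 * rng.standard_normal((M, N))
speech = np.zeros((M, N))
speech[20:] = np.sin(2 * np.pi * 8 * np.arange(N) / N)   # stand-in for speech
is_noise = np.arange(M) < 20                             # first 20 frames are noise only
noisy = speech + noise
enhanced = wiener_enhance(noisy, is_noise)
```

Because the gain lies between 0 and 1 in every bin, the energy of each output frame cannot exceed that of the corresponding input frame.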

Iterative Wiener Filtering


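Estimating $\hat{P}_s(\omega)$ by $\hat{P}_y(\omega) - \hat{P}_d(\omega)$ may not be good
Can do better by computing $\hat{P}_s(\omega)$ from the Wiener filter output

Algorithm:

$\hat{P}_s(\omega)_0 = \hat{P}_y(\omega) - \hat{P}_d(\omega)$, $i = 0$
repeat
    $H(\omega)_i = \dfrac{\hat{P}_s(\omega)_i}{\hat{P}_s(\omega)_i + \hat{P}_d(\omega)}$
    $\hat{S}_m(\omega)_{i+1} = H(\omega)_i\, Y_m(\omega)$
    $\hat{P}_s(\omega)_{i+1} = |\hat{S}_m(\omega)_{i+1}|^2$
    $i = i + 1$
until convergence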
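The loop above can be sketched for one frame as follows. Several choices here are assumptions rather than part of the slides: the name `iterative_wiener`, a fixed iteration count in place of a convergence test, the frame periodogram used as $\hat{P}_y$, and a flat noise PSD of $N\sigma^2$ for white noise on the unnormalised FFT grid.

```python
import numpy as np

def iterative_wiener(y_frame, Pd, num_iters=3, eps=1e-12):
    """Iterate H_i -> S_hat_{i+1} -> P_s,{i+1} for one frame (num_iters >= 1)."""
    Y = np.fft.fft(y_frame)
    Ps = np.maximum(np.abs(Y) ** 2 - Pd, 0.0)   # P_s,0 = P_y - P_d, floored at zero
    for _ in range(num_iters):                  # assumption: fixed count, no convergence test
        H = Ps / (Ps + Pd + eps)                # H_i
        S_hat = H * Y                           # S_hat_{i+1}
        Ps = np.abs(S_hat) ** 2                 # P_s,{i+1}
    return np.fft.ifft(S_hat).real, H

rng = np.random.default_rng(2)
N = 160
sigma = 0.3
s = np.sin(2 * np.pi * 12 * np.arange(N) / N)   # stand-in for clean speech
y = s + sigma * rng.standard_normal(N)
Pd = np.full(N, N * sigma ** 2)                 # white-noise PSD, assumed known
s_hat, H = iterative_wiener(y, Pd, num_iters=3)
```

The number of iterations is left as a parameter because the slides only say "until convergence"; in practice it would be tuned or replaced by a stopping criterion.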

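Block Diagram:

[Figure: a "noise frame?" decision selects the frames of $y_m(n)$ used to estimate $P_d(\omega)$; the filter $H(\omega) = \dfrac{P_s(\omega)}{P_s(\omega) + P_d(\omega)}$ is applied to $y_m(n)$ to give $\hat{s}_m(n)$; in the iteration loop, the output is passed through an FFT and $|\cdot|^2$ to update $P_s(\omega)$.]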

Improved Iterative Wiener Filtering

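Improve the estimation of $\hat{P}_s(\omega)$ using a constraint from the speech production model

Linear prediction model of speech:

$$s(n) = -a_1 s(n-1) - a_2 s(n-2) - \cdots - a_P s(n-P) + \epsilon(n)$$
$$P_s(\omega) = \frac{\sigma^2}{\left| 1 + \sum_{i=1}^{P} a_i e^{-j\omega i} \right|^2}$$

where $\epsilon(n)$ is the excitation and $\sigma^2$ is its variance.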
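One way to obtain the model PSD from a frame is the autocorrelation method of linear prediction, sketched below. The helper names `lpc_autocorr` and `lpc_psd`, the direct solve of the normal equations, and the synthetic first-order test signal are assumptions for illustration; the slides only state that LPCs are computed. The result is a PSD per sample, so matching it to an unnormalised frame periodogram $|Y_m(\omega)|^2$ would likely need an extra scale factor.

```python
import numpy as np

def lpc_autocorr(x, order):
    """Return (a_1..a_P, sigma^2) for x(n) = -sum_i a_i x(n-i) + e(n)."""
    x = np.asarray(x, dtype=float)
    # Biased autocorrelation estimates r(0)..r(P).
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)]) / len(x)
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])       # normal (Yule-Walker) equations
    sigma2 = r[0] + np.dot(a, r[1:])     # prediction-error variance
    return a, sigma2

def lpc_psd(a, sigma2, nfft):
    """P_s(w) = sigma^2 / |1 + sum_i a_i e^{-jwi}|^2 on an nfft-point grid."""
    A = np.fft.fft(np.concatenate(([1.0], a)), nfft)
    return sigma2 / np.abs(A) ** 2

# Synthetic check: x(n) = 0.9 x(n-1) + e(n), i.e. a_1 = -0.9 in this sign convention.
rng = np.random.default_rng(0)
e = rng.standard_normal(5000)
x = np.zeros_like(e)
for n in range(1, len(e)):
    x[n] = 0.9 * x[n - 1] + e[n]
a, sigma2 = lpc_autocorr(x, order=1)
Ps = lpc_psd(a, sigma2, nfft=256)
```

For this low-pass test signal the estimated $a_1$ is close to $-0.9$ and the model PSD is much larger at $\omega = 0$ than at $\omega = \pi$.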

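Block Diagram:

[Figure: a "noise frame?" decision selects the frames of $y_m(n)$ used to estimate $P_d(\omega)$; the filter $H(\omega) = \dfrac{P_s(\omega)}{P_s(\omega) + P_d(\omega)}$ is applied to $y_m(n)$ to give $\hat{s}_m(n)$; in the iteration loop, LPCs are computed from the output and $P_s(\omega) = \dfrac{\sigma^2}{\left| 1 + \sum_i a_i e^{-j\omega i} \right|^2}$ is updated.]

Demo: Clean Speech | Speech + Noise | Processed by Improved Iterative Wiener Filtering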

Constrained Iterative Wiener Filtering

Apply a priori speech characteristics to impose inter-frame and intra-frame constraints on the speech spectrum

Summary

- Speech enhancement is important in human-to-human or human-to-machine communications
- Two classes of speech enhancement methods: spectral subtraction and Wiener filtering
- Wiener filtering is an optimum filter in the mean-square error sense
- Wiener filtering, assuming known signal and noise spectra, gives an upper bound in performance
- Imposing constraints from the speech production model and speech characteristics produces better signal spectrum estimation and hence improves performance
