Speech Enhancement: Concept and Methodology
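Background
Noisy speech model: $y(n) = s(n) + d(n)$, where $s(n)$ is the clean speech and $d(n)$ is the noise.
[Figure: $s(n)$ and the noise $d(n)$ add to give $y(n)$; signal processing of $y(n)$ yields an estimate $\hat{s}(n)$ of the clean speech.]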
References:
J. S. Lim and A. V. Oppenheim, "Enhancement and Bandwidth Compression of Noisy Speech," Proc. IEEE, vol. 67, no. 12, pp. 1586-1604, Dec. 1979.
J. H. L. Hansen and M. A. Clements, "Constrained Iterative Speech Enhancement with Application to Speech Recognition," IEEE Trans. Signal Processing, vol. 39, no. 4, pp. 795-805, Apr. 1991.
Applications:
- improving speech quality or intelligibility in multimedia and wireless communications
- communications between pilot and air traffic control tower
- speech recognition
Speech Enhancement Depends on:
- good signal processing technique
- human perceptual factors: speech quality and intelligibility depend on the short-term spectral amplitude and are relatively insensitive to the spectral phase
Processing of Speech Signal:
- speech is approximately stationary over a short period of time (10 ms to 20 ms)
- the signal is therefore processed frame by frame
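[Figure: a window of length $N$ slides along $y(n)$ to extract successive frames $y_m(n)$, $y_{m+1}(n)$, each defined for $0 \le n \le N-1$.]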
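The frame-by-frame processing described above can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the sampling rate, the Hamming window, the 50% overlap, and the helper name `frame_signal` are all assumptions chosen for the example.

```python
import numpy as np

def frame_signal(y, frame_len, hop, window=None):
    """Split y(n) into windowed frames y_m(n), 0 <= n <= frame_len - 1."""
    y = np.asarray(y, dtype=float)
    if window is None:
        window = np.hamming(frame_len)      # assumed window shape
    if len(y) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(y) - frame_len) // hop
    # Row m holds the samples of frame m before windowing.
    frames = np.stack([y[m * hop : m * hop + frame_len] for m in range(n_frames)])
    return frames * window

fs = 8000                           # assumed sampling rate in Hz
frame_len = round(0.020 * fs)       # 20 ms frame -> 160 samples
hop = frame_len // 2                # assumed 50% overlap between frames
y = np.random.default_rng(0).standard_normal(fs)   # 1 s placeholder signal
frames = frame_signal(y, frame_len, hop)
print(frames.shape)
```

Each row of `frames` is one short segment over which the stationarity assumption is taken to hold.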
Methods:
- Spectral Subtraction
- Wiener Filtering
- Iterative Wiener Filtering
- Improved Iterative Wiener Filtering
- Constrained Iterative Wiener Filtering
Spectral Subtraction
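Subtract the noise power spectrum from the noisy-signal power spectrum.
Assumption: the noise power spectral density (PSD) $P_d(\omega)$ is known, with $P_d(\omega) = E[\,|D(\omega)|^2\,]$.
Concept:
$$y_m(n) = s_m(n) + d_m(n)$$
$$Y_m(\omega) = S_m(\omega) + D_m(\omega)$$
$$|Y_m(\omega)|^2 = |S_m(\omega)|^2 + |D_m(\omega)|^2 + S_m(\omega) D_m^*(\omega) + S_m^*(\omega) D_m(\omega) \approx |S_m(\omega)|^2 + |D_m(\omega)|^2$$
The cross terms are dropped because they average to zero when speech and noise are uncorrelated. The speech estimate is then
$$|\hat{S}_m(\omega)|^2 = |Y_m(\omega)|^2 - P_d(\omega)$$
$$\hat{S}_m(\omega) = |\hat{S}_m(\omega)|\, e^{j \angle Y_m(\omega)}$$
$$\hat{s}_m(n) = F^{-1}\{\hat{S}_m(\omega)\}$$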
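Block Diagram:
[Figure: $y_m(n)$ -> FFT -> magnitude squared $|Y_m(\omega)|^2$ and phase. Frames flagged by the "noise frame?" decision are used to estimate $P_d(\omega)$, which is subtracted to give $|\hat{S}_m(\omega)|^2$. A square root $(\cdot)^{1/2}$ gives $|\hat{S}_m(\omega)|$, which is recombined with the phase and passed through $\mathrm{FFT}^{-1}$ to give $\hat{s}_m(n)$.]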
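The spectral subtraction chain can be sketched for a single frame as below. This is a hedged illustration under stated assumptions: the function name `spectral_subtract`, the clamping of negative power values to a floor (the slides do not say how negative differences are handled), and the synthetic sinusoid standing in for a speech frame are all choices made for the example.

```python
import numpy as np

def spectral_subtract(frame, noise_psd, floor=0.0):
    """Enhance one frame y_m(n) given a noise PSD estimate on the rfft grid."""
    Y = np.fft.rfft(frame)
    power = np.abs(Y) ** 2 - noise_psd        # |S_hat|^2 = |Y|^2 - P_d
    power = np.maximum(power, floor)          # assumed rule: clamp negative values
    S_hat = np.sqrt(power) * np.exp(1j * np.angle(Y))   # reuse the noisy phase
    return np.fft.irfft(S_hat, n=len(frame))

rng = np.random.default_rng(0)
N = 160
clean = np.sin(2 * np.pi * 0.05 * np.arange(N))      # placeholder for a speech frame
noise_frames = 0.3 * rng.standard_normal((50, N))    # frames assumed to be noise only
# Average periodogram over the noise-only frames as the estimate of P_d.
noise_psd = np.mean(np.abs(np.fft.rfft(noise_frames, axis=1)) ** 2, axis=0)
noisy = clean + 0.3 * rng.standard_normal(N)
enhanced = spectral_subtract(noisy, noise_psd)
print(np.mean((noisy - clean) ** 2), np.mean((enhanced - clean) ** 2))
```

Reusing the noisy phase follows the earlier observation that perceived quality is relatively insensitive to spectral phase.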
Wiener Filtering
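Concept: the noisy frame $y_m(n)$ is passed through a filter $h(n)$ to give the estimate $\hat{s}_m(n)$. In the frequency domain,
$$\hat{S}_m(\omega) = H(\omega)\, Y_m(\omega), \qquad H(\omega) = \frac{P_s(\omega)}{P_s(\omega) + P_d(\omega)} = \frac{P_y(\omega) - P_d(\omega)}{P_y(\omega)}$$
where $P_y(\omega) = P_s(\omega) + P_d(\omega)$ for uncorrelated speech and noise.
The Wiener filter is optimum in the sense that it minimizes the mean-square error between $s_m(n)$ and $\hat{s}_m(n)$.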
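Block Diagram:
[Figure: $P_y(\omega)$ is estimated from $y_m(n)$ and $P_d(\omega)$ from frames flagged by the "noise frame?" decision; these define $H(\omega)$, which filters $y_m(n)$ to give $\hat{s}_m(n)$.]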
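A minimal single-frame sketch of the Wiener gain follows. It assumes the frame's own periodogram $|Y_m(\omega)|^2$ stands in for $P_y(\omega)$, which is a simplification, and the names `wiener_gain` and `wiener_filter_frame` and the small constant `eps` are hypothetical choices for this illustration.

```python
import numpy as np

def wiener_gain(noisy_psd, noise_psd, eps=1e-12):
    """H = P_s / (P_s + P_d) with P_s estimated as max(P_y - P_d, 0)."""
    speech_psd = np.maximum(noisy_psd - noise_psd, 0.0)
    return speech_psd / (speech_psd + noise_psd + eps)   # eps avoids 0/0

def wiener_filter_frame(frame, noise_psd):
    """Apply the gain to one frame in the frequency domain."""
    Y = np.fft.rfft(frame)
    H = wiener_gain(np.abs(Y) ** 2, noise_psd)   # periodogram used as P_y (assumption)
    return np.fft.irfft(H * Y, n=len(frame))

rng = np.random.default_rng(1)
N = 160
clean = np.sin(2 * np.pi * 0.05 * np.arange(N))      # placeholder for a speech frame
noise_frames = 0.3 * rng.standard_normal((50, N))    # frames assumed to be noise only
noise_psd = np.mean(np.abs(np.fft.rfft(noise_frames, axis=1)) ** 2, axis=0)
noisy = clean + 0.3 * rng.standard_normal(N)
enhanced = wiener_filter_frame(noisy, noise_psd)
print(np.mean((noisy - clean) ** 2), np.mean((enhanced - clean) ** 2))
```

Because the gain is real and lies between 0 and 1, the filter only attenuates each frequency bin and leaves the noisy phase unchanged.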
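Iterative Wiener Filtering
Algorithm:
$$\hat{P}_s(\omega)_0 = P_y(\omega) - P_d(\omega), \qquad i = 0$$
repeat
$$H(\omega)_i = \frac{\hat{P}_s(\omega)_i}{\hat{P}_s(\omega)_i + P_d(\omega)}$$
$$\hat{S}_m(\omega)_{i+1} = H(\omega)_i\, Y_m(\omega)$$
$$\hat{P}_s(\omega)_{i+1} = |\hat{S}_m(\omega)_{i+1}|^2$$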
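Block Diagram:
[Figure: $y_m(n)$ -> FFT -> $H(\omega)$ -> $\hat{S}_m(\omega)$. The squared magnitude $|\cdot|^2$ of the output gives the updated $\hat{P}_s(\omega)$, which is fed back to recompute $H(\omega)$ (iterate). Noise statistics come from frames flagged by the "noise frame?" decision.]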
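The iteration can be sketched for one frame as below. The slides do not state a stopping rule, so the fixed iteration count `n_iter` is an assumption, as are the function name `iterative_wiener`, the clamping of the initial PSD estimate at zero, and the constant `eps`.

```python
import numpy as np

def iterative_wiener(frame, noise_psd, n_iter=3, eps=1e-12):
    """Repeat H_i = P_s,i / (P_s,i + P_d), S_{i+1} = H_i Y, P_s,{i+1} = |S_{i+1}|^2."""
    Y = np.fft.rfft(frame)
    # Initial speech PSD: P_y - P_d, with the frame periodogram used as P_y (assumption).
    speech_psd = np.maximum(np.abs(Y) ** 2 - noise_psd, 0.0)
    S_hat = Y
    for _ in range(n_iter):                 # assumed fixed number of passes
        H = speech_psd / (speech_psd + noise_psd + eps)
        S_hat = H * Y                       # always filter the original noisy spectrum
        speech_psd = np.abs(S_hat) ** 2     # re-estimate the speech PSD
    return np.fft.irfft(S_hat, n=len(frame))

rng = np.random.default_rng(2)
N = 160
clean = np.sin(2 * np.pi * 0.05 * np.arange(N))      # placeholder for a speech frame
noise_frames = 0.3 * rng.standard_normal((50, N))    # frames assumed to be noise only
noise_psd = np.mean(np.abs(np.fft.rfft(noise_frames, axis=1)) ** 2, axis=0)
noisy = clean + 0.3 * rng.standard_normal(N)
enhanced = iterative_wiener(noisy, noise_psd, n_iter=3)
print(np.mean((noisy - clean) ** 2), np.mean((enhanced - clean) ** 2))
```

With this plain update, low-SNR bins are attenuated further on each pass, which is one reason to constrain the speech PSD estimate with a speech model.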
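Improved Iterative Wiener Filtering
Block Diagram:
[Figure: as in the iterative Wiener filter, $P_d(\omega)$ is estimated from frames flagged by the "noise frame?" decision and $H(\omega)$ is applied to $y_m(n)$. On each iteration, LPCs $a_i$ are computed from the current speech estimate and the speech PSD is replaced by the all-pole model
$$\hat{P}_s(\omega) = \frac{G^2}{\left|1 + \sum_i a_i e^{-j\omega i}\right|^2}$$
where $G$ is the model gain.]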
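The "compute LPCs" step and the resulting all-pole $P_s(\omega)$ can be sketched as follows. This is an illustration only: it assumes the autocorrelation method solved directly with `np.linalg.solve`, the sign convention $A(z) = 1 + \sum_i a_i z^{-i}$, and a gain taken from the prediction-error energy. The helper names `lpc_coefficients` and `allpole_psd` are hypothetical.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPCs a_1..a_p for A(z) = 1 + sum_i a_i z^-i."""
    x = np.asarray(frame, dtype=float)
    # Autocorrelation at lags 0..order (unnormalized sums).
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = -r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    gain_sq = r[0] + np.dot(a, r[1:])       # prediction-error energy, used as G^2
    return a, gain_sq

def allpole_psd(a, gain_sq, n_fft):
    """P_s(w) = G^2 / |1 + sum_i a_i e^{-jwi}|^2 on the rfft frequency grid."""
    A = np.fft.rfft(np.concatenate(([1.0], a)), n=n_fft)   # zero-padded to n_fft
    return gain_sq / np.abs(A) ** 2

# Toy check: the impulse response of 1 / (1 - 0.5 z^-1) should give a_1 = -0.5.
x = 0.5 ** np.arange(64)
a, gain_sq = lpc_coefficients(x, order=1)
psd = allpole_psd(a, gain_sq, n_fft=8)
print(a, gain_sq, psd[0])
```

In an iterative scheme, `allpole_psd` would replace the raw $|\hat{S}_m(\omega)|^2$ update, so the speech PSD estimate is smoothed by the speech production model on each pass.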
Demo:
- clean speech
- speech + noise
- processed by Improved Iterative Wiener Filtering
Constrained Iterative Wiener Filtering
Apply a priori speech characteristics to impose interframe and intraframe constraints on the speech spectrum.
Summary
- Speech enhancement is important in human-to-human and human-to-machine communications.
- Two classes of speech enhancement methods were covered: spectral subtraction and Wiener filtering.
- The Wiener filter is optimum in the mean-square error sense.
- Wiener filtering with known signal and noise spectra gives an upper bound on performance.
- Imposing constraints from the speech production model and from speech characteristics produces a better estimate of the signal spectrum and hence improves performance.