Asr02 Signal
Hao Tang
Signal Analysis
[Figure: pipeline diagram with the labels "acoustic features" and "ASR system"]
Signal
−0.53 −0.32 0.02 0.44 · · · 0.18
Dithering
Removing DC offset
$$y[t] = x[t] - \frac{1}{T} \sum_{i=1}^{T} x[i]$$
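A minimal sketch of the DC offset removal above, in plain Python. The function name `remove_dc_offset` and the toy samples are illustrative assumptions, not part of the slides.

```python
def remove_dc_offset(x):
    """Subtract the mean of all samples, so the result averages to zero."""
    mean = sum(x) / len(x)          # (1/T) * sum of x[i]
    return [sample - mean for sample in x]


# Toy signal (assumed values) that sits on a DC offset of 1.0.
x = [0.5, 1.5, 1.0, 1.0]
y = remove_dc_offset(x)
print(y)
```

After the subtraction the samples sum to zero, which is what "no DC offset" means here.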
Pre-emphasis
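The slide gives no formula for pre-emphasis. The sketch below assumes the common first-order form y[t] = x[t] − α x[t−1]; the coefficient α = 0.97 and the function name `pre_emphasis` are assumptions for illustration only.

```python
def pre_emphasis(x, alpha=0.97):
    """Assumed filter: y[t] = x[t] - alpha * x[t-1].

    The first sample has no predecessor, so it is passed through unchanged.
    """
    if not x:
        return []
    return [x[0]] + [x[t] - alpha * x[t - 1] for t in range(1, len(x))]


# A constant (purely low-frequency) signal is almost cancelled after the first sample.
y = pre_emphasis([1.0, 1.0, 1.0, 1.0])
print(y)
```

How the first sample is handled is a design choice; passing it through unchanged is only one option.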
$$X[k] = \sum_{t=0}^{T-1} x[t]\, e^{-i 2\pi t k / T} \quad \text{for } k = 0, \ldots, T-1, \text{ and } i = \sqrt{-1}$$
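$$X[k] = \begin{bmatrix} e^{i 2\pi k \cdot 0 / T} & e^{i 2\pi k \cdot 1 / T} & \cdots & e^{i 2\pi k \cdot (T-1) / T} \end{bmatrix}^{*} \begin{bmatrix} x[0] \\ x[1] \\ \vdots \\ x[T-1] \end{bmatrix}$$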
$$e^{i\theta} = \cos\theta + i \sin\theta$$
[Figure: plots of $\Re\{v_0\}$, $\Re\{v_1\}$, $\Re\{v_2\}$, $\Re\{v_3\}$, $\Re\{v_4\}$]
$$X[k] = \sum_{t=0}^{T-1} x[t]\, e^{-i 2\pi t k / T} = v_k^{*} x$$
$$X = \mathcal{F}\{x\}$$
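The definition above can be sketched directly as one inner product per frequency index k. This is a plain-Python illustration with quadratic cost, not an FFT. The function name `dft` and the test signal are assumptions.

```python
import cmath


def dft(x):
    """Naive DFT: X[k] is the inner product of the conjugated v_k with x."""
    T = len(x)
    X = []
    for k in range(T):
        # v_k[t] = e^{i 2 pi k t / T}
        v_k = [cmath.exp(2j * cmath.pi * k * t / T) for t in range(T)]
        # Conjugating v_k turns each entry into e^{-i 2 pi k t / T}.
        X.append(sum(v.conjugate() * sample for v, sample in zip(v_k, x)))
    return X


# One cycle of a cosine over T = 4 samples: cos(2 pi t / 4) for t = 0..3.
x = [1.0, 0.0, -1.0, 0.0]
X = dft(x)
print([round(abs(value), 6) for value in X])  # magnitudes are about 0, 2, 0, 2
```

The energy lands in bins k = 1 and k = T − 1 = 3 because a real cosine is the sum of two complex exponentials, one at each of those indices.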
Linearity
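The slide only names the property. Assuming it refers to the standard statement F{a x + b y} = a F{x} + b F{y}, the sketch below checks it numerically with a naive DFT. The signals and the scalars `a`, `b` are arbitrary illustrative values.

```python
import cmath


def dft(x):
    """Naive DFT following X[k] = sum_t x[t] e^{-i 2 pi t k / T}."""
    T = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / T) for t in range(T))
            for k in range(T)]


x = [0.5, -1.0, 2.0, 0.25]
y = [1.0, 3.0, -0.5, 0.0]
a, b = 2.0, -3.0

# Left side: DFT of the combined signal. Right side: combination of the DFTs.
lhs = dft([a * xt + b * yt for xt, yt in zip(x, y)])
rhs = [a * Xk + b * Yk for Xk, Yk in zip(dft(x), dft(y))]

max_error = max(abs(l - r) for l, r in zip(lhs, rhs))
print(max_error)  # only floating-point rounding error remains
```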
Shift Theorem
For $y[t] = x[t-1]$, with the shift taken circularly so that $x[-1] = x[T-1]$:

$$\begin{aligned}
Y[k] &= \sum_{t=0}^{T-1} y[t]\, e^{-i 2\pi t k / T} \\
&= \sum_{t=0}^{T-1} x[t-1]\, e^{-i 2\pi t k / T} \\
&= e^{-i 2\pi k / T} \sum_{t=0}^{T-1} x[t-1]\, e^{-i 2\pi (t-1) k / T} \\
&= e^{-i 2\pi k / T}\, X[k]
\end{aligned}$$
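A numerical check of the shift theorem, assuming a circular one-sample delay y[t] = x[(t − 1) mod T]. Because e^{−i2πtk/T} = e^{−i2πk/T} · e^{−i2π(t−1)k/T}, the factor pulled out of the sum is e^{−i2πk/T}, and that is what the sketch tests. The signal values are arbitrary.

```python
import cmath


def dft(x):
    """Naive DFT following X[k] = sum_t x[t] e^{-i 2 pi t k / T}."""
    T = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / T) for t in range(T))
            for k in range(T)]


x = [0.5, -1.0, 2.0, 0.25, 1.5]
T = len(x)

# Circular delay by one sample: y[0] wraps around to x[T-1].
y = [x[(t - 1) % T] for t in range(T)]

X, Y = dft(x), dft(y)
predicted = [cmath.exp(-2j * cmath.pi * k / T) * X[k] for k in range(T)]

max_error = max(abs(Y[k] - predicted[k]) for k in range(T))
print(max_error)  # only floating-point rounding error remains
```

A delay changes only the phase of each X[k]; the magnitudes |Y[k]| and |X[k]| are equal.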
DFT of pre-emphasis
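The slide gives no formula here. Assuming pre-emphasis has the common form y[t] = x[t] − α x[t−1] with a circular shift, linearity and the shift theorem together give Y[k] = (1 − α e^{−i2πk/T}) X[k]. The sketch below checks this numerically and prints the gain at the lowest and highest frequency. The value α = 0.97 and the signal are assumptions.

```python
import cmath


def dft(x):
    """Naive DFT following X[k] = sum_t x[t] e^{-i 2 pi t k / T}."""
    T = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / T) for t in range(T))
            for k in range(T)]


alpha = 0.97  # assumed pre-emphasis coefficient
x = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 0.0, 1.0]
T = len(x)

# Assumed pre-emphasis with a circular shift, so the shift theorem applies exactly.
y = [x[t] - alpha * x[(t - 1) % T] for t in range(T)]

X, Y = dft(x), dft(y)
H = [1 - alpha * cmath.exp(-2j * cmath.pi * k / T) for k in range(T)]
max_error = max(abs(Y[k] - H[k] * X[k]) for k in range(T))

gain_low = abs(H[0])         # k = 0: about 1 - alpha = 0.03
gain_high = abs(H[T // 2])   # k = T/2: about 1 + alpha = 1.97
print(max_error, gain_low, gain_high)
```

Under this assumption the filter attenuates low frequencies and boosts high ones.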
Speech is non-stationary.
Extract spectra with a sliding window, typically a 25 ms window with a 10 ms hop.
Display the spectra as a heat map.
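The sliding-window steps above can be sketched in plain Python. The 8 kHz sample rate, the 1 kHz test tone, and the function names `frame_signal` and `magnitude_spectrum` are assumptions. The DFT is computed naively for clarity; in practice an FFT routine such as `numpy.fft.rfft` would be used instead.

```python
import cmath
import math


def frame_signal(x, sample_rate, window_ms=25, hop_ms=10):
    """Cut x into overlapping frames; incomplete trailing frames are dropped."""
    window = int(sample_rate * window_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    return [x[start:start + window]
            for start in range(0, len(x) - window + 1, hop)]


def magnitude_spectrum(frame):
    """|X[k]| for k = 0..T/2, which is enough for a real-valued frame."""
    T = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * t * k / T)
                    for t in range(T)))
            for k in range(T // 2 + 1)]


sample_rate = 8000  # Hz, assumed for this sketch
# 0.1 s of a 1 kHz sine as a stand-in for speech samples.
x = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]

frames = frame_signal(x, sample_rate)        # 200-sample window, 80-sample hop
spectrogram = [magnitude_spectrum(f) for f in frames]

# Rows are time (frames), columns are frequency bins spaced sample_rate / 200 = 40 Hz apart.
peak_bins = [row.index(max(row)) for row in spectrogram]
print(len(spectrogram), len(spectrogram[0]), peak_bins[0])
```

Each row of `spectrogram` is one spectrum. Passing the rows to a plotting library as an image, with time on one axis and frequency on the other, gives the heat map.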
DFT