Using HTK
Chi-Yueh Lin
2006/07/17
HTK
Before using HTK, prepare:
– The speech data (Speech Corpus)
– The corresponding annotation (Transcription/Label files)
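HTK
Training tools
– HCopy – feature extraction
– HInit & HRest – training from time-aligned annotation (Label)
– HCompV & HERest – training from unaligned annotation (Transcription)
– HHEd – editing HMM definitions, e.g. increasing the number of mixtures
Recognition tools
– HParse – building the recognition network (WordNet) from a grammar
– HVite – recognition (Viterbi decoding)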
HCopy - abstract
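HCopy – config file (1)(2)
NATURALREADORDER=TRUE
NATURALWRITEORDER=TRUE
– Read and write HTK files in the machine's natural byte order; set to TRUE on little-endian machines such as x86 PCs
SOURCEFORMAT=NOHEAD
– The input audio files have no header (raw samples)
SOURCEKIND=WAVEFORM
– The input is a waveform
TARGETKIND=MFCC_E_D_A
– Output MFCC with energy (E), delta (D) and delta-delta (A) coefficients
– The output files are in HTK format, so later tools read them with SOURCEFORMAT=HTK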
HCopy – config file (3)
SOURCERATE=1250
– Sampling period of the input: 0.125 ms (8 kHz sampling)
TARGETRATE=100000
– Frame period: 10 ms
WINDOWSIZE=250000
– Frame (window) length: 25 ms
In HTK, all times are expressed in units of 100 ns
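The three values above follow directly from the 100 ns time unit. A minimal sketch of the arithmetic, in which the helper names are illustrative and not part of HTK:

```python
HTK_UNITS_PER_SECOND = 10_000_000  # one HTK time unit is 100 ns


def hz_to_htk_period(sample_rate_hz):
    """Sampling rate in Hz -> sample period in 100 ns units (SOURCERATE)."""
    return HTK_UNITS_PER_SECOND / sample_rate_hz


def ms_to_htk(milliseconds):
    """Duration in milliseconds -> 100 ns units (TARGETRATE, WINDOWSIZE)."""
    return milliseconds * 10_000


print(hz_to_htk_period(8000))  # SOURCERATE for 8 kHz audio
print(ms_to_htk(10))           # TARGETRATE for a 10 ms frame period
print(ms_to_htk(25))           # WINDOWSIZE for a 25 ms window
```

For audio sampled at another rate, only SOURCERATE changes; the frame period and window length stay the same.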
HCopy – config file (4)
ZMEANSOURCE=T
– Make the source waveform zero mean, removing any DC offset
USEHAMMING=T
– Apply a Hamming window
PREEMCOEF=0.97
– Pre-emphasis coefficient set to 0.97
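To make these three settings concrete, here is a simplified sketch of what they do to one frame of samples. It is not HTK's code: the function names are illustrative, and the scaling of the first sample by (1 - k) is an assumption about edge handling.

```python
import math


def hamming(n):
    """Hamming window of length n: 0.54 - 0.46*cos(2*pi*i/(n-1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def preprocess_frame(frame, preem=0.97):
    """Zero-mean, pre-emphasise and window a single frame of samples."""
    # ZMEANSOURCE: subtract the frame mean to remove a DC offset.
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]
    # PREEMCOEF: y[n] = x[n] - k*x[n-1]; first sample scaled (assumption).
    y = [x[0] * (1.0 - preem)] + [x[i] - preem * x[i - 1] for i in range(1, len(x))]
    # USEHAMMING: taper the frame edges.
    return [a * b for a, b in zip(y, hamming(len(y)))]
```

A frame that consists only of a constant offset comes out as all zeros, which is the point of the zero-mean step.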
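HCopy – config file (5)
NUMCHANS=31
– The Mel filterbank has 31 channels
USEPOWER=F
– Filterbank analysis uses the magnitude spectrum rather than the power spectrum
NUMCEPS=13
– Use 13 MFCC coefficients; with the E qualifier this gives 14 static features and, with D and A, 42 in total (a 39-dimensional vector corresponds to NUMCEPS=12)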
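The size of the resulting feature vector depends on both NUMCEPS and the qualifiers in TARGETKIND. A rough sketch of the bookkeeping follows; the function name is illustrative, and only the E, 0, D and A qualifiers are handled (others, such as N, are ignored here).

```python
def feature_dim(num_ceps, target_kind):
    """Feature vector size implied by NUMCEPS and a TARGETKIND string."""
    qualifiers = target_kind.split("_")[1:]
    static = num_ceps
    if "E" in qualifiers:   # log energy appended
        static += 1
    if "0" in qualifiers:   # c(0) appended
        static += 1
    # One copy of the static features, plus one each for deltas and accelerations.
    copies = 1 + ("D" in qualifiers) + ("A" in qualifiers)
    return static * copies


print(feature_dim(13, "MFCC_E_D_A"))
print(feature_dim(12, "MFCC_E_D_A"))
```

The value computed here has to agree with the <VecSize> declared in the HMM prototype.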
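HCopy – config file (6)
ENORMALISE=T
– Normalise the log energy of each utterance
LOFREQ=200
– Lower cutoff frequency of the filterbank, in Hz
HIFREQ=3500
– Upper cutoff frequency of the filterbank, in Hz; for speech sampled at 8000 Hz it cannot exceed 4000 Hz
DELTAWINDOW=2
ACCWINDOW=2
– Window size parameters for the delta and delta-delta computation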
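The window parameters control a regression over neighbouring frames. The sketch below applies the usual regression formula, d_t = sum over θ of θ(c_{t+θ} - c_{t-θ}) divided by 2·sum of θ², to a single coefficient track. The function name is illustrative, and replicating the first and last frame at the edges is an assumption about boundary handling.

```python
def deltas(seq, window=2):
    """Regression-based delta coefficients for one coefficient track."""
    denom = 2 * sum(th * th for th in range(1, window + 1))
    n = len(seq)
    out = []
    for t in range(n):
        acc = 0.0
        for th in range(1, window + 1):
            ahead = seq[min(t + th, n - 1)]   # replicate last frame at the end
            behind = seq[max(t - th, 0)]      # replicate first frame at the start
            acc += th * (ahead - behind)
        out.append(acc / denom)
    return out


ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
d = deltas(ramp, window=2)   # DELTAWINDOW=2
a = deltas(d, window=2)      # ACCWINDOW=2: deltas of the deltas
```

Away from the edges, a linear ramp with slope 1 has a delta of 1 and a delta-delta of 0.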
HCopy – config file (7)
SAVECOMPRESSED=F
– Whether HTK saves the output files in compressed form; set to False
SAVEWITHCRC=F
– Whether HTK appends a CRC checksum to the output files; set to False
HCopy – script file
– Each line of the script file lists one source file followed by its target file
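A script file of this kind is usually generated rather than typed. A small sketch, assuming the feature files keep the base name of each audio file; the .mfc extension, the directory names and the function name are hypothetical.

```python
from pathlib import PurePosixPath


def hcopy_script_lines(wav_paths, out_dir):
    """Build 'source target' lines, one per audio file."""
    lines = []
    for wav in wav_paths:
        stem = PurePosixPath(wav).stem          # file name without extension
        lines.append(f"{wav} {out_dir}/{stem}.mfc")
    return lines


for line in hcopy_script_lines(["wav/a01.wav", "wav/a02.wav"], "mfc"):
    print(line)
```

The lines can then be written to the script file that is passed to HCopy.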
HCopy - Usage
– $ HCopy -C config -S script
– config is the configuration file and script is the script file described above
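HMM - definition
Define an HMM prototype
~o defines the global observation options
<VecSize> 39
– The feature vector has 39 dimensions
<MFCC_0_D_A>
– The parameter kind must match the TARGETKIND set in the HCopy config (MFCC_E_D_A in the config shown earlier)
~h "proto" names the HMM
– proto is the prototype HMM; save the definition in a file named proto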
HMM - definition
Notes
1. In the prototype, the means are set to 0.0 and the variances to 1.0
2. Transitions that are not allowed have probability 0.0
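For reference, a prototype file with this header can be produced by a few lines of code. This is a sketch only: the 5-state left-to-right topology (3 emitting states), the 0.6/0.4 transition values and the function name are assumptions chosen for illustration, not values taken from the slides.

```python
def make_proto(vec_size=39, kind="MFCC_0_D_A", num_states=5, name="proto"):
    """Return the text of a single-Gaussian, left-to-right HMM prototype."""
    lines = [
        f"~o <VecSize> {vec_size} <{kind}>",
        f'~h "{name}"',
        "<BeginHMM>",
        f"<NumStates> {num_states}",
    ]
    zeros = " ".join(["0.0"] * vec_size)
    ones = " ".join(["1.0"] * vec_size)
    # States 1 and num_states are non-emitting entry and exit states.
    for state in range(2, num_states):
        lines += [f"<State> {state}",
                  f"<Mean> {vec_size}", zeros,
                  f"<Variance> {vec_size}", ones]
    lines.append(f"<TransP> {num_states}")
    for i in range(num_states):
        row = [0.0] * num_states
        if i == 0:
            row[1] = 1.0                      # entry state jumps to state 2
        elif i < num_states - 1:
            row[i], row[i + 1] = 0.6, 0.4     # self-loop or move right (assumed values)
        lines.append(" ".join(f"{p:.1f}" for p in row))
    lines.append("<EndHMM>")
    return "\n".join(lines)


print(make_proto())
```

The numeric values in a prototype only fix the model structure; they are replaced once training starts.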
HMM - Training
The training data can be annotated in two ways
– Label: each segment has a start time, an end time and a name
0 180000 sil
180000 450000 voc
450000 610000 voc
– Transcription: only the sequence of names, without times
sil
voc
voc
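The relation between the two forms is simple: a Transcription is a Label with the times removed. A short sketch, with illustrative function names, that parses label lines, converts the 100 ns times to seconds and derives the transcription:

```python
def parse_label_line(line):
    """Split 'start end name' into (int, int, str); times are in 100 ns units."""
    start, end, name = line.split()
    return int(start), int(end), name


def htk_to_seconds(t):
    """Convert a time in 100 ns units to seconds."""
    return t / 10_000_000


def labels_to_transcription(label_lines):
    """Drop the time stamps, keeping only the sequence of names."""
    return [parse_label_line(line)[2] for line in label_lines if line.strip()]


labels = ["0 180000 sil", "180000 450000 voc", "450000 610000 voc"]
print(labels_to_transcription(labels))
print(htk_to_seconds(180000))
```

The reverse direction is not possible without extra information, which is why the two forms call for different training tools.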
HMM - Training
With Label files
– Train with HInit + HRest
With Transcription files
– Train with HCompV + HERest
Why?
– HInit and HRest need the segment boundaries given in Label files, whereas HCompV and HERest only need a Transcription
HMM - Training
HCompV
– Used in HTK for a flat start
– Computes the global mean and variance of the training data and assigns them to every Gaussian of the HMM
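The idea behind a flat start can be shown in a few lines: every state begins with the same statistics, computed over all training frames. This sketch illustrates the idea only, not HCompV's implementation; the function names and the dictionary layout are assumptions.

```python
def global_mean_variance(frames):
    """Per-dimension mean and variance over all feature frames."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dim)]
    return mean, var


def flat_start(frames, num_emitting_states=3):
    """Give every emitting state the same global mean and variance."""
    mean, var = global_mean_variance(frames)
    return [{"mean": list(mean), "variance": list(var)}
            for _ in range(num_emitting_states)]


states = flat_start([[1.0, 2.0], [3.0, 6.0]])
print(states[0])
```

Because all states start out identical, the later re-estimation passes are what make them differ.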
HMM - Training
HCompV
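– $ HCompV -C config -S script -M dir1 -l aa -o aa -I label.mlf proto
– config contains
SOURCEFORMAT=HTK
SOURCEKIND=MFCC_E_D_A
– script lists the training feature files
– -l aa -o aa: use the segments labelled aa and name the output HMM aa
– -M dir1: directory in which the output HMM is stored
– -I: the master label file; use -L instead to give a directory of label files
– proto: the HMM prototype defined earlier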
HMM - Training
HERest
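– $ HERest -C config -S script -I label.mlf -d dir1 -M dir2 hmmlist
– -I: the master label file; use -L instead to give a directory of label files
– -d: directory holding the input HMMs (for example, those produced by HCompV)
– -M: directory in which the re-estimated HMMs are stored
– hmmlist: file listing the names of all HMMs
– The re-estimated HMMs are written to dir2 as an HMM macro file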
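The HCompV and HERest commands above are normally chained: one HCompV call per HMM, then several HERest passes, each reading the previous output directory. The sketch below only builds the argument lists and runs nothing. The function names and the number of passes are assumptions, and directory names after dir2 follow the slides' naming pattern but are hypothetical.

```python
def hcompv_cmd(phone, out_dir="dir1"):
    """Argument list for a flat start of one HMM, as in the HCompV slide."""
    return ["HCompV", "-C", "config", "-S", "script", "-M", out_dir,
            "-l", phone, "-o", phone, "-I", "label.mlf", "proto"]


def herest_cmd(in_dir, out_dir):
    """Argument list for one embedded re-estimation pass."""
    return ["HERest", "-C", "config", "-S", "script", "-I", "label.mlf",
            "-d", in_dir, "-M", out_dir, "hmmlist"]


def training_plan(phones, passes=3):
    """Flat start for every phone, then `passes` rounds of HERest."""
    plan = [hcompv_cmd(p) for p in phones]
    for i in range(1, passes + 1):
        plan.append(herest_cmd(f"dir{i}", f"dir{i + 1}"))
    return plan


for cmd in training_plan(["aa", "sil"], passes=2):
    print(" ".join(cmd))
```

Each list can be handed to subprocess.run once the output directories exist.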
HMM – mixture incrementing
HHEd
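– Edit the command file split.hed
– MU 16 {*.state[2-4].mix}
For every HMM, increase states 2 to 4 to 16 mixtures
– MU 8 {aa.state[2-4].mix}
For the HMM aa, increase states 2 to 4 to 8 mixtures
– $ HHEd -M mix2 -w newHMM -d mix1 split.hed hmmlist
-M mix2: store the edited HMMs in the directory mix2
-w newHMM: name of the file holding the edited HMMs
-d mix1: read the original HMMs from the directory mix1
hmmlist: file listing the names of all HMMs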
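The MU lines can be generated programmatically, which helps when the number of mixtures is raised in several steps. A small sketch follows; the function name is illustrative, and the doubling schedule is a common practice assumed here, not something the slides prescribe.

```python
def mu_command(num_mix, hmm="*", first_state=2, last_state=4):
    """One HHEd 'MU' line raising the mixture count of the given states."""
    return f"MU {num_mix} {{{hmm}.state[{first_state}-{last_state}].mix}}"


# A possible schedule: one edit file per step, retraining in between.
for n in (2, 4, 8, 16):
    print(mu_command(n))

print(mu_command(8, hmm="aa"))
```

Each generated line would go into its own edit file, with re-estimation run before the next increase.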
HMM – mixture incrementing
HHEd
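– Running the command above produces newHMM
– The parameters of the new mixtures still have to be trained
HERest
– Re-estimate with HERest repeatedly until the models converge
– The final HMM set is stored as HMM.model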
Recognition – Dictionary & WordNet
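The Dictionary tells HTK which HMMs make up each recognition unit
– Edit phone.dic
– Each entry gives a word followed by its HMM sequence; for example, the word ai consists of the single HMM ai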
SYM_1 SYM_2 … SYM_N SYM_SHOW
Recognition – Dictionary & WordNet
The WordNet defines the grammar used during recognition
– A simple WordNet can be written by hand
– A complex WordNet is built with HParse from a grammar file such as phone.syn
$ci_phone =
a |
ai |
an ;
( SENT-START < $ci_phone [sil] > SENT-END )
– The symbols used in the WordNet must also appear in the Dictionary
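With a realistic phone set the list of alternatives gets long, so the grammar file is easier to generate than to type. A sketch that reproduces the grammar above from a list of phones; the function name is illustrative. HParse takes the grammar file and the output network file as its two arguments, for example HParse phone.syn phone.net.

```python
def phone_loop_grammar(phones, var="ci_phone"):
    """Grammar text for a free phone loop with optional silence."""
    alternatives = " |\n    ".join(phones)
    return (f"${var} = {alternatives} ;\n"
            f"( SENT-START < ${var} [sil] > SENT-END )\n")


print(phone_loop_grammar(["a", "ai", "an"]))
```

In this notation, angle brackets mean one or more repetitions and square brackets mark an optional item.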
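HVite recognition
– $ HVite -C config -w phone.net -H HMM.model phone.dic hmmlist XXX.wav
– The recognition result is written to XXX.rec; each line gives the start time, end time, recognised unit and score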
0 700000 sil -428.991882
700000 1100000 b -276.103149
1100000 1700000 t -493.964966
1700000 3000000 weng -1099.999023
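The .rec lines are easy to post-process. A minimal sketch that reads them into records with times in seconds; the function name and field names are illustrative.

```python
def parse_rec(lines):
    """Parse 'start end label score' lines; times are in 100 ns units."""
    segments = []
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        segments.append({
            "start_s": int(parts[0]) / 1e7,
            "end_s": int(parts[1]) / 1e7,
            "label": parts[2],
            "score": float(parts[3]),
        })
    return segments


rec = [
    "0 700000 sil -428.991882",
    "700000 1100000 b -276.103149",
    "1100000 1700000 t -493.964966",
    "1700000 3000000 weng -1099.999023",
]
for seg in parse_rec(rec):
    print(seg["start_s"], seg["end_s"], seg["label"], seg["score"])
```

Summing the score field gives a total score for the utterance.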
Recognition - WaveSurfer
[Figure: diagram with the labels XXX.wav, XXX.txt, WordNet, HMM Model, HMM List and Dictionary]