HMM Toolkit (HTK) : Presentation by Daniel Whiteley AME Department
What is HTK?
The Hidden Markov Model Toolkit (HTK) is a
portable toolkit for building and manipulating
hidden Markov models. HTK is primarily used
for speech recognition research although it has
been used for numerous other applications
including research into speech synthesis,
character recognition and DNA sequencing. HTK
is in use at hundreds of sites worldwide.
What is HTK?
HTK consists of a set of library modules and tools
available in C source form. The tools provide
sophisticated facilities for speech analysis, HMM
training, testing and results analysis. The software
supports HMMs using both continuous density
mixture Gaussians and discrete distributions and
can be used to build complex HMM systems.
Basic HTK command format
●
The commands in HTK follow a basic command
line format:
HCommand [options] files
●
Options are indicated by a dash followed by the
option letter. Universal options are capital letters.
●
HTK does not rely on file extensions; it determines a
file's format from its header.
Configuration files
●
You can also configure HTK modules using config files.
Pass one to a tool with the -C option, or apply it globally
with the command setenv HCONFIG myconfig, where
myconfig is your own config file.
●
All possible configuration variables can be found in
chapter 18 of the HTK manual. However, for most of
our purposes, we only need to create a config file with
these lines:
SOURCEKIND = USER # User-defined file format (not sound)
TARGETKIND = ANON_D # Same kind as the source, with delta coefficients (_D) appended
Using HTK
●
Parts of HMM modeling
– Data Preparation
– Model Training
– Pattern Recognition
– Model Analysis
Data Preparation
●
One small problem:
– HTK was tailored for speech recognition, so most of
the data preparation tools are for audio.
– Because of this, we need to adapt our data to the HTK
parameterized data file format.
●
HTK parameter files consist of a sequence of samples
preceded by a header. The samples are simply data
vectors, whose components are 2-byte integers or 4-byte
floating point numbers.
●
For us, these vectors will be a sequence of joint angles
received from a motion capture session.
HTK file format
●
The file begins with a 12-byte header containing
the following information:
– nSamples (4-byte int): Number of samples
– samplePeriod (4-byte int): Sample period in units of 100 ns
– sampleSize (2-byte int): Number of bytes per vector
– parameterKind (2-byte int): Defines the type of data
●
For our purposes, the basic parameter kind will be either
9 (USER), the user-defined parameter kind, or 10 (DISCRETE),
the discrete case.
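As a rough illustration of this layout, the Python sketch below packs a list of float vectors behind a 12-byte header and reads them back. The function names write_htk and read_htk are hypothetical, big-endian byte order is assumed (HTK's default), and the parameter-kind code is simply passed in by the caller.

```python
import struct

# Assumed big-endian layout: nSamples (int32), samplePeriod (int32),
# sampleSize (int16), parameterKind (int16) = 12 bytes in total.
HEADER = struct.Struct(">iihh")


def write_htk(path, vectors, sample_period, parm_kind):
    """Write equal-length float vectors as an HTK-style parameter file."""
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    with open(path, "wb") as f:
        # sampleSize is the number of bytes per vector: 4 bytes per float.
        f.write(HEADER.pack(len(vectors), sample_period, 4 * dim, parm_kind))
        for v in vectors:
            f.write(struct.pack(">%df" % dim, *v))


def read_htk(path):
    """Read the file back; returns (vectors, sample_period, parm_kind)."""
    with open(path, "rb") as f:
        n_samples, period, size, kind = HEADER.unpack(f.read(HEADER.size))
        dim = size // 4
        vectors = [
            list(struct.unpack(">%df" % dim, f.read(size)))
            for _ in range(n_samples)
        ]
    return vectors, period, kind
```

For motion capture data sampled every 10 ms, sample_period would be 100000, since the period is counted in units of 100 ns.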
HMM model creation
●
In order to model the motion capture sequence, we need
to create a prototype of the HMM. In this prototype, the
values of B and π are arbitrary. The same is true for the
transition matrix A, except that any transition probability
you set to zero will remain zero.
●
Models are created using a scripting language similar to
HTML.
●
Models in HTK also have a beginning and an ending
state, both of which are non-emitting. These states are
not defined in the script.
HMM Model Example
~h "prototype"
<BeginHMM>
<VectorSize> 4 <USER>
<NumStates> 5
<State> 2 <NumMixes> 3
<Mixture> 1 0.3
<Mean> 4
0.0 0.0 0.0 0.0
<Variance> 4
1.0 1.0 1.0 1.0
<Mixture> 2 0.4 ...
<State> 3 ...
...
<TransP>
0.0 0.4 0.3 0.3 0.0
0.0 0.2 0.5 0.3 0.0
0.0 0.2 0.2 0.4 0.2
0.0 0.1 0.2 0.3 0.4
0.0 0.0 0.0 0.0 0.0
●
Annotations:
– ~h "prototype": name of the file
– <VectorSize>: sample size
– <NumStates>: number of states
– <NumMixes>: number of Gaussian distributions
– <Mixture> 1 0.3: the distribution's ID and weight
– <Mean>: mean observation vector
– <Variance>: diagonal of the covariance matrix
– <TransP>: transition matrix A
– Last row of <TransP>: all the transition probabilities for
the ending state are always zero
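Before handing a prototype to HTK, it can help to sanity-check the transition matrix. The sketch below is only an illustration (check_transitions is a hypothetical helper, not an HTK tool): it verifies that every row except the last sums to one and that the row for the ending state is all zeros.

```python
def check_transitions(A, tol=1e-6):
    """Return a list of problems found in a square transition matrix."""
    n = len(A)
    problems = []
    for i, row in enumerate(A):
        if len(row) != n:
            problems.append("row %d has %d entries, expected %d"
                            % (i + 1, len(row), n))
    if problems:
        return problems
    # Every state except the ending state must have outgoing
    # probabilities that sum to one.
    for i, row in enumerate(A[:-1]):
        if abs(sum(row) - 1.0) > tol:
            problems.append("row %d sums to %g, expected 1" % (i + 1, sum(row)))
    # The ending state has no outgoing transitions.
    if any(p != 0.0 for p in A[-1]):
        problems.append("the ending state's row must be all zeros")
    return problems


# The 5-state transition matrix from the example prototype.
A = [
    [0.0, 0.4, 0.3, 0.3, 0.0],
    [0.0, 0.2, 0.5, 0.3, 0.0],
    [0.0, 0.2, 0.2, 0.4, 0.2],
    [0.0, 0.1, 0.2, 0.3, 0.4],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
print(check_transitions(A))
```

An empty list means no problems were found. Entries set to 0.0 here are the transitions that training will never re-enable.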
Vector Quantization
●
In order to reduce computation, we can make the
HMM discrete.
●
In order to use a discrete HMM, we must first
quantize the data into a set of standard vectors.
●
Warning: quantizing the data inherently
introduces error.
●
Before quantizing the data, we must first have a
standard set of vectors, or a "vector codebook".
This is made with HQuant.
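To make the quantization step and its error concrete, here is a minimal sketch that maps one data vector to its nearest standard vector. The function name quantize is hypothetical, squared Euclidean distance is an assumption, and the returned index is 0-based, which need not match HTK's own symbol numbering.

```python
def quantize(vector, codebook):
    """Return (index, squared_error) of the nearest standard vector."""
    best_index, best_error = 0, float("inf")
    for index, entry in enumerate(codebook):
        # Squared Euclidean distance between the data and this entry.
        error = sum((x - c) ** 2 for x, c in zip(vector, entry))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error
```

The squared error returned alongside the index is the information lost when the vector is replaced by its discrete symbol.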
HQuant
●
HQuant takes the training data and uses a K-means
algorithm to partition it, then takes the centroids of
these partitions as our quantization vectors (QVs).
●
The parts of a sample HQuant command:
– Use the configuration variables found in config
– Number of QVs for a certain data stream
– You can use a script to list all of your training files
– Our codebook will be written to this file
●
Note: The reference labels and the results labels
must have different file extensions
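Returning to the K-means step that HQuant performs: the idea can be sketched in a few lines of plain Python. This is a generic K-means under stated assumptions (random initial centroids, squared Euclidean distance, a fixed number of iterations), not HQuant's actual implementation, and the function name kmeans is hypothetical.

```python
import random


def kmeans(data, k, iterations=20, seed=0):
    """Return k centroids for a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    # Start from k distinct training vectors chosen at random.
    centroids = [list(v) for v in rng.sample(data, k)]
    for _ in range(iterations):
        # Assignment step: put each vector in its nearest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(
                range(k),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(v, centroids[j])),
            )
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids
```

The returned centroids play the role of the quantization vectors: each training vector is later replaced by the index of its nearest centroid.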