


INTERSPEECH 2005

A Speaker Independent Continuous Speech Recognizer for Amharic

Hussien Seid
Computer Science & Information Technology
Arba Minch University
PO Box 21, Arba Minch, Ethiopia
[email protected]

Björn Gambäck
Userware Laboratory
Swedish Institute of Computer Science AB
Box 1263, SE-164 29 Kista, Sweden
[email protected]

Abstract

The paper discusses an Amharic speaker independent continuous speech recognizer based on an HMM/ANN hybrid approach. The model was constructed at a context dependent phone part sub-word level with the help of the CSLU Toolkit. A promising result of 74.28% word and 39.70% sentence recognition rate was achieved. These are the best figures reported so far for speech recognition for the Amharic language.

1. Introduction

The general objective of the present research was to examine and demonstrate the performance of a hybrid HMM/ANN system for a speaker independent continuous Amharic speech recognizer. Amharic is the official language of communication for the federal government of Ethiopia and is today probably the second largest language in the country (after Oromo) and quite possibly one of the five largest on the African continent. It is estimated to be the mother tongue of more than 17 million people, with at least an additional 5 million second language speakers. Still, just as for many other African languages, Amharic has received preciously little attention from the speech processing research community; even though recent years have seen an increasing trend to investigate applying speech technology to languages other than English, most of the work is still done on very few, mainly European and East-Asian, languages.

The Ethiopian culture is ancient, and so are the written languages of the area, with Amharic using its very own script. This has caused some problems in the digital age: even though there are several computer fonts for Amharic, and an encoding of Amharic was incorporated into Unicode in 2000, the language still has no widely accepted computer representation. In recent years there has been an increasing awareness that Amharic speech and language processing resources must be created, as well as digital information access and storage.

The present paper is a step in that direction. It is laid out as follows: Section 2 introduces the HMM/ANN hybrid ASR paradigm. Section 3 discusses various aspects of Amharic and some previous efforts to apply speech technology to the language. Then Section 4 describes the actual experiments with constructing, evaluating, and testing an Amharic Automatic Speech Recognition system using the CSLU Toolkit [1].

2. HMM/ANN hybrids

Commonly, HMM-based speech recognizers have shown the best performance. On the positive side, this dominant paradigm is based on a rich mathematical framework which allows for powerful learning and decoding methods. In particular, HMMs are excellent at treating temporal aspects by providing good abstractions for sequences and a flexible topology for statistical phonology and syntax. However, HMMs have some drawbacks, especially for large vocabulary speaker independent continuous ASR. The main disadvantage is a relatively poor discrimination power. In addition, HMMs enforce some practical requirements for distributional assumptions (e.g., uncorrelated features within an acoustic vector) and typically make first order Markov model assumptions for phone or sub-phone states, while ignoring the correlation between acoustic vectors [2].

In effect, HMMs adopt a hierarchical scheme, modeling a sentence as a sequence of words, and each word as a sequence of sub-word units. An HMM can be defined as a stochastic finite state automaton, usually with a left-to-right topology when used for speech. Each probability is approximated using maximum likelihood techniques. Still, these techniques have been observed to give poor discrimination, since they maximize the likelihood of each individual node independently of the others. Neural network classifiers, on the other hand, have shown good discrimination power, typically require fewer assumptions, and can easily be integrated in non-adaptive architectures. This is the point behind changing the pure HMM approach to the hybrid HMM/ANN model, by using an ANN to augment the ASR system [3]. The HMM is used as the main structure of the system, to cope with the temporal alignment properties of the Viterbi algorithm, while the ANN is used in a specific subsystem of the recognizer to address static classification tasks. This has shown performance improvements over pure HMMs: Fritsch & Finke [4] describe a tree-structured hierarchical HMM/ANN system which outperformed an HMM on Switchboard.

In an HMM/ANN model, a neural network of multi-layered perceptrons is given an input vector of acoustic observation values, o_t, and computes a vector of output values which approximate a-posteriori state probabilities. Commonly, nine frames are given as the input of the network: four consecutive frames before, four frames after, and one frame at time t, in order to provide the ANN with more contextual data. The network then has one output for each phone, with the sum of all output units restricted to one. This makes it possible to calculate the a-posteriori probability of a state q_j conditioned on the acoustic input: p(q_j | o_t). Generally, an ASR system has a front end in which the natural speech wave is digitized and parameterized for the recognizer. The recognizer has a neural net which is trained on these digitized and parameterized data. After training, the neural net produces estimates of the observation probabilities for the HMM states. The HMM uses these probabilities and the language model to compute the probability of a sequence of symbols given the observation sequence. Finally, the recognizer uses decoders to generate the recognized symbols as output.

September 4–8, Lisbon, Portugal



3. Amharic Speech Processing

Ethiopia is, with about 70 million inhabitants, the third most populous African country and harbours some 80 different languages. Three of these are dominant: Oromo, a Cushitic language, is spoken in the South and Central parts of the country and written using the Latin alphabet; Tigrinya is spoken in the North and in neighbouring Eritrea; and Amharic is spoken in most parts of the country, but predominantly in the Eastern, Western, and Central regions. Amharic and Tigrinya are Semitic languages and thus distantly related to Arabic and Hebrew.

3.1. The Amharic language

Following the Constitution of 1994, Ethiopia is divided into nine fairly independent regions, each with its own nationality language. However, Amharic is the language for country-wide communication and was also for a long period the principal language for literature and the medium of instruction in primary and secondary schools of the country (while higher education is carried out in English). Amharic speakers are mainly Orthodox Christians, with Amharic and Tigrinya drawing common roots to the ecclesiastic Ge'ez still used by the Coptic church — both languages are written horizontally and left-to-right using the Ge'ez script. Written Ge'ez can be traced back to at least the 4th century A.D. The first versions of the language included consonants only, while the characters in later versions represent consonant-vowel (CV) phoneme pairs.

Amharic words use consonantal roots, with vowel variation expressing differences in interpretation. In modern written Amharic, each syllable pattern comes in seven different forms (called orders), reflecting the seven vowel sounds. The first order is the basic form; the other orders are derived from it by more or less regular modifications indicating the different vowels. There are 33 basic forms, giving 7 * 33 syllable patterns (syllographs), or fidEls. Two of the base forms represent vowels in isolation, but the rest are for consonants (or semi-vowels classed as consonants) and thus correspond to CV pairs, with the first order being the base symbol with no explicit vowel indicator (though a vowel is pronounced: C+/9/). The writing system also includes four (incomplete, five-character) orders of labialised velars and 24 additional labialised consonants. In total, there are 275 fidEls. See, e.g., [5] for an introduction to the Ethiopian writing system.

The Amharic writing system uses multitudes of ways to denote compound words, and there is no agreed-upon spelling standard for compounds. As a result of this — and of the size of the country, leading to vast dialectal dispersion — lexical variation and homophony are very common. In addition, not all the letters of the Amharic script are strictly necessary for the pronunciation patterns of the spoken language; some were simply inherited from Ge'ez without having any semantic or phonetic distinction in modern Amharic. There are many cases where numerous symbols are used to denote a single phoneme, as well as words that have extremely different orthographic forms and slightly distinct phonetics, but the same meaning. Most labialised consonants are, for example, basically redundant, and there are actually only 39 context-independent phonemes (monophones): of the 275 symbols of the script, only about 233 remain if the redundant ones are removed.

In contrast to the character redundancy, there is no mechanism in the Amharic writing system to mark gemination of consonants. The words /w5n5/ (swimming) and /w5nn5/ (main, core) are written identically, but get two completely different meanings by geminating the consonant /n/. This requires different reference models in the database for the multiple forms of the sound, depending on the gemination. (Another problem is an ambiguity with the 6th order characters: whether they are vowelled or not. However, this is not relevant to this work.)

3.2. Previous work

This study aims at investigating and testing the possibility of developing speaker independent continuous Amharic speech recognition systems using a hybrid of HMM and ANN. Speech and language technology for the languages of Ethiopia is still very much uncharted territory; however, on the language processing side some initial work has been carried out, mainly on Amharic word formation and information access. See [6] or [7] for short overviews of the efforts made so far to develop language processing tools for Amharic.

Research conducted on speech technology for Ethiopian languages has been even more limited. Laine [8] made a valuable effort to develop an Amharic text-to-speech synthesis system, and Tesfay [9] did similar work for Tigrinya.¹ Solomon [10] built speaker dependent and speaker independent HMM-based isolated consonant-vowel syllable recognition systems for Amharic. He proposed that CV-syllables would be the best candidates for the basic recognition units for Amharic.

Solomon's work was extended by Kinfe [11], who used the HTK Toolkit to build HMM word recognizers at three different sub-word levels: phoneme, tied-state triphone, and CV-syllable. Kinfe collected a 170 word vocabulary from 20 speakers. He considered a subset of the Amharic syllables, concentrating on the combination of 20 phonemes with the seven vowels, or in total 140 CV-units. Kinfe's training and test sets both consisted of 50 discrete words. Contrary to Solomon's predictions, the performance of the syllable-level recognition was very bad (for unclear reasons), and Kinfe abandoned it in favour of the phoneme- and triphone-based recognizers. For the latter two he reports an isolated word recognition accuracy of 83.1% resp. 78.0% on speaker dependent models, while the speaker independent models gave 75.5% for phoneme-based models and 77.9% isolated word accuracy for tied-state triphone models.

Molalgne [12] tried to compare HMM-based small vocabulary speaker-specific continuous speech recognizers built using three different toolkits: CSLU, HTK, and the MSSTATE Toolkit from Mississippi State, but failed in setting up CSLU, so that only two toolkits were actually tested. He collected a corpus of 50 sentences with ten words (the digits) from a single speaker. While HTK was clearly faster than MSSTATE, the speaker dependent recognition performance of the two systems was comparable, with 82.5% resp. 79.0% word accuracy and 72.5% resp. 67.5% sentence accuracy for HTK resp. MSSTATE.

Martha [13] worked on a small vocabulary isolated word recognizer for a command and control interface to Microsoft Word, while Zegaye [14] continued the work on speaker independent continuous Amharic ASR. He used a pure HMM-based approach and reached 76.2% word accuracy and 26.1% sentence level accuracy. However, there is still a lot of work to be done towards achieving a full-fledged automatic Amharic speech recognition system. The intention of the present research was to use an HMM/ANN hybrid model approach as an alternative for better performance. For this we utilized an implementation of such a model in the CSLU Toolkit.

¹ In the text we follow the practice of referring to Ethiopians by their given names. However, the reference list follows European standard and also gives surnames (i.e., the father's given name for an Ethiopian).


4. An Amharic SR system

The aim of this research is to design a prototype speech recognizer for the Amharic language. The recognizer uses phonemes as base units, is designed to recognize continuous speech, and is speaker independent. In contrast to the pure HMM-based work done by Zegaye [14], the system implements the HMM/ANN hybrid model approach. The development process was performed using the CSLU Toolkit installed on the Microsoft Windows 2000 platform. Various preprocessing programs and script editors were used to handle vocabulary files.

4.1. The CSLU Toolkit

The CSLU Toolkit [1] was designed not only for speech recognition, but also for research and educational purposes in the area of speech and human-computer interaction. It is developed and maintained by the Center for Spoken Language Understanding, a research centre at the Oregon Graduate Institute of Science and Technology, Portland, and the Center for Spoken Language Research at the University of Colorado. The toolkit, which is available free of charge for educational, research, personal, and evaluation purposes under a license agreement, supports core technologies for speech recognition and speech synthesis, plus a graphical rapid application development environment for building spoken dialogue systems.

The toolkit supports the development of HMM- or HMM/ANN hybrid-based speech recognition systems. For this purpose it has many modules or tools interacting with each other in an environment called CSLU-HMM. The toolkit needs a consistent organization and naming of directories and files which has to be strictly followed. This is tedious work, but also clearly doable (still, this might have been the reason why Molalgne decided that it was not possible to use the CSLU Toolkit [12]).

4.2. Speech data

Apart from the specifics of the language itself, the main problem with doing speech recognition for an under-resourced language like Amharic is the lack of previously available data: no standard speech corpus has been developed for Amharic. However, we were able to use a corpus of 50 speakers recorded at 16 kHz sampling rate by Solomon [10]. 100 different sentences of read speech were recorded for each speaker.

The corpus was prepared and processed using SpeechView, a part of the CSLU Toolkit providing a graphical interface for preparing speech data. The tool is used to record, display, save, and edit speech signals in their wave format. It also provides spectrograms and other speech wave related data such as pitch and energy contours, neural net outputs, and phonetic labels. With the help of the SpeechView tool, one can easily collect and prepare speech data for training a recognizer. The process of annotating the speech waveform, which is the most tedious and difficult part of developing a speech recognition system, can be done at different transcription levels.

Ten spoken sentences each from ten female speakers were annotated at the phoneme level for the training corpus, and time-aligned word level transcriptions were generated automatically. Two more speakers were annotated for evaluation purposes. Long silences at the beginning and end of the wave files were trimmed off, and the boundaries of word-level transcriptions were adjusted accordingly.

A vocabulary file was created based on the pronunciation of each word in the data set and the parts of the phones. This gave a vocabulary of 778 words represented by 34 phones, which in turn were split into 57 phone parts: four phones were defined to consist of three parts each, 15 phones of two parts, and 15 of one part only. (The phone symbols, rendered in Ethiopic script in the original, are not reproduced here; each phone group was ordered internally according to frequency.)

4.3. Experiments

Thereafter a recognizer was created, the frame vectors were generated automatically in the toolkit, and the recognizer was trained on the phone part files. The ANN of the recognizer contained an output layer with the phone parts, while the input layer was a 180 node grid representing 20 features from each of nine time frames (t ± 4 * 10 ms).

The recognizer was evaluated on two sentences each from ten speakers who were all found in the training data (in total 20 sentences and 236 words). The results are shown in Table 1.

Itr  Subst  Insert  Delete  Word Acc  Snt Corr
15   13.62   4.89    5.83    75.66     42.31
16   13.62   5.83    5.83    74.72     42.31
17   13.62   4.89    6.83    74.67     41.72
18   14.61   4.89    5.83    74.67     42.31
19   15.56   3.89    4.89    75.66     41.72
20   11.67   5.79    4.89    77.65     42.90
21   11.67   5.83    4.89    77.61     42.90
22   14.61   5.83    5.83    73.73     41.13
23   13.62   4.89    4.89    76.61     42.90
24   13.62   2.93    5.79    77.66     42.90
25   14.61   2.93    4.89    77.57     42.31
26   14.61   4.89    4.89    75.62     42.31
27   15.56   3.89    4.89    75.66     42.31
28   12.66   3.89    4.89    78.56     44.07
29   12.66   5.83    4.89    76.62     42.31
30   12.66   4.89    4.89    77.56     42.90

Table 1: Recognition accuracy on known speakers.
Best result: 78.56% word and 44.07% sentence level accuracy.

Itr  Subst  Insert  Delete  Word Acc  Snt Corr
15   16.34   5.87    7.00    70.79     35.27
16   16.34   7.00    7.00    69.65     35.17
17   16.34   5.87    8.20    69.59     33.79
18   17.53   5.87    7.00    69.60     34.27
19   18.68   4.66    5.87    70.80     33.79
20   14.00   6.93    5.87    73.20     36.75
21   14.00   7.00    5.87    73.13     35.35
22   17.53   7.00    7.00    68.46     33.62
23   16.34   5.87    5.87    71.92     37.75
24   16.34   3.52    6.95    73.19     34.75
25   17.53   3.52    5.87    73.08     34.27
26   17.53   5.87    5.87    70.73     34.27
27   18.68   4.66    5.87    70.80     34.27
28   15.19   4.66    5.87    74.28     39.70
29   15.19   7.00    5.87    71.94     35.27
30   15.19   5.87    5.87    73.07     35.64

Table 2: Recognition accuracy on unknown speakers.
Best result: 74.28% word and 39.70% sentence level accuracy.


For each iteration, the columns in Table 1 give the percentage of substitutions, insertions, and deletions, as well as the word accuracy and the percentage of correct sentences. The best results (78.56% word level accuracy and 44.07% sentence correctness) were obtained after 28 iterations.

When the same recognizer was tested on another ten speakers who were not included in the training data, with two sentences each (218 words in total), the recognition rate degraded. As can be seen in Table 2, the best results were again obtained after the 28th iteration. The word accuracy was reduced by 4.28%, while the sentence level recognition rate was reduced by 4.37%, giving a 21.44% word level error rate and a 55.93% sentence level error rate.

Accordingly, the HMM/ANN hybrid recognizer gave a 2.36% decrease in word error rate and an 18.01% decrease in sentence error rate compared to Zegaye's purely HMM-based recognizer [14], which had 23.80% word and 73.94% sentence error rates. The relative error reduction compared to Zegaye's work is thus 9.92% at the word level and 24.36% at the sentence level.

5. Conclusions

The paper reported experiences with using the CSLU Toolkit to build a hybrid HMM/ANN speaker independent continuous speech recognizer for Amharic, the main language of Ethiopia. An annotated corpus was created from previously recorded speech data. Ten sentences each from twelve speakers were marked up at the phoneme level and a vocabulary of 778 words was created.

For speakers found in the training data, the best results obtained were 78.6% word and 44.1% sentence level accuracy. When tested on data from ten previously unseen speakers, the recognizer had a 74.3% word accuracy and 39.7% sentence accuracy; a relative error reduction of 24.4% compared to previous work on Amharic using pure HMM-based methods.

The CSLU Toolkit proved to be a good vehicle for developing hybrid HMM/ANN-based recognizers, and the experiments indicate that a better recognizer can be developed with further optimization efforts. However, the implementation of the toolkit on Windows needs some revisions: there were problems with fully downloading the Toolkit Installer, and after installation the integration with Windows required considerable effort.

6. Acknowledgements

This research was carried out at the Department of Information Science, Addis Ababa University, and could not have come into being without the help of Solomon Berhanu, who provided the corpus. Thanks to Zegaye Seifu and Kinfe Tadesse for constructive comments and to Marek F. and Clemente Fragoso Eduardo for help with fixing CSLU Toolkit implementation problems. The work was funded by the Faculty of Informatics at Addis Ababa University and the ICT support programme of SAREC, the Department for Research Cooperation at Sida, the Swedish International Development Cooperation Agency.

7. References

[1] J.-P. Hosom, R. Cole, M. Fanty, J. Schalkwyk, Y. Yan, and W. Wei, "Training neural networks for speech recognition," Webpage, Feb. 1999. [Online]. Available: speech.bme.ogi.edu/tutordemos/nnet training/tutorial.html

[2] H. Bourlard and N. Morgan, "Hybrid HMM/ANN systems for speech recognition: Overview and new research directions," in Adaptive Processing of Sequences and Data Structures, C. Giles and M. Gori, Eds. Springer-Verlag, 1997, pp. 389–417.

[3] F. Beaufays, H. Bourlard, H. Franco, and N. Morgan, "Neural networks in automatic speech recognition," in The Handbook of Brain Theory and Neural Networks, 2nd ed., M. Arbib, Ed. MIT Press, 2002, pp. 1076–1080.

[4] J. Fritsch and M. Finke, "ACID/HNN: Clustering hierarchies of neural networks for context-dependent connectionist acoustic modeling," in Proc. International Conference on Acoustics, Speech and Signal Processing. Seattle, Washington: IEEE, Apr. 1998, pp. 505–508.

[5] T. Bloor, "The Ethiopic writing system: a profile," Journal of the Simplified Spelling Society, vol. 19, pp. 30–36, 1995.

[6] Atelach Alemu, L. Asker, and Mesfin Getachew, "Natural language processing for Amharic: Overview and suggestions for a way forward," in Proc. 10th Conference 'Traitement Automatique des Langues Naturelles', vol. 2, Batz-sur-Mer, France, June 2003, pp. 173–182.

[7] Samuel Eyassu and B. Gambäck, "Classifying Amharic news text using Self-Organizing Maps," in Proc. 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan, June 2005, Workshop on Computational Approaches to Semitic Languages.

[8] Laine Berhane, "Text-to-speech synthesis of the Amharic language," MSc Thesis, Faculty of Technology, Addis Ababa University, Ethiopia, 1998.

[9] Tesfay Yihdego, "Diphone based text-to-speech synthesis system for Tigrigna," MSc Thesis, Faculty of Informatics, Addis Ababa University, Ethiopia, 2004.

[10] Solomon Berhanu, "Isolated Amharic consonant-vowel syllable recognition: An experiment using the Hidden Markov Model," MSc Thesis, School of Information Studies for Africa, Addis Ababa University, Ethiopia, 2001.

[11] Kinfe Tadesse, "Sub-word based Amharic speech recognizer: An experiment using Hidden Markov Model (HMM)," MSc Thesis, School of Information Studies for Africa, Addis Ababa University, Ethiopia, June 2002.

[12] Molalgne Girmaw, "An automatic speech recognition system for Amharic," MSc Thesis, Dept. of Signals, Sensors and Systems, Royal Institute of Technology, Stockholm, Sweden, Apr. 2004.

[13] Martha Yifiru, "Automatic Amharic speech recognition system to command and control computers," MSc Thesis, School of Information Studies for Africa, Addis Ababa University, Ethiopia, 2003.

[14] Zegaye Seifu, "HMM based large vocabulary, speaker independent, continuous Amharic speech recognizer," MSc Thesis, School of Information Studies for Africa, Addis Ababa University, Ethiopia, 2003.


