Speech Segmentation

The document reviews four research papers on automatic speech segmentation. Paper 1 focuses on phoneme-level segmentation of Amharic speech using Hidden Markov Models. Paper 2 presents a method for searching Amharic speech using text queries based on sentence-level segmentation. Paper 3 provides an overview of speech segmentation techniques and challenges. Paper 4 discusses blind and aided segmentation algorithms for Amharic phonemes using HMMs, DTW, and ANNs. The review aims to provide insights into advances and challenges in automatic speech segmentation by examining these papers' approaches, methods, applications, findings and limitations.

Uploaded by

Abebe Tora


WOLAITA SODO UNIVERSITY SCHOOL OF INFORMATICS

DEPARTMENT OF INFORMATION TECHNOLOGY

IT MSc Regular

Course: - IMS
NLP Article review

by:-

Abebe Tora Pgr/82835/15

Submitted To: - Dr. Siraj S.

Sub. Date: - Jan 3/2024

Title and authors

1. Automatic Speech Segmentation (April 2017). Alaa Ehab Sakran, Sherif Mahdy Abdou, Salah Eldeen Hamid, Mohsen Rashwan.

2. Amharic Speech Search Using Text Word Query Based on Automatic Sentence-like Segmentation (8 November 2022). Getnet Mezgebu Brhanemeskel, Solomon Teferra Abate, Tewodros Alemu Ayall, and Abegaz Mohammed Seid.

3. Automatic Speech Segmentation for Amharic Phonemes Using Hidden Markov Model Toolkit (HTK) (Aug 2016). Eshete Derb Emiru, Walelign Tewabe Sewunetie.

4. Phoneme Level Automatic Speech Segmentation for Amharic Language Using HMM Approach. Dr. Sebsbie Hailmariam.

Introduction:

For more than thirty years, researchers have studied automatic speech segmentation, which divides speech signals into smaller units for use in speech synthesis and speech recognition, among other applications. In this article, I present an analysis of four research papers that examine various strategies and developments in automatic speech segmentation.

The first paper focuses on phoneme-level automatic speech segmentation for the Amharic language using a Hidden Markov Model (HMM) approach. The authors highlight the importance of accurate segmentation in speech processing systems and discuss the use of wavelets, fuzzy methods, artificial neural networks, and HMMs for segmentation.

The second paper presents a method based on automatic sentence-like segmentation that lets users search for specific words in Amharic speech using text-based queries. The authors emphasize the significance of automatic segmentation in speech analysis and its applications in speech recognition and speech synthesis systems.

The third paper provides a comprehensive review of automatic speech segmentation techniques. The authors discuss the general characteristics of speech signals, including voiced and unvoiced speech, and the importance of accurate segmentation for various speech analysis tasks. They explore different segmentation units such as words, phonemes, and syllables, and discuss the challenges associated with context dependency and acoustic variability.

The fourth paper focuses on the segmentation of speech signals using both blind and aided segmentation algorithms. The authors discuss the differences between blind segmentation, which relies solely on statistical signal analysis, and aided segmentation, which incorporates external linguistic knowledge. They highlight the use of techniques such as Hidden Markov Models (HMMs), Dynamic Time Warping (DTW), and Artificial Neural Networks (ANNs) in aided segmentation algorithms.

In this review, I aim to provide insights into the advancements and challenges in automatic speech segmentation techniques. Examining these four articles gives a better understanding of the various approaches, methodologies, and applications in this field.
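
Of the aided-segmentation techniques named in the introduction, Dynamic Time Warping (DTW) is the simplest to show in a few lines: it aligns two sequences that differ in timing. The following is a minimal Python sketch of the classic DTW recurrence, not code taken from any of the reviewed papers. The function name `dtw_distance` is hypothetical, and plain numbers stand in for the acoustic feature vectors a real system would use.

```python
import math


def dtw_distance(a, b):
    """Return the DTW alignment cost between two numeric sequences.

    Assumption for illustration: each element is a single number and the
    local distance is the absolute difference. A real segmenter would
    compare feature vectors (for example MFCC frames) instead.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A sequence and a time-stretched copy of it align at zero cost, which is the property that makes DTW useful for matching a reference pronunciation against speech spoken at a different rate.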

The authors, titles, methods, findings, and limitations of the articles are shown in the table below.

| No. | Authors | Title | Methods | Findings | Limitations |
| --- | --- | --- | --- | --- | --- |
| 1 | Alaa Ehab Sakran et al. (April 2017) | Automatic Speech Segmentation | Wavelet, fuzzy methods, Artificial Neural Networks, and Hidden Markov Models. | Speech synthesis, training for speech recognizers, and prosodic database creation. The authors highlight the advantages of automatic segmentation over manual segmentation, such as consistency and time efficiency. | Lack of up-to-date information; incomplete information; does not explicitly mention the specific evaluation metrics; does not explicitly mention future work. |
| 2 | Getnet Mezgebu Brhanemeskel et al. (2022) | Amharic Speech Search Using Text Word Query Based on Automatic Sentence-like Segmentation | Manual segmentation used as a baseline for the word error rate (WER) of the automatic segmentation approach; Artificial Neural Network. | Sentence-like automatic segmentation resulted in a WER close to the WER achieved on manually segmented test speech. Two speech corpora were used, from the broadcast news domain and the spiritual domain. | A limited training dataset; lack of detailed information on the dataset and validation process. |
| 3 | Eshete Derb Emiru, Walelign Tewabe Sewunetie (Aug 2016) | Automatic Speech Segmentation for Amharic Phonemes Using Hidden Markov Model Toolkit (HTK) | Unsupervised method for automatic speech segmentation using the Hidden Markov Model (HMM) Toolkit (HTK). Techniques: context-independent, context-dependent with a single Gaussian mixture, and context-dependent with multiple Gaussian mixtures. | In a context-dependent setting with two Gaussian mixtures, the phoneme-based technique produced the best results in terms of the lowest percentage of time boundary deviations. The suggested approach effectively divided Amharic speech into phonemes for use in several speech research fields. | Does not explicitly discuss the limitations of the proposed method; does not address the performance of the method on different speakers or in noisy environments; the speech corpus was recorded by a single female speaker. |
| 4 | Dr. Sebsbie Hailmariam | Phoneme Level Automatic Speech Segmentation for Amharic Language Using HMM Approach | Hidden Markov Model (HMM) approach for modeling the Amharic phonemes. Techniques: context-independent, context-dependent with a single Gaussian mixture, and context-dependent with multiple Gaussian mixtures. | The proposed method effectively segments continuous speech into phonemes in the Amharic language. | The performance of the system in capturing variations in speech due to different speakers, accents, and other factors is not examined; the study focuses on the Amharic language only. |
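
Article 2 reports its results as word error rate (WER), the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The sketch below shows that standard calculation in Python; it is an illustration under stated assumptions, not the evaluation code used in the article. The function name `word_error_rate` and the whitespace tokenization are hypothetical choices.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Compute WER = (substitutions + deletions + insertions) / reference length.

    Assumption for illustration: words are separated by whitespace and
    compared exactly, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words. Comparing the WER obtained on automatically segmented speech with the WER on manually segmented speech is the kind of baseline comparison the article describes.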
Comparison
Comparison of the articles by their strengths, contributions to the field, and evaluation metrics.

| No. | Strengths | Contributions | Evaluation metrics |
| --- | --- | --- | --- |
| 1 | Mentions various approaches and methods used in speech segmentation. | Covers the basics of speech segmentation, discusses state-of-the-art solutions, explores different segmentation units, examines evaluation methods, and highlights the challenges and trends in automatic speech segmentation. | Does not explicitly mention the specific evaluation metrics. |
| 2 | Focuses on the issue of speech search using text word queries for the Amharic language, which can have practical applications. Introduces the concept of automatic sentence-like segmentation, which may enhance the accuracy of the speech search system. Includes multiple authors, indicating a collaborative effort and potentially diverse perspectives. | The proposed approach aims to enable efficient and accurate searching of Amharic speech by automatically segmenting the speech into meaningful units and aligning them with text queries. | Word Error Rate (WER) as a measure of performance. |
| 3 | Novelty: introduces an unsupervised method for automatic speech segmentation. Methodology: describes the use of the Hidden Markov Model (HMM) toolkit (HTK) for modeling Amharic phonemes. Data preparation: collection and preparation of both the text and speech corpora used in the experiments. Evaluation: evaluates the performance of the segmentation system by comparing it to manual segmentation results. | Contributes to the field of speech segmentation by proposing an automated approach specifically designed for the Amharic language and demonstrating its effectiveness through experimental evaluation. | Percentage of boundary deviations at tolerance values of 5 ms, 10 ms, 15 ms, and 20 ms, compared to manual segmentation results. This measures the accuracy of the system. |
| 4 | Focuses on automatic speech segmentation for the Amharic language, which is a valuable contribution to the field. Utilizes the Hidden Markov Model (HMM) approach, which is a commonly used and effective method for speech segmentation. The author's expertise in the field is indicated by their title as "Dr." | Proposes an HMM (Hidden Markov Model) approach for automatically segmenting Amharic speech at the phoneme level. The proposed method aims to improve speech processing systems, such as speech recognition and synthesis. | Percentage of boundary deviations; recognition accuracy; boundary alignment, which measures the consistency and precision of the system; time efficiency. |
Future works
1. The authors mention the need to investigate and develop more advanced algorithms that can handle the challenges posed by noisy and non-standard speech data. They also highlight the importance of incorporating linguistic knowledge and context into segmentation algorithms, and suggest exploring novel features and techniques for improved segmentation accuracy and efficiency.
2. The authors propose a method that combines automatic speech recognition with automatic sentence-like segmentation and provide experimental results to support their findings.
3. The study has potential limitations related to the size of the text corpus and the speaker characteristics. Further research can address these limitations and explore the generalization and robustness of the proposed method in diverse settings.
4. The article does not explicitly mention future work.
Based on this review, I will try to develop automatic speech segmentation for the Wolaita language.
