Sec 3 - Speech Recognition - Intro

The document explains sound as vibrations that carry information and distinguishes between voice and speech. It details the process of Automatic Speech Recognition (ASR), which converts acoustic signals into text, and outlines the characteristics of audio files, including formats like MP3 and WAV, as well as concepts like audio sampling, sample rate, and bit depth. These elements are crucial for understanding audio quality and resolution in digital recordings.

Uploaded by

amrt6958

Speech recognition

Sec 3
What is Sound?

Vibrations that travel through the air and are heard by a person's ear.

• Sound is a variation in air pressure.

• Sound is an information bearer.
What is voice?

• Voice → the sound produced by humans.

• Speech → what comes out of the mouth after the sound is modified by the throat muscles, palate, tongue, lips, teeth, etc.
Automatic Speech Recognition (ASR)

The process of converting an acoustic signal, captured by a microphone or a telephone, into a set of words.
ASR Processing Steps

1. Audio signal acquisition
2. Preprocessing
3. Feature extraction
4. Classification
5. Recognition (text)
Characteristics of Audio Files

MP3 (not hi-res):

• Popular, lossy compressed format.
• Ensures small file sizes, but is far from the best sound quality.
• Convenient for storing music on phones and iPods.

WAV (hi-res):
• Typically holds uncompressed PCM audio, the encoding used on audio CDs.
• Great sound quality, but it is uncompressed, meaning huge file sizes (especially for hi-res files).
• Poor metadata support (that is, album artwork, artist, and song title information).
Sound as a Waveform
Audio Sampling → conversion from analog sound waves to digital audio.

[Figure: sampled waveform, amplitude plotted against time]

The audio signal in a file represents a series of samples that capture the
amplitude of the sound over time.
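The idea of a file holding a series of amplitude samples can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `sample_sine`, the 1 Hz tone, and the 8 samples-per-second rate are all assumptions chosen to keep the output short.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Return amplitude samples of a sine tone taken at evenly spaced times."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

# Hypothetical example: a 1 Hz tone sampled 8 times per second for one second.
samples = sample_sine(freq_hz=1.0, sample_rate_hz=8, duration_s=1.0)
for n, amplitude in enumerate(samples):
    print(f"t = {n / 8:.3f} s  amplitude = {amplitude:+.3f}")
```

Each printed row is one sample: a point in time paired with the amplitude measured at that time.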

• The sample rate:

• The number of audio samples recorded each second, measured in hertz (Hz) or samples per second.
• It is crucial, particularly for reproducing high frequencies: only frequencies below half the sample rate can be captured.
• The more samples taken per second, the more closely the final digital file resembles the original.
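Why the sample rate matters for high frequencies can be shown with a small experiment. The sketch below relies on the Nyquist limit (half the sample rate), which is standard background rather than something stated in the slides; the function names and the 8 Hz rate are assumptions for illustration.

```python
import math

def sample_tone(freq_hz, sample_rate_hz, n_samples):
    """Sample a sine tone of the given frequency at the given rate."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def max_representable_frequency(sample_rate_hz):
    """Nyquist limit: the highest frequency a sample rate can capture."""
    return sample_rate_hz / 2

RATE = 8  # samples per second (hypothetical, deliberately low)
high = sample_tone(7, RATE, 16)  # 7 Hz is above the 4 Hz limit
low = sample_tone(1, RATE, 16)   # 1 Hz is safely below it

# At this rate the 7 Hz tone yields the same samples as an inverted 1 Hz tone,
# so the high frequency cannot be told apart from a low one (aliasing).
is_alias = all(abs(h + l) < 1e-9 for h, l in zip(high, low))
print("Nyquist limit:", max_representable_frequency(RATE), "Hz")
print("7 Hz aliases to 1 Hz:", is_alias)
```

The same reasoning suggests why a rate such as 44,100 Hz is enough for tones up to about 22 kHz.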

• Bit depth:
• Every sample taken while making an audio recording is stored as a fixed number of bits.

• It sets the number of possible amplitude values that can be recorded for each audio sample.

• The more bits used to record each sample, the better the sound reproduction.

• The most common audio bit depths are 16-bit, 24-bit, and 32-bit.
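The link between bit depth and the number of possible amplitude values is a power of two. A minimal sketch, assuming plain integer samples (in practice 32-bit audio is often stored as floating point, which this ignores); `amplitude_levels` is a hypothetical helper name:

```python
def amplitude_levels(bit_depth):
    """Number of distinct values an integer sample of this bit depth can hold."""
    return 2 ** bit_depth

for bits in (16, 24, 32):
    print(f"{bits}-bit: {amplitude_levels(bits):,} possible amplitude values")
```

Each extra bit doubles the number of available values, which is why moving from 16-bit to 24-bit is a far larger step than the numbers alone suggest.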

Increasing the audio bit depth, along with increasing the audio sample
rate, creates more total points to reconstruct the analog wave.
Sample rate vs. bit depth

• Sample rate: determines the number of snapshots taken to recreate the original sound wave.

• Bit depth: determines how many amplitude values each of those snapshots can take.

Together, bit depth and sample rate determine audio resolution.
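One practical consequence of sample rate and bit depth working together is the size of uncompressed audio. The sketch below multiplies the two, plus a channel count and a duration; the helper name `pcm_bytes` and the two example settings (44.1 kHz/16-bit stereo and 96 kHz/24-bit stereo) are assumptions for illustration, not figures from the slides.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, duration_s):
    """Size in bytes of uncompressed audio data, ignoring any file header."""
    total_bits = sample_rate_hz * bit_depth * channels * duration_s
    return total_bits // 8

# One minute of stereo audio at two hypothetical settings.
cd_like = pcm_bytes(44_100, 16, 2, 60)
hi_res = pcm_bytes(96_000, 24, 2, 60)

print(f"44.1 kHz / 16-bit: {cd_like:,} bytes")
print(f"96 kHz / 24-bit:   {hi_res:,} bytes")
```

Raising either setting increases the data stored per second, which helps explain the large file sizes of uncompressed formats such as WAV.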
For example, a sound wave like this can be sampled at each time
sample point:

The sound recorded at each sample point is converted to its nearest numeric equivalent:
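Converting each sample to its nearest numeric equivalent can be sketched as rounding to a small set of integer codes. This is an illustrative simplification, not the slides' own method: it assumes amplitudes in the range -1 to 1 and a symmetric set of codes, and the name `quantize` and the 3-bit depth are chosen only to keep the numbers small.

```python
import math

def quantize(amplitude, bit_depth):
    """Round an amplitude in [-1, 1] to the nearest signed integer code."""
    max_code = 2 ** (bit_depth - 1) - 1          # e.g. 3 for a 3-bit sample
    clipped = max(-1.0, min(1.0, amplitude))     # keep out-of-range input legal
    return round(clipped * max_code)

# Sample one cycle of a 1 Hz sine wave at 8 sample points, then quantize.
analog = [math.sin(2 * math.pi * n / 8) for n in range(8)]
codes = [quantize(a, bit_depth=3) for a in analog]
print(codes)  # → [0, 2, 3, 2, 0, -2, -3, -2]
```

With only 3 bits the stored values are a coarse staircase version of the wave; a higher bit depth leaves a smaller gap between each true amplitude and its stored code.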
