
Audio-Video I

P Kadebu
 Introduction
 Digital audio
› Psychoacoustics
› Digital representation of sound
 Digital images
› JPEG
Introduction
 Multimedia often uses sound, images, and video
from natural sources
 First, the source has to be converted into a signal
› microphone, camera, video camera
 Next, analog signals are converted into digital form
› sampling, A/D conversion
 Often, the amount of information is also reduced
› compression
Introduction (cont.)
 Compression method can be lossless or lossy
 Compressed information is easier to store and
transfer
 Compressed information has to be decompressed
before use
 Digital-to-analog conversion also has to be done
 After this, the signal can be played (or shown) to
the user
Digital audio
 Technology that can be used to record, store, generate, manipulate,
and reproduce sound using audio signals encoded in digital form.
 A microphone converts sound to an analog electrical signal, then
an analog-to-digital converter (ADC), typically using pulse-code
modulation, converts the analog signal into a digital signal.
 A digital-to-analog converter performs the reverse process,
converting a digital signal back into an analog signal, which analog
circuits amplify and send to a loudspeaker.
 Digital audio systems may include compression, storage, processing
and transmission components. Conversion to a digital format allows
convenient manipulation, storage, transmission and retrieval of an
audio signal.
Digital audio application areas
 Computer generated sound
 Sound storage and processing
 Digital communications
 Answering service
 Speech synthesis
 Speech recognition
 Computerized call center
 Presentation of data as sound (Sonification)
Computer generated sound

 Sound that has been created using computing technology.
 Computer-generated music is music composed by, or
with the extensive aid of, a computer.
 Since the invention of the MIDI system in the early
1980s, for example, some people have worked on
programs which map MIDI notes to an algorithm
and then can either output sounds or music through
the computer's sound card or write an audio file for
other programs to play.
Sound storage and processing
 Sound can be stored as digital files and manipulated with digital
signal processing.
 A digital audio system starts with an ADC that converts an analog
signal to a digital signal.
 A digital audio signal may be stored or transmitted. Digital audio
can be stored on a CD, a digital audio player, a hard drive, a USB
flash drive, or any other digital data storage device.
 The digital signal may then be altered through digital signal
processing, where it may be filtered or have effects applied.
 Audio data compression techniques, such as MP3, Advanced
Audio Coding, Ogg Vorbis, or FLAC, are commonly employed to
reduce the file size.
 Digital audio can be streamed to other devices.
 For playback, digital audio must be converted back to an analog
signal with a DAC.
Digital communications

 Data transmission, digital transmission,
or digital communications is the physical
transfer of data (a digital bit stream or a digitized
analog signal) over a point-to-point or point-to-
multipoint communication channel.
Answering service

 The purpose of an answering service is to offer assistance or
record messages. It can be used to:
 Offer help services, e.g. technical assistance for configuring a
machine
 Respond on behalf of a phone user to record messages in
their absence or when they are unable to take a call.
 Answer services usually fit into one of the following
categories, although answering-service providers often offer
services from several categories:
› Automated answering service
› Live answering service
› Internet answering service
› Call center
Speech synthesis

 The artificial production of human speech.
 A computer system used for this purpose is called
a speech synthesizer, and can be implemented
in software or hardware products.
 A text-to-speech (TTS) system converts normal language
text into speech; other systems render symbolic linguistic
representations like phonetic transcriptions into speech.
 May be used to assist people with speech or hearing
impairments in communication
Speech recognition
  It is the ability of a machine or program to
identify words and phrases in spoken language
and convert them to a machine-readable format.
 Speech recognition applications include voice
user interfaces such as voice dialling, call
routing, domotic (Home automation) appliance
control, search, simple data entry, speech-to-text
processing (e.g., word processors or emails).
Presentation of data as sound (Sonification)

 The use of non-speech audio to convey information or
perceptualize data.
 Auditory perception has advantages in temporal,
amplitude, and frequency resolution that open
possibilities as an alternative or complement
to visualization techniques.
Psychoacoustics
 Frequency band
 Dynamic range
 Frequency properties
 Time effect
 Masking
 Phase
 Binaural hearing and localization
Frequency band
 Frequency bands are groupings of a specific
range of frequencies in the frequency spectrum.
 Humans can hear frequencies of 20 Hz - 20 kHz
(the hearing range).
 Older people have a much narrower range
 The threshold of hearing is approximately the
quietest sound a young human with undamaged
hearing can detect at 1,000 Hz.
 The ear's sensitivity is best at frequencies
between 1 kHz and 5 kHz.
Dynamic range

 At pain level, the amplitude can be 1,000,000 times the sound at
the threshold of audibility
 The measurement unit is the decibel (dB)
 The threshold is 0 dB and the pain level
100 - 120 dB
 On the decibel scale, the smallest audible sound (near total
silence) is 0 dB. A sound 10 times more powerful is 10 dB. A
sound 100 times more powerful than near total silence is 20
dB. A sound 1,000 times more powerful than near total
silence is 30 dB.
 Hearing is a sense which cannot be directly measured
› the pitch of a sound changes according to amplitude
› the loudness depends on frequency
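The decibel arithmetic above can be sketched in Python (a minimal illustration; power ratios use 10·log10, amplitude ratios 20·log10):

```python
import math

def power_db(ratio):
    """Decibels for a power (intensity) ratio relative to the threshold."""
    return 10 * math.log10(ratio)

def amplitude_db(ratio):
    """Decibels for an amplitude (pressure) ratio relative to the threshold."""
    return 20 * math.log10(ratio)

print(power_db(10))             # 10.0 -- 10 times more powerful
print(power_db(1000))           # 30.0 -- 1,000 times more powerful
print(amplitude_db(1_000_000))  # 120.0 -- pain-level amplitude, per the slide
```

Note the two formulas agree: an amplitude ratio of 1,000,000 is a power ratio of 10^12, and both give 120 dB.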
Frequency properties

 Natural sounds are sums of many frequencies
 The frequencies can be calculated with Fourier
analysis
 Natural sounds typically contain harmonic
multiples of the base frequency
 The ear is sensitive to valleys and hills of the spectrum
 Distinctive spots are called formants (a range of
frequencies of a complex sound in which there is
an absolute or relative maximum in the sound
spectrum)
Clarinet sound
Time effect
 Sounds of instruments have three parts:
› Attack, steady-state, and decay
 In simple synthesis, sound is generated with
frequency components and their loudness is
changed at different stages
 In reality, the frequency components of the
spectrum change constantly
 Hearing is especially sensitive in attack phase
Masking
 Auditory masking occurs when the perception
of one sound is affected by the presence of
another sound
 Sounds can mask each other partly or fully
 They can also change each other
 A sound at a certain frequency raises the threshold
of audibility over a wider frequency area
 Sounds have to be a critical distance apart to be
heard separately
 The critical distance grows as the frequency gets higher
Masking (cont.)
Phase
 It is the fraction of the wave cycle that has
elapsed relative to the origin.
 Same frequency sounds can have different phases
 A phase difference of 180 degrees cancels the sound
 There is evidence that humans can hear phase
differences
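A quick numeric sketch of the cancellation: two equal sine waves, one shifted by 180 degrees, summed sample by sample (the window length and tone are arbitrary test values):

```python
import math

SAMPLES = 100
CYCLES = 3  # assumed test tone: 3 cycles over the sampled window

mixed = []
for n in range(SAMPLES):
    t = n / SAMPLES
    a = math.sin(2 * math.pi * CYCLES * t)            # original tone
    b = math.sin(2 * math.pi * CYCLES * t + math.pi)  # 180 degrees out of phase
    mixed.append(a + b)

# The two waves cancel: the mixed signal is silence, up to rounding error.
print(max(abs(x) for x in mixed))
```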
Binaural hearing and
localization
 Binaural hearing refers to being able to integrate information
that the brain receives from the two ears.
 Binaural hearing is known to help us with the ability to listen
in noisy, complex auditory environments, and to localize
sound sources.

 Humans can determine the location of sound
› loudness, phase difference, frequency
 Skull, ear lobes, and hearing organs filter sound
 In addition, reflections have a strong effect
 Sound sources should be placed in the same location as the
visual information
Digital representation of sound

 Coding in time domain
 Transformations
 Linear prediction
 Parametric coding
 Digital transfer of audio
Coding in time domain
 Samples are taken at sample frequency
 The sample frequency has to be at least twice the
maximum signal frequency (the so-called Nyquist
rate)
 Common sample frequencies are 8, 44.1, and 48
kHz
 The value of amplitude at sampling moment is
coded as numeric value
Aliasing
In signal processing and related disciplines, aliasing is an effect that causes different
signals to become indistinguishable (or aliases of one another) when sampled
Pulse Code Modulation (PCM)
PCM is a method used to digitally represent sampled analog signals. It is the standard
form of digital audio in computers, Compact Discs, digital telephony and other digital
audio applications.
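Aliasing can be shown numerically. With an assumed 8 kHz sample rate (Nyquist limit 4 kHz), a 7 kHz tone produces exactly the same samples as a phase-inverted 1 kHz tone:

```python
import math

FS = 8000  # assumed sample rate in Hz; frequencies above FS/2 = 4000 Hz alias

def sample_tone(freq_hz, n_samples=16):
    """Sample a sine of the given frequency at rate FS."""
    return [math.sin(2 * math.pi * freq_hz * n / FS) for n in range(n_samples)]

high = sample_tone(7000)  # above the Nyquist limit
low = sample_tone(1000)   # its alias: FS - 7000 = 1000 Hz

# Sample by sample, the 7 kHz tone equals the 1 kHz tone with inverted
# sign -- after sampling, the two signals are indistinguishable.
print(all(abs(h + l) < 1e-9 for h, l in zip(high, low)))  # True
```

This is why an anti-alias filter must remove frequencies above FS/2 before sampling.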
Coding in time domain (cont.)
 Sampling causes quantization error
 Each bit improves the signal-to-noise ratio
› 20 log10(2) ≈ 6 dB
 Often 16 bits are used
› 16 * 6 dB = 96 dB
 The human dynamic hearing range is more
(about 120 dB)
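The 6 dB-per-bit figure follows from the amplitude doubling with each extra bit; a sketch of the arithmetic above:

```python
import math

def snr_db(bits):
    """Approximate dynamic range of linear PCM: 20 * log10(2 ** bits).

    This is the simple 6 dB/bit rule from the slide; the exact formula
    for a full-scale sine adds about 1.76 dB on top.
    """
    return 20 * math.log10(2 ** bits)

print(round(snr_db(1), 2))   # 6.02  -- about 6 dB per bit
print(round(snr_db(16), 1))  # 96.3  -- 16-bit audio, close to 16 * 6 = 96 dB
```

96 dB still falls short of the roughly 120 dB dynamic range of human hearing, which is why 20- and 24-bit formats exist.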
Coding in time domain (cont.)

 Transformations between analog and digital
signals are done with A/D and D/A converters
 In addition, filtering is required
› anti-alias and reconstruction filters
 In high-quality systems, these can also be an error
source
 This problem can be solved with oversampling
 The computer can also cause crosstalk (over-hearing
between signals)
Transformations

 Transformations can be used to present the
content in a different domain
 The goal is to make the signal transfer more
efficient and robust
Fourier transformation
 Fourier coefficients represent the signal
accurately in frequency domain
 Stationary signals can be represented exactly
with Fourier coefficients
 Discrete Fourier transformation has to be used
with dynamic signals
 Coefficients are usually calculated with the Fast
Fourier Transformation (FFT) algorithm
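A naive DFT illustrates how the coefficients expose the frequency content; the FFT computes the same values in O(N log N). The 8-sample cosine is an assumed toy input:

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform; the FFT gives the same result faster."""
    N = len(signal)
    return [sum(signal[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

# A pure cosine with 2 cycles over 8 samples...
x = [math.cos(2 * math.pi * 2 * n / 8) for n in range(8)]
X = dft(x)

# ...has energy only at bin 2 and its mirror bin 8 - 2 = 6.
peaks = [k for k, coeff in enumerate(X) if abs(coeff) > 1e-9]
print(peaks)  # [2, 6]
```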
Frequency bands

 The masking effect can also be used in coding
 The signal is first divided into frequency bands,
which are then coded separately (subband coding)
 E.g. MiniDisc recorders (Sony), DCC cassettes
(Philips) and MP3
 The method has also been used in speech
coding and recognition
MPEG audio

 MPEG audio uses Subband coding


 Signal is divided into 32 bands (Layer 1)
 The division is done in groups of 384 samples
 An FFT is used to find bands with pure sine
signals and noise
 Only interesting channels are coded
 The bit allocation per channel varies
 Layer 1: over 128 kbps / channel
MPEG audio (cont.)
 Layer 2
› about 128 kbps / channel
› 1152 samples per group
› 3 scaling factors
› 36 frequency bands
 Layer 3 (MP3)
› about 64 kbps / channel
› filter bank
› Huffman coding
› bits used for coding can vary
JPEG
 Objectives
 Architectures
 DCT coding and quantization
 Statistical coding
 Lossless coding
 Efficiency
Objectives

 Compression rate / image quality can be selected


 Works with all kinds of images
 Both software and hardware implementation
 Four different modes
› sequential coding (original order)
› progressive coding (multiphase coding)
› lossless coding (perfect copy)
› hierarchical coding (many resolutions)
JPEG Architectures

 Lossy modes use DCT for 8 x 8 pixel blocks


 Sequential mode outputs the DCT coefficients
block by block
 Progressive mode outputs the DCT coefficients in
groups
 Hierarchical mode encodes several resolutions at
the same time
Sequential JPEG
Progressive JPEG
Hierarchical JPEG
Lossless JPEG
DCT and Quantization

 The DCT coefficients can be represented as a matrix
 The quantization is done according to a
quantization table
 The coefficients are put in Zig-Zag order
 This places the zero coefficients at the end of the
run
 Finally Run-Length coding eliminates the zeros
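The scan and run elimination can be sketched as follows. The toy 8x8 block with three nonzero coefficients is an assumed example, and "EOB" stands in for JPEG's end-of-block symbol:

```python
N = 8  # JPEG processes 8 x 8 blocks

def zigzag_indices(n=N):
    """Visit the block's coordinates diagonal by diagonal, alternating direction."""
    order = []
    for s in range(2 * n - 1):  # s = i + j indexes the diagonals
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else reversed(diag))
    return order

def zigzag(block):
    return [block[i][j] for i, j in zigzag_indices()]

# Toy quantized block: a few nonzero low-frequency coefficients, rest zero.
block = [[0] * N for _ in range(N)]
block[0][0], block[0][1], block[1][0] = 52, 3, -2

scan = zigzag(block)
# The zig-zag order gathers the trailing zeros into one run, which is
# then replaced by a single end-of-block marker.
last = max(i for i, v in enumerate(scan) if v != 0)
coded = scan[:last + 1] + ["EOB"]
print(coded)  # [52, 3, -2, 'EOB']
```

64 coefficients shrink to four symbols because the zeros cluster at the end of the scan.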
DCT coding
DCT Coefficients
Statistical Coding

 Uses either Huffman or arithmetic coding


 Huffman coding requires a separate table
 Arithmetic coding does not require a table, but
needs more computation
 In addition, the compression ratio of arithmetic
coding is 5 - 10 % better
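The Huffman step can be sketched generically (this is a plain Huffman coder built from symbol frequencies, not JPEG's predefined-table format; the input string is an arbitrary example):

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a prefix-free code table: frequent symbols get shorter codes."""
    heap = [[freq, i, {sym: ""}]
            for i, (sym, freq) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    tie = len(heap)  # tiebreaker so the heap never has to compare the dicts
    while len(heap) > 1:
        lo = heapq.heappop(heap)  # the two least frequent subtrees...
        hi = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in lo[2].items()}
        merged.update({s: "1" + c for s, c in hi[2].items()})
        heapq.heappush(heap, [lo[0] + hi[0], tie, merged])  # ...are merged
        tie += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")
print(codes)  # the most frequent symbol, 'a', gets the shortest code
```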
Lossless Encoding

 Lossless encoding utilizes prediction


 Seven different alternatives
› how many and which pixels are used
 Predictive encoding can reach a compression ratio
of 2:1
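A sketch of one predictor, predicting each pixel from its left neighbor (one of the seven alternatives), showing that decoding restores a perfect copy; the pixel row is an assumed example:

```python
def encode_row(pixels):
    """Store the first pixel as-is, then only the difference to the left neighbor."""
    residuals = [pixels[0]]
    for i in range(1, len(pixels)):
        residuals.append(pixels[i] - pixels[i - 1])
    return residuals

def decode_row(residuals):
    """Rebuild the pixels by accumulating the residuals."""
    pixels = [residuals[0]]
    for r in residuals[1:]:
        pixels.append(pixels[-1] + r)
    return pixels

row = [100, 101, 103, 103, 102, 104]
res = encode_row(row)
print(res)                     # [100, 1, 2, 0, -1, 2] -- small values compress well
print(decode_row(res) == row)  # True -- the copy is perfect (lossless)
```

The small residuals have a skewed distribution, which a statistical coder then exploits.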
Efficiency

 0.25 - 0.5 bpp: reasonable - good quality
 0.5 - 0.75 bpp: good - very good quality
 0.75 - 1.5 bpp: very good quality
 1.5 - 2.0 bpp: same as original
