SSP 5 3 Music Coding

The document discusses how sound is processed in an MP3 player. It involves separating the sound into frequency bands using a filter bank. This allows exploiting properties of human perception like masking to reduce file size by quantizing different frequency bands individually and placing quantization noise below masking thresholds so it cannot be heard. The key aspects are the filter bank design and its properties, which split the signal into subbands that can then be quantized and processed individually.

Uploaded by

Gabor Gereb

How is sound processed in an mp3 player?

Sverre Holm
Overview
Part I
• Perceptual coding requires separation in frequency bands
• 2-channel filter bank
– Approach 1: Design individual filters for minimal distortion: aliasing, imaging
– Approach 2: Design filter pairs for distortion cancellation (much better)

Part II
• Quadrature mirror filter bank design
– Filter bank without quantization, SNR = 84 dB
• 32-channel filter bank – violin
– Uniform quantization, SNR = 10 dB
– Uniform adaptive quantization, SNR = 25 dB
– Perceptual quantization, SNR = 39 dB

27.04.2022 2
Principles
• Vocoders
– "Robotic voice"
• Hybrid coders [figure labels: Speech, Music]
– Mobile phone
• Waveform coders (music)
– MPEG-1 (1993)
• Layer 2 = mp2 = DAB
• Layer 3 = mp3
– MPEG-2
• AAC (Advanced Audio Coding, 1997): iTunes, Tidal
– MPEG-4
• HE-AAC (High-Efficiency AAC, 2004): for low bit rates: DAB+
MPEG = Moving Picture Experts Group, a working group under the International Organization for Standardization (ISO)

Cover the center!

Intro to perceptual coding
• Repetition from IN3190 / INF3470

The frequency filters of the ear:
Mapping frequency to a location

[Figure: unwound cochlea]

Bandpass filters are essential

The Bark scale: "...a frequency scale on which equal distances correspond with perceptually equal distances. Above about 500 Hz this scale is more or less equal to a logarithmic frequency axis. Below 500 Hz the Bark scale becomes more and more linear."

Figures: Evangelista, G., Dörfler, M., & Matusiak, E. (2013). Arbitrary phase vocoders by means of warping. Musica/Tecnologia, 7, 91-118.

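The shape of the Bark scale quoted above can be explored with a small sketch. The slide gives no formula, so the Zwicker-Terhardt approximation used here is an assumption, and `hz_to_bark` is a hypothetical helper name.

```python
import math

def hz_to_bark(f_hz):
    """Approximate Bark value for a frequency in Hz.

    Assumption: Zwicker-Terhardt approximation, not taken from the slides.
    """
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

# Print the Bark value at octave-spaced frequencies.
# Below about 500 Hz the value roughly doubles per octave (near-linear scale);
# above it, each octave adds a roughly similar amount (near-logarithmic scale).
for f in (125, 250, 500, 1000, 2000, 4000, 8000):
    print(f"{f:5d} Hz -> {hz_to_bark(f):5.2f} Bark")
```

Other published approximations exist and differ slightly, so the numbers should be read as indicative only.
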
Filterbank Approach. Encoding

[Block diagram: Source → LP filter → A/D → filterbank → scaling and quantization per sub-band → MUX → bit stream. A psychoacoustic analysis block controls the bit allocation.]

Decoding is much simpler

[Block diagram: bit stream → scaling per sub-band → filterbank → Σ → D/A → LP filter]

What is this psychoacoustics that is used in the encoder?

Masking
We do not hear all sounds.
1. Absolute threshold of hearing
2. Masking: one sound is inaudible in the presence of another sound.
   1. Simultaneous masking
      – Noise Masking Tone
      – Tone Masking Noise
      – Noise Masking Noise
   2. Nonsimultaneous masking
      • Pre-masking (2 ms)
      • Post-masking (100 ms)

Noise Masking Tone
• Filtered noise: center 410 Hz, width 111 Hz
• Tone 1: 820 Hz, 5 dB below noise
• Tone 2: 410 Hz, 5 dB below noise
• Noise + Tone 1: not masked
• Noise + Tone 2: masked

You cannot hear a sinusoid that lies in the same critical band as a filtered noise if the sound pressure level is below a certain threshold.

This effect also stretches out beyond the critical band.

Tone Masking Noise
• Filtered noise: center 1 kHz, width 162 Hz, 15 dB below the tone
• Tone 1: 2 kHz
• Tone 2: 1 kHz
• Noise + Tone 1: not masked
• Noise + Tone 2: masked

You cannot hear a filtered noise that lies in the same critical band as a sinusoid if the sound pressure level is below a certain threshold.

This effect also stretches out beyond the critical band.

Exploit Masking
• If a sound is masked we cannot hear it.
• Frequency analysis of the signal to find the masking threshold.
• Put the quantization noise under the masking threshold and we won't hear the effect of quantization.

Processing in an mp3 player

• The frequency domain is the key to exploiting perceptual properties
• Implies a split of the signal into frequency bands and individual quantization per band
• Focus of this lecture is therefore on the filter bank, its properties and its design

Subband coder
BP filtering, downsampling, quantization, upsampling, BP, sum

ASP_mp3.m: Upwards chirp
Down/up-sampling (repeat IN3190): throw out 50%; replace by 0’s
1. Channel 0: No h0, no g0
• Aliasing distortion due to downsampling, image distortion due to upsampling
2. Channel 0: Insert g0
• Images disappear
3. Channel 0: Use both h0 and g0
» Aliasing in 2nd part gone. No sampling distortion
[Figures: three spectrograms, frequency (0 to 4000) vs. time (0.5 to 3.5), with annotations "Image" and "Generated by alias"]

2-filter bank and individual optimization of each filter
• Approach of first examples with chirp
• Three kinds of distortion:
– Amplitude distortion (in pass band)
– Phase distortion (in pass band)
– Aliasing distortion due to stop band leakage
• Here shown as error signal in transition between filters
• Requires ideal filters to avoid this distortion
[Figure: error signal, amplitude (-0.4 to 0.3) vs. sample index (0 to 3.5 x 10^4)]

Perfect Reconstruction
• Up to now: optimization of each filter branch independently
• Instead: Accept some aliasing per sub-band, provided that aliasing from adjacent bands cancels in the final summation
• Illustrated by pairs of filters

Filterbank
• Near-ideal low pass filter
• Aliased low/high-pass: Best!
[Figure: magnitude responses H0(f) and H1(f), magnitude (dB, -80 to 10) vs. normalized frequency (×π rad/sample, 0 to 1)]

Overview
Part I
• Perceptual coding requires separation in frequency bands
• 2-channel filter bank
– Approach 1: Design individual filters for minimal distortion: aliasing, imaging
– Approach 2 (better): Design filter pairs for distortion cancellation

Part II
• Quadrature mirror filter bank design
– Filter bank without quantization: SNR = 84.2 dB
• 32-channel filter bank – Violin example
– Uniform quantization, Uniform adaptive quantization, Perceptual quantization
(SNR = 10.3, 25, 38.6 dB)

Quadrature mirror expression of up-sampling of sub-band signal

Sub-band signal after down-/up-sampling:
= x_i(n) for n even
= 0, for n odd

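The even/odd rule above can be checked numerically. The sketch below assumes the sub-band signal is a plain Python list. The function names are hypothetical, and the closed form 0.5 * (x(n) + (-1)^n x(n)) is the standard identity, assumed here to be the "quadrature mirror expression" the slide title refers to.

```python
def down_up_sample(x):
    """Downsample by 2 (keep even-indexed samples), then upsample by 2 (insert zeros)."""
    kept = x[::2]
    y = [0.0] * len(x)
    y[::2] = kept
    return y

def mirror_form(x):
    """Same result written as the average of x(n) and its modulated copy (-1)^n x(n)."""
    return [0.5 * (v + (-1) ** n * v) for n, v in enumerate(x)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(down_up_sample(x))
print(mirror_form(x))
```

The modulated term (-1)^n x(n) has the spectrum of x shifted by half the sampling rate, which is where the mirrored (aliased) component comes from.
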
Aliasing in decimation-interpolation
• Aliasing interpreted in the framework of quadrature mirrors
[Figure: magnitude responses H0(f) and H1(f), magnitude (dB, -80 to 10) vs. normalized frequency (×π rad/sample, 0 to 1)]

Output of filter bank

→ from down-/up-sampling

Include input filters

• Perfect reconstruction: Out = In, x~(n) = x(n), if

Aliasing:

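In the usual textbook z-domain notation, the output of a two-channel bank with input filters H and output filters G can be sketched as follows. The notation is an assumption here: $\tilde{X}(z)$ is the transform of x~(n), and $H_i(z)$, $G_i(z)$ are the transforms of the branch filters.

```latex
\tilde{X}(z) = \tfrac{1}{2}\left[ H_0(z)G_0(z) + H_1(z)G_1(z) \right] X(z)
             + \tfrac{1}{2}\left[ H_0(-z)G_0(z) + H_1(-z)G_1(z) \right] X(-z)
```

Under this form, Out = In requires the first bracket to equal 2 (or $2z^{-d}$ if a delay of $d$ samples is accepted), while the second bracket, which multiplies the aliased spectrum $X(-z)$, must vanish.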
Aliasing cancellation
• H = input filters (anti-aliasing)
• G = output filters (anti-imaging) coupled to H:
– 1st condition → unity gain
– 2nd condition → cancel aliasing:
$g_0(n) = h_1(n)(-1)^n$
$g_1(n) = -h_0(n)(-1)^n$

Quadrature Mirror Filters
- Unity gain constraint
- Couple h0 with h1
- All four filters are now coupled

ASP_mp3.m
• Sections 1-2
• May use rather poor LP, HP filters
• Despite aliasing in each filter, almost perfect
reconstruction due to near perfect
cancellation of aliasing errors
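A minimal sketch of this effect, not the contents of ASP_mp3.m: two crude 2-tap filters (an assumed example pair) are used as h0 and h1, and the output filters are derived from them as g0(n) = h1(n)(-1)^n and g1(n) = -h0(n)(-1)^n. Each branch alone is badly aliased, yet the sum reproduces the input with a one-sample delay.

```python
import math

def fir(x, h):
    """Causal FIR filter; output has the same length as x."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def down_up(x):
    """Downsample by 2 then upsample by 2: odd-indexed samples become 0."""
    return [v if n % 2 == 0 else 0.0 for n, v in enumerate(x)]

s = 1.0 / math.sqrt(2.0)
h0 = [s, s]                                   # crude low-pass (assumed example)
h1 = [s, -s]                                  # crude high-pass (assumed example)
g0 = [h1[n] * (-1) ** n for n in range(2)]    # g0(n) = h1(n)(-1)^n
g1 = [-h0[n] * (-1) ** n for n in range(2)]   # g1(n) = -h0(n)(-1)^n

# Arbitrary test signal with content in both bands
x = [math.sin(0.9 * n) + 0.3 * math.cos(2.7 * n) for n in range(64)]

branch0 = fir(down_up(fir(x, h0)), g0)        # low band alone: aliased
branch1 = fir(down_up(fir(x, h1)), g1)        # high band alone: aliased
y = [a + b for a, b in zip(branch0, branch1)]

# Compare against the input delayed by one sample
err_sum = max(abs(y[n] - x[n - 1]) for n in range(1, len(x)))
err_low = max(abs(branch0[n] - x[n - 1]) for n in range(1, len(x)))
print("max error, sum of branches:", err_sum)
print("max error, low branch only:", err_low)
```

The 2-tap pair is chosen only because the cancellation can be verified by hand. Longer filters follow the same coupling but in general reconstruct only approximately.
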

Frequency-selective quadrature mirror filters?

- Oh no!
- Must have frequency-selective filter, otherwise the whole point of perceptual coding is lost
- Impossible to satisfy (3.13) if also frequency-selective!
- In practice allow some amplitude distortion

Pseudo Quadrature Mirror Filters

M = 2: f0 = 1/8, f1 = 3/8

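The two values quoted for M = 2 fit the pattern f_k = (2k + 1) / (4M) in cycles per sample, which is the usual center-frequency spacing of a cosine-modulated bank. That general formula is an assumption here (the slide only lists the M = 2 case), and `pqmf_center_frequencies` is a hypothetical helper.

```python
from fractions import Fraction

def pqmf_center_frequencies(M):
    """Assumed center frequencies (cycles/sample) of an M-band cosine-modulated bank."""
    return [Fraction(2 * k + 1, 4 * M) for k in range(M)]

print([str(f) for f in pqmf_center_frequencies(2)])       # the M = 2 case from the slide
print([str(f) for f in pqmf_center_frequencies(32)[:4]])  # first bands of a 32-band bank
```

Exact fractions are used so the M = 2 result can be compared directly with the slide.
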
ASP_mp3.m
• Section 3:
• Build 32 PQMF
• Listen to band 3 – lots of aliasing error
• Listen to reconstruction (no quantization)
– snr_PQMF = 84.2 dB
– For all practical purposes aliasing error has been
cancelled

Violin spectrogram
[Figure: spectrogram, frequency (x 10^4) vs. time]

What is this spectral line?

Morse Code hidden in Mike Oldfield's Tubular Bells.

ASP_mp3.m
• Section 5: back to Pseudo QMF
• 4-bit (+/-8 levels) uniform quantizer per band
– snr_4bits = 10.3 dB – very poor!
– strong high frequency tonal noise at each sub-band transition i.e. every 1378 Hz (aliasing)
– Effectively only +/- 2 levels in e.g. sub-band 2 since amplitude < +/-0.25 and quantizer spans +/- 1
• Adaptive quantizer per band
– snr_4bits_scaled = 25.0 dB
[Figures: sub-band #2, original and quantized, for each quantizer; amplitude (-0.4 to 0.4) vs. time (samples at Fs/32, 0 to 120)]

Adaptive quantization in layer 1
• The max value of the quantizer is adapted to the actual max level that exists → automatic gain control

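The gain from adapting the quantizer range can be sketched with a synthetic sub-band signal. Everything below is an assumption for illustration: a sinusoid of amplitude 0.25 stands in for a sub-band, and `quantize` is a hypothetical mid-rise uniform quantizer, not the code in ASP_mp3.m.

```python
import math

def quantize(x, bits, x_max=1.0):
    """Uniform mid-rise quantizer with 2**bits levels spanning [-x_max, +x_max]."""
    step = 2.0 * x_max / (2 ** bits)
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    out = []
    for v in x:
        idx = max(lo, min(hi, math.floor(v / step)))  # clamp to the available levels
        out.append((idx + 0.5) * step)
    return out

def snr_db(x, y):
    """Signal-to-noise ratio in dB between original x and quantized y."""
    sig = sum(v * v for v in x)
    err = sum((a - b) ** 2 for a, b in zip(x, y))
    return 10.0 * math.log10(sig / err)

# Stand-in sub-band signal with small amplitude, as in the sub-band 2 example
x = [0.25 * math.sin(2 * math.pi * 0.0137 * n) for n in range(1200)]

fixed = quantize(x, 4)                       # quantizer spans +/-1
peak = max(abs(v) for v in x)                # scale factor = actual max level
adaptive = quantize(x, 4, x_max=peak)        # quantizer spans +/-peak

print("levels used, fixed range:   ", len(set(fixed)))
print("levels used, adaptive range:", len(set(adaptive)))
print("SNR fixed    (dB):", round(snr_db(x, fixed), 1))
print("SNR adaptive (dB):", round(snr_db(x, adaptive), 1))
```

With the fixed range only a few of the 16 levels are ever used, which is the situation described for sub-band 2. Scaling to the peak puts all levels to work at the cost of transmitting one scale factor per block.
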
ASP_mp3.m
• Perceptual bit allocation also
• Max of global threshold and the absolute auditory threshold is the final threshold.
• Signal-to-mask ratios per band
• snr_scaled_perceptual = 38.6 dB
• bit_rate = 192.25 kbps

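One way to picture the allocation step is the rule of thumb that each bit of a uniform quantizer buys roughly 6 dB of SNR. The sketch below is an assumption for illustration, not the allocation loop of ASP_mp3.m: `bits_for_band` is a hypothetical helper and the signal-to-mask values are made up.

```python
import math

def bits_for_band(smr_db, db_per_bit=6.02):
    """Smallest bit count whose approximate quantizer SNR exceeds the signal-to-mask ratio."""
    if smr_db <= 0:
        return 0  # signal already below the masking threshold: no bits needed
    return math.ceil(smr_db / db_per_bit)

# Hypothetical signal-to-mask ratios (dB) for a handful of sub-bands
smr_per_band = [30.0, 22.5, 13.0, 4.0, -3.0, -10.0]
allocation = [bits_for_band(smr) for smr in smr_per_band]
print(allocation)
```

Bands whose signal lies below the threshold get no bits at all, which is where most of the saving relative to a fixed 4 bits per band comes from.
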
Perceptual allocation of bits

[Figure: signal PSD, minimum threshold per sub-band and absolute threshold; magnitude (dB) vs. frequency (0 to 2 x 10^4 Hz)]

Spectrum of signal and error
• Left: 4-bit adaptive Q per band (~192 kbps), SNR = 25.0 dB
• Right: 192 kbps perceptual coding, SNR = 38.6 dB
[Figures: signal PSD and error PSD, magnitude (dB, -120 to 0) vs. frequency (Hz, x 10^4); annotations "Important" and "Not so important"]

Masking is level-dependent
Left: Absolute auditory threshold

Right: Masking due to a tone at 1 kHz for various intensities

Ex: A pure tone at 1 kHz @ 80 dB SPL makes another tone at 2 kHz @ 40 dB SPL inaudible

Critical bands: not uniformly spaced

• Bark scale

Deviation from auditory process
• Edges of critical bands: not uniformly spaced, follow Bark scale
– But most sub-band coders have equal bandwidth channels
– Former NTNU colleagues: Ramstad, Tor A., and Joar P. Tanem. "Cosine-modulated analysis-synthesis filterbank with critical sampling and perfect reconstruction." IEEE Int Conf Acoust, Speech, Sign Proc, 1991.
• The authors present a simple derivation of a parallel filterbank based on cosine-modulated versions of a model low-pass filter. With a nonuniform channel separation an efficient implementation consisting of a DFT (discrete Fourier transform) related transform and subfilters is possible. Using critical sampling of each channel and FIR (finite impulse response) filters, the conditions for perfect reconstruction are given. The computational complexity of the derived FIR filterbank is much lower than for a tree-structured FIR filterbank but cannot compete with the most efficient IIR filterbanks.
• Masking depends on absolute loudness, but level is unknown
– Assumes that the least significant bit of the 16-bit signal can just be heard

MPEG-1
• CD: 2 x 16 bit/sample x 44.1 ksamples/sec = 1.4 Mbps
• Layer 1: 1:4 compression (down to 384 kbps)
• Layer 2 (mp2): 1:6-8 (192-256 kbps)
– Original DAB coder.
– DAB uses down to 128 kbps (Norway), even lower in the UK
• Layer 3 (mp3): 1:10-12 (112-128 kbps, rather poor quality mp3!)

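The arithmetic behind these figures can be reproduced in a few lines. This is a sketch: `pcm_bit_rate` is a hypothetical helper, and the coded bit rates are the ones quoted in the list above.

```python
def pcm_bit_rate(channels, bits_per_sample, sample_rate_hz):
    """Raw PCM bit rate in bits per second."""
    return channels * bits_per_sample * sample_rate_hz

cd_bps = pcm_bit_rate(2, 16, 44100)
print(f"CD: {cd_bps / 1000:.1f} kbps")

# Compression ratio of each coded rate relative to CD
for name, kbps in [("Layer 1", 384), ("Layer 2", 256), ("Layer 2", 192),
                   ("Layer 3", 128), ("Layer 3", 112)]:
    ratio = cd_bps / (kbps * 1000)
    print(f"{name} at {kbps} kbps: about 1:{ratio:.1f}")
```

The computed ratios come out close to the rounded 1:4, 1:6-8 and 1:10-12 values on the slide.
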
MPEG-1 Layer 1
• 32-channel filter bank
• Frame size 384 samples, decimated to 384/32 = 12 samples per sub-band
• 512-point FFT for local power estimate as input to psychoacoustic model
• SMR = signal-to-mask ratio determined
• Uniform quantization of sub-bands so that SNR > SMR, i.e. the quantization noise stays below the masking threshold
• 0 bits for bands 27-32
• Adaptive quantizer, normalized over 12 samples

MPEG-1 Layer-II (mp2, DAB)
• Longer FFT for spectral estimation (1024 samples)
• Scale factors from groups of three blocks
• Also temporal in addition to spectral masking

MPEG-1 Layer 3
• Each of the 32 original sub-bands is split into 18 channels by the Modified Discrete Cosine Transform when needed – better frequency resolution
• Dynamic coding scheme – more bits to frames that require it

MPEG-2 Advanced Audio Coding (AAC)
• More sampling rates (down to 8 kHz)
• 5.1 surround
• AAC (1997) improves on mp3
• Single filter bank with adaptive block size
– 1024 samples for stationary sounds
– 128 samples for transient sounds
• Apple's iTunes
• Wimp (Tidal)

MPEG-4 AAC
• MPEG-4 HE AAC (High efficiency)
– Spectral band replication (SBR), exploits correlation between LF (<4-8 kHz) and HF to reproduce HF
– Claim stereo transparency at 48 kbps (?)
– My measurements: very poor stereo at 48 kbps
• MPEG-4 HE AAC v2 (AAC+) 2004
– Also parametric stereo (PS)
– Claim stereo transparency at 24 kbps (?)
– DAB+ in Norway

Overview
Part I
• Perceptual coding requires separation in frequency bands
• 2-channel filter bank
– Approach 1: Design individual filters for minimal distortion: aliasing, imaging
– Approach 2: Design filter pairs for distortion cancellation (much better)

Part II
• Quadrature mirror filter bank design
– Filter bank without quantization, SNR = 84.2 dB
• 32-channel filter bank – violin
– Uniform quantization, SNR = 10.3 dB
– Uniform adaptive quantization, SNR = 25 dB
– Perceptual quantization, SNR = 38.6 dB

Audio coding in practice
• Raw uncoded 44.1 kHz/16 bit: 1411 kbps
• Mobile (mono, speech): 12.2 kbps
• Radio - DAB+, HE-AAC: 48-96 kbps
1. Mono, low bandwidth: Weather, News
2. Most stations: average
3. Niche channels: >FM at its best: Classical, often P2
• iTunes before 2007, AAC: 128 kbps
• iTunes+ today, AAC: 256 kbps
• Spotify (Vorbis), Tidal (AAC): 320 kbps
• Tidal Hi-Fi, lossless FLAC: ~1000 kbps
– Free Lossless Audio Codec, 50-80% of CD
• 96-192 kHz, 24 bit stereo: 4600-9200 kbps
[Figure label: Mobile comm.]

• Video lecture: The Good, the Bad, the Ugly. Fra vinyl til høyoppløst digitallyd (2020)
