Chapter 1

The document provides an introduction to handling audio data in Python, covering various audio file formats and their frequency characteristics. It explains how to open audio files, convert sound wave bytes to integers, find frame rates, and visualize sound waves using libraries like NumPy and Matplotlib. Practical examples are included to demonstrate the process of working with audio data effectively.

Uploaded by

Andreu Orestes


Introduction to audio data in Python
SPOKEN LANGUAGE PROCESSING IN PYTHON

Daniel Bourke
Machine Learning Engineer/YouTube Creator
Dealing with audio files in Python
Many different kinds of audio files
mp3

wav

m4a

flac

Digital sound is measured by its sampling frequency (kHz)

1 kHz = 1000 pieces of information (samples) per second
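
As a rough illustration of the rule above, the sketch below counts how many samples a recording contains at a given sampling frequency. The helper name `samples_for` and the example rates are assumptions chosen for illustration, not part of the original slides.

```python
def samples_for(duration_seconds: float, rate_khz: float) -> int:
    """Number of samples in a recording, using 1 kHz = 1000 samples per second."""
    return int(round(duration_seconds * rate_khz * 1000))


# Example rates (assumed values) for a one-second recording
for rate_khz in (1, 8, 16):
    print(f"{rate_khz} kHz -> {samples_for(1, rate_khz)} samples per second")
```

Doubling the sampling frequency doubles the number of values that describe the same second of sound.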


Frequency examples
Streaming songs typically have a sampling frequency of around 32 kHz

Audiobooks and spoken language are usually between 8 and 16 kHz

We can't see audio files, so we have to transform them first

import wave


Opening an audio file in Python
Audio file saved as good-morning.wav

# Import audio file as wave object
good_morning = wave.open("good-morning.wav", "r")
# Convert wave object to bytes (-1 reads all frames)
soundwave_gm = good_morning.readframes(-1)
# View the wav file in byte form
soundwave_gm

b'\xfd\xff\xfb\xff\xf8\xff\xf8\xff\xf7\...
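
Since good-morning.wav is not included here, the following is a minimal self-contained sketch of the same steps. It first writes a tiny stand-in WAV file (the file name `example.wav` and the five sample values are assumptions for illustration) and then reads it back with the standard-library `wave` module.

```python
import os
import struct
import tempfile
import wave

# Hypothetical stand-in for good-morning.wav: five 16-bit mono samples at 48 kHz
samples = [-3, -5, -8, -8, -9]
path = os.path.join(tempfile.mkdtemp(), "example.wav")

with wave.open(path, "wb") as wav_out:
    wav_out.setnchannels(1)      # mono
    wav_out.setsampwidth(2)      # 2 bytes = 16 bits per sample
    wav_out.setframerate(48000)  # 48 kHz
    # "<5h" packs five little-endian signed 16-bit integers
    wav_out.writeframes(struct.pack("<5h", *samples))

# Read the file back; the context manager closes it afterwards
with wave.open(path, "rb") as wav_in:
    n_channels = wav_in.getnchannels()
    sample_width = wav_in.getsampwidth()
    n_frames = wav_in.getnframes()
    raw_bytes = wav_in.readframes(n_frames)

print(n_channels, sample_width, n_frames)
print(raw_bytes)
```

Checking `getnchannels()` and `getsampwidth()` before decoding is a useful habit, because they determine how the raw bytes should be interpreted.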


Working with audio is different
Have to convert the audio to something useful

Small sample of audio = large amount of information
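
To make "large amount of information" concrete, here is a back-of-the-envelope sketch of how many raw bytes uncompressed audio takes. The function name `raw_audio_bytes` and the example settings (48 kHz, 16-bit, mono or stereo) are assumptions for illustration.

```python
def raw_audio_bytes(duration_seconds, frame_rate, sample_width_bytes=2, channels=1):
    """Size of uncompressed PCM audio: frames x bytes per sample x channels."""
    return int(duration_seconds * frame_rate) * sample_width_bytes * channels


# One second of 16-bit mono audio at 48 kHz (assumed example settings)
print(raw_audio_bytes(1, 48000))

# One minute of 16-bit stereo audio at 48 kHz
print(raw_audio_bytes(60, 48000, sample_width_bytes=2, channels=2))
```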


Let's practice!

Converting sound wave bytes to integers

Converting bytes to integers
Raw bytes can't be analyzed or plotted directly
Convert bytes to integers using NumPy

import numpy as np
# Convert soundwave_gm from bytes to integers
signal_gm = np.frombuffer(soundwave_gm, dtype='int16')
# Show the first 10 items
signal_gm[:10]

array([ -3, -5, -8, -8, -9, -13, -8, -10, -9, -11], dtype=int16)
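
The conversion can be checked by hand on a few bytes. The sketch below assumes 16-bit little-endian samples (the usual layout for 16-bit WAV data) and decodes the first six bytes of the byte string shown earlier in two ways: with `np.frombuffer` and with Python's built-in `int.from_bytes`.

```python
import numpy as np

# First six bytes of the good morning byte string shown earlier
raw_bytes = b"\xfd\xff\xfb\xff\xf8\xff"

# "<i2" spells out the assumption: little-endian, signed, 2 bytes per sample
signal = np.frombuffer(raw_bytes, dtype="<i2")

# Cross-check: decode each 2-byte pair manually
manual = [
    int.from_bytes(raw_bytes[i:i + 2], byteorder="little", signed=True)
    for i in range(0, len(raw_bytes), 2)
]

print(signal)
print(manual)
```

Every two bytes become one integer, so the integer array is half as long as the byte string.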


Finding the frame rate
Frequency (Hz) = length of wave object array/duration of audio file (seconds)

# Get the frame rate

framerate_gm = good_morning.getframerate()
# Show the frame rate
framerate_gm

48000

Duration of audio file (seconds) = length of wave object array/frequency (Hz)
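
The duration formula can be sketched as follows. The signal here is a hypothetical array of silence standing in for the decoded audio; with a real file, the frame count is also available from the wave object's `getnframes()` method.

```python
import numpy as np

frame_rate = 48000  # Hz, matching the frame rate shown above

# Hypothetical stand-in signal: 72,000 silent 16-bit samples
signal = np.zeros(72000, dtype=np.int16)

# Duration (seconds) = number of samples / frame rate (Hz)
duration_seconds = len(signal) / frame_rate
print(duration_seconds)
```

Note that the length must be the number of samples, not the number of raw bytes, otherwise 16-bit audio would appear twice as long as it is.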


Finding sound wave timestamps
# Return evenly spaced values between start and stop
np.linspace(start=1, stop=10, num=10)

array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])

# Get the timestamps of the good morning sound wave

# Use the integer array (signal_gm), not the raw bytes, so there is one timestamp per sample
time_gm = np.linspace(start=0,
                      stop=len(signal_gm)/framerate_gm,
                      num=len(signal_gm))


Finding sound wave timestamps
# View first 10 time stamps of good morning sound wave
time_gm[:10]

array([0.00000000e+00, 2.08334167e-05, 4.16668333e-05, 6.25002500e-05,
       8.33336667e-05, 1.04167083e-04, 1.25000500e-04, 1.45833917e-04,
       1.66667333e-04, 1.87500750e-04])
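
The spacing above is slightly larger than 1/48000 (about 2.0833333e-05) because `np.linspace` includes the stop value by default, which stretches the samples across the interval. The sketch below, using a hypothetical five-sample signal, compares that approach with two alternatives that give spacing of exactly one sample period.

```python
import numpy as np

frame_rate = 48000  # Hz
n_samples = 5       # hypothetical, tiny signal for illustration

# Approach from the slides: endpoint is included, so spacing is slightly too wide
time_linspace = np.linspace(start=0, stop=n_samples / frame_rate, num=n_samples)

# Alternative 1: exclude the endpoint
time_no_endpoint = np.linspace(0, n_samples / frame_rate, num=n_samples, endpoint=False)

# Alternative 2: sample index divided by frame rate
time_exact = np.arange(n_samples) / frame_rate

print(time_linspace[1], time_exact[1])
```

For long recordings the difference is tiny and does not affect a plot, but the second and third forms are the ones to use when exact sample times matter.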


Let's practice!

Visualizing sound waves

Adding another sound wave
New audio file: good_afternoon.wav
Both are 48 kHz

The same data transformations apply to all audio files


Setting up a plot
import matplotlib.pyplot as plt
# Set the plot title
plt.title("Good Afternoon vs. Good Morning")
# x and y axis labels
plt.xlabel("Time (seconds)")
plt.ylabel("Amplitude")
# Add good morning and good afternoon values
# (signal_ga and time_ga come from applying the same steps to good_afternoon.wav)
plt.plot(time_ga, signal_ga, label="Good Afternoon")
plt.plot(time_gm, signal_gm, label="Good Morning", alpha=0.5)
# Create a legend and show our plot
plt.legend()
plt.show()
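
Because the two recordings are not included here, the following is a self-contained sketch of the same plot using two synthetic sine waves as stand-ins. The signal names, frequencies, amplitudes, and output file name are all assumptions for illustration; the figure is saved to a file rather than shown so that the sketch also runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

frame_rate = 48000             # Hz
n_samples = frame_rate // 10   # 0.1 seconds of audio
time_axis = np.arange(n_samples) / frame_rate

# Hypothetical stand-ins for the two recordings: 16-bit sine waves
signal_a = (10000 * np.sin(2 * np.pi * 220 * time_axis)).astype(np.int16)
signal_b = (6000 * np.sin(2 * np.pi * 330 * time_axis)).astype(np.int16)

fig, ax = plt.subplots()
ax.set_title("Signal A vs. Signal B")
ax.set_xlabel("Time (seconds)")
ax.set_ylabel("Amplitude")
ax.plot(time_axis, signal_a, label="Signal A")
ax.plot(time_axis, signal_b, label="Signal B", alpha=0.5)  # semi-transparent overlay
ax.legend()

output_path = os.path.join(tempfile.gettempdir(), "sound_waves.png")
fig.savefig(output_path)
```

The time axis and each signal must have the same length, which is why the timestamps are built from the number of samples rather than the number of bytes.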

Time to visualize!