
Lab 6: Shazam Part II

Dinh Bao Dan-V202200850

1. Introduction

The Shazam Beta version implements a simplified version of the Shazam program: it pre-processes the
songs in the database, extracts their features, and builds a hash table. Then, for any clip we want
to identify, the Beta program follows the same procedure, looks the features up in the hash table,
and identifies the clip by finding the most similar features.

In Shazam Part II, we try to improve the Beta program so that it is more robust against noise and
more time efficient. For example, we can add filters in the preprocessing step and use a more
efficient data structure for the database.

2. Design Considerations

A. Time Performance:
Many algorithms have a training part and a test part. The training part only needs to run once and
can then be reused for every test. In our code, make_database creates a hash table that stores the
information of all songs, so it does not need to run every time the main file is called. Instead,
we save the hash table variable as a .mat file and load it.
IMPLEMENTATION:
make_database.m
function hashTable = make_database(gs,deltaTL,deltaTU,deltaF)
table = [];
for i = 1:50
    toRead = strcat('songDatabaseShort/', num2str(i), '.mat');
    tempTable = make_table(toRead,gs,deltaTL,deltaTU,deltaF,1);
    songID = i .* ones(size(tempTable,1),1);    % tag every feature row with its song ID
    table = [table; tempTable, songID];
end
hashTable = hash(table);                        % build the hash table once, after all songs are read
save('hashTable.mat','hashTable');              % save so later runs can simply load it
end

Main.m
%% Part I: Generate a data base
% hashTable = make_database(gs,deltaTL,deltaTU,deltaF);
% The make_database step is removed; the preprocessed database is loaded instead.
load('hashTable.mat');
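
A possible refinement (a sketch, not part of the submitted code) is to rebuild the database only when hashTable.mat does not exist yet, so the main file still works on a machine where the training step has never been run:

% Sketch: rebuild the database only if the saved file is missing
if isfile('hashTable.mat')
    load('hashTable.mat');                                  % reuse the precomputed hash table
else
    hashTable = make_database(gs,deltaTL,deltaTU,deltaF);   % one-time training step
    save('hashTable.mat','hashTable');
end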

Verification 1: I ran the verification on all 50 songs. This is the result without loading the
preprocessed database:
With the preprocessed database:

B. Preprocessing: Filters
For pre-processing we can consider filters. For this lab, some high-frequency Gaussian noise was
added to the first clip to obtain '1Noise.mat' in the folder 'songHighNoise'.

One possible way to reduce the high-frequency Gaussian noise is a low-pass filter. Of course, there
are many other filters we could use, such as a bandpass filter.

Listen to the noisy sound using the sound(y, Fs) command, then filter the sound and play it again.
Do you hear any improvement in the music?
IMPLEMENTATION: Adding a low-pass filter
lowpassfilter.m
function filter_clip = lowpassfilter(clip, Fs)
cutoff_frequency = 2000;                 % cutoff at 2 kHz
wn = cutoff_frequency / (Fs/2);          % normalize by the Nyquist frequency Fs/2
[B, A] = butter(2, wn, 'low');           % 2nd-order Butterworth low-pass filter
filter_clip = filter(B, A, clip);        % apply the filter to the clip
end
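
A quick listening test, as suggested above (a sketch; it assumes '1Noise.mat' stores the same y and Fs variables as the database songs):

% Sketch: compare the noisy clip and the filtered clip by ear
noisy = load('songHighNoise/1Noise.mat');       % assumed to contain y and Fs
sound(noisy.y, noisy.Fs);                       % play the noisy clip
pause(length(noisy.y)/noisy.Fs + 1);            % sound() does not block, so wait for playback
filtered = lowpassfilter(noisy.y, noisy.Fs);
sound(filtered, noisy.Fs);                      % play the filtered clip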

make_table.m
function table = make_table(song,gs,deltaTL,deltaTU,deltaF,mode)
myVars = {'y','Fs'};
clip = load(song, myVars{:});
% Step 1. Preprocessing
if (mode == 2)
    clip.y = lowpassfilter(clip.y, clip.Fs);   % filter only the clips, not the database songs
end
y = clip.y;
Fs = clip.Fs;
... with the rest kept the same
% make_table now has 2 modes: mode 1 builds the table for a database song, mode 2 builds
% the table for a clip, and only mode 2 applies the filtering step.

Verification 2: The noise in the high-noise files is indeed reduced, but some noise remains, so we
should try another filter. We tried a notch filter and clearly hear less noise.
notch_filter.m
function filteredSong = notch_filter(input, samplingFreq)
input = input(:);
N = length(input);

% Frequency axis for the shifted FFT, from -Fs/2 up to (but not including) +Fs/2
stepSize  = samplingFreq/N;
startFreq = -samplingFreq/2;
endFreq   = samplingFreq/2 - stepSize;
freq = startFreq:stepSize:endFreq;

X = fftshift(fft(input))/N;              % centered spectrum of the clip
X_magnitude = abs(X(:));

r = 0.985;                               % pole radius: controls the notch width
peakDist = N * 0.0005;                   % minimum distance between detected peaks
threshold = 1.4;                         % how dominant a peak must be to count as a noise tone

% Find the four largest spectral peaks
[peakValues, peakLocations] = findpeaks(X_magnitude, 'SortStr', 'descend', ...
    'MinPeakDistance', peakDist, 'NPeaks', 4);

% Ratio between the strongest and the third-strongest peak
if length(peakValues) >= 3
    ratio = peakValues(1) / peakValues(3);
else
    ratio = 0;
end

% First apply the same low-pass filter as before
cutoff_frequency = 2000;
wn = cutoff_frequency / (samplingFreq / 2);
[B, A] = butter(2, wn, 'low');
input = filter(B, A, input);

% If one tone clearly dominates the spectrum, notch it out
if ratio > threshold
    posPeak = freq(peakLocations(2));                     % frequency of the second-largest peak
    if posPeak > 1000
        filterFreq = (posPeak * 2 * pi) / samplingFreq;   % notch frequency in rad/sample
        a = [1, -2 * r * cos(filterFreq), r * r];         % poles just inside the unit circle
        b = [1, -2 * cos(filterFreq), 1];                 % zeros on the unit circle at the notch
        filteredSong = filter(b, a, input);
    else
        filteredSong = input;
    end
else
    filteredSong = input;
end
end
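
One way to use it (a sketch, not taken from the submitted code) is to swap the filter call in make_table's preprocessing step, so that clips go through the notch filter instead of the plain low-pass filter:

% Sketch: use the notch filter for clips in make_table (mode 2)
if (mode == 2)
    clip.y = notch_filter(clip.y, clip.Fs);   % notch_filter already applies the low-pass stage internally
end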

C. Spectrogram: Window Size and Overlap
When we apply the spectrogram to the signal, we fix the window size and the overlap. The larger
the window size, the less data we will have; likewise, the smaller the overlap, the less data
there will be. Less data improves the running time of the code, but it also reduces the matching
accuracy.

We want our algorithm to work well on shorter clips. The zip file songDataBaseShort.zip stores a
random 15-second clip from each song. Try your algorithm on these shorter clips to see how the
accuracy is affected.

Verification 3: Try different combinations of the window size and the overlap. Use the combination
that gives good accuracy while keeping the running time short. Do not forget to retrain your hash
table when you change the parameters. If it takes too long, you can reduce the number of songs
from 50 to 10.
Implementation:
Original window size and overlap:
window = 64 * 10^-3 * new_Fs;
noverlap = 32 * 10^-3 * new_Fs;
Best combination found:
window = round(64 * 10^-3 * new_Fs);
noverlap = round(8 * 10^-3 * new_Fs);
nfft = round(64 * 10^-3 * new_Fs);

The running time for 50 songs is reduced by 0.4 s, which is about a 14% improvement without
affecting accuracy.
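
For reference, these parameters feed MATLAB's spectrogram call in the feature-extraction step (a sketch; the exact call in make_table may differ, and new_Fs is assumed to be the sampling rate used there):

% Sketch: spectrogram with the tuned window / overlap / nfft values
window   = round(64 * 10^-3 * new_Fs);     % 64 ms window
noverlap = round(8  * 10^-3 * new_Fs);     % 8 ms overlap between adjacent windows
nfft     = round(64 * 10^-3 * new_Fs);
[S, F, T] = spectrogram(y, window, noverlap, nfft, new_Fs);
log_S = log10(abs(S) + eps);               % assumed log-magnitude step (section D works on a matrix named log_S)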
D. Feature Extraction: Spectrogram Local Peaks, Local Troughs and Window Size
Previously, we used the local peaks (local maxima) to construct the boolean matrix
localPeakLocation. An alternative is to use local troughs (local minima).
IMPLEMENTATION:
%% Step 3. Feature Extraction
array = -floor(gs/2):floor(gs/2);               % offsets covering a gs-by-gs neighbourhood
localPeakLocation = ones(size(log_S));

for i = 1:gs
    for j = 1:gs
        if (array(i) == 0 && array(j) == 0)
            continue;                           % skip the zero shift (the point itself)
        end
        CA = circshift(log_S, [array(i), array(j)]);
        % Keep only points smaller than this shifted neighbour (local troughs)
        localPeakLocation = (log_S - CA < 0) .* localPeakLocation;
    end
end

To extract troughs instead of peaks, we only need to change the “>” sign to a “<” sign in the
comparison above. With the original window size and noverlap we get the result:

This means that the speed increases a little when extracting troughs instead of peaks.
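
For clarity, the one-line difference between the two variants (shown as comments only):

% Peak version (local maxima): keep points larger than every shifted neighbour
%   localPeakLocation = (log_S - CA > 0) .* localPeakLocation;
% Trough version (local minima), used here: keep points smaller than every shifted neighbour
%   localPeakLocation = (log_S - CA < 0) .* localPeakLocation;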

E) Conclusion
In this lab, we tried several ways to speed up the algorithm that finds a song from a short clip.
First, we created a preprocessed database so that the song data does not have to be reprocessed
every time the program runs. Next, we tried a low-pass filter (and later a notch filter) to remove
noise from the clip, which makes the clip easier to analyse and match to its original song. Lastly,
we switched from extracting peaks to extracting troughs, which also gives a small improvement in
speed.

