
Audio Splicing Detection using Convolutional Neural Network

Shital Jadhav, Rashmika Patole, Priti Rege
Department of Electronics and Telecommunication, College of Engineering Pune
[email protected], [email protected], [email protected]

10th ICCCNT 2019, July 6-8, 2019, IIT Kanpur, Kanpur, India

Abstract—Audio forensics includes audio authentication, in which a major investigation topic is audio tampering detection. In this paper, we present a novel method of splicing detection using a convolutional neural network (CNN). Since high-level features of audio are effectively estimated by a convolutional neural network, the frequency spectrogram of the audio is fed directly as input to the network. The proposed work uses 10-13 s anechoic audio signals into which an audio slice of 1 s, 2 s, or 3 s is inserted at the middle of the audio. Results show that insertion of the 3 s part gives better accuracy than the other two: slice_1 insertion gives 82.80% accuracy, slice_2 gives 87.54%, and slice_3 gives 96.67%. The proposed method is also robust to audio compression as well as to additive white Gaussian noise.

Index Terms—Audio Authentication, Splicing Detection, Tampering Detection, Convolutional Neural Network, Spectrogram, STFT.

I. INTRODUCTION

Audio authentication is a preliminary task in the area of audio forensics whenever an audio clip is used as a piece of evidence. Nowadays, due to advances in technology, altering audio is easy. Tampering of audio is made by splicing, deletion, or copy-move operations [8][9]. In a splicing forgery, a segment taken from one audio recording is inserted at the start, middle, or end of another recording [8][7][15]. Such tampering may be present in manipulated audio, so detection of splicing is worthwhile in audio forensics.

In previous work on audio splicing detection, features are extracted such as the Electric Network Frequency (ENF), Linear Prediction Coefficients (LPC), Mel Frequency Cepstral Coefficients (MFCC), and the decay rate parameter (a reverberation parameter). Inconsistency in these features is an indication of tampering. The calculated features are then applied to a classifier such as a Support Vector Machine (SVM), Hidden Markov Model (HMM), or Gaussian Mixture Model (GMM) for detection of tampering [8][12]. Deep learning has not yet been widely used in the field of forgery detection.

In image forensics, substantial progress has been made on images and video using CNNs, for example in face identification and anomaly detection. In audio forensics, much work has been completed on speaker identification and speech recognition, and some work on audio recapture detection [6][5] and acoustic scene classification [13]. Up to now, little research has been done on audio tampering detection using deep learning. To our knowledge, a CNN is used here for the first time for splicing detection.

The rest of the paper is organized as follows: Section II reviews related work on audio authentication. Section III discusses the proposed work and gives the CNN model. Section IV gives details of the experimental setup and the results. Future work and the conclusion are presented in Section V.

II. RELATED WORK

Various methods have so far been used in forensics for audio authentication. They are divided into two categories:
(i) Passive techniques, which focus on authenticating the audio using the signal and its properties.
(ii) Active techniques, which embed extra information such as a watermark in the audio [2].

With passive techniques, an abrupt change in the audio can be observed, which is why passive techniques are preferable. Watermarking can be defeated, as forgeries can be made without changing the watermark embedded in the audio.

Authentication may also include recording device identification. Different recording devices have distinct signatures, which are useful to verify the recording location and eventually determine the recording's ownership [1]. No recording device contains an ideal voltage regulator, so audio recorded on a device carries a trace: an electric network frequency (ENF) signal that is a remnant of the mains signal. The ENF varies slowly around its nominal 50 Hz or 60 Hz, and in densely populated areas its fluctuation is tightly controlled; to determine the recording location and to detect tampering, the ENF is extracted from the audio and compared with a database of ENF values [6].

The properties extracted from audio differ for different recording environments, so identification of the environment is useful for authentication. Using the Room Impulse Response (RIR) and the reverberation component, identification of the environment is possible [10][11]. Reverberation is the presence of sound after the source terminates, so recorded audio consists of direct sound, early reverberation, and late reverberation, which is why temporal and spectral smearing is present in recorded audio [4].


Reverberation differs from room to room according to the room geometry, and compression does not affect the reverberation parameters present in an audio recording [12].

The local noise level in an audio clip shows a different structure when the audio has been tampered with. This structural difference, i.e., abnormality in the local noise level, is estimated using kurtosis. The local noise estimator requires no knowledge of the file format or recording device, which makes it well suited for tampering detection [3].

The presence of two or more ENF, MFCC, DRD, or local noise level signatures in an audio clip is an indication of forgery [12]. Sudden changes in phase are also an indication of tampering [1], so audio authentication includes a frequency spectrogram test that detects such phase changes. Instead of explicitly extracting these features and applying them to various machine learning algorithms, the use of deep learning is advantageous: the frequency spectrogram of the audio is applied directly as input, from which high-level features are effectively estimated.

In recaptured audio detection, CNNs give good results. When audio is recaptured, there is a variation in the noise level and the ENF present in the audio [6]; a CNN is able to differentiate this structural difference, which cannot be discerned through listening or visual tests [5].

When the inputs are highly correlated, a CNN gives good results because it extracts dense features; that is, a CNN is used not only as a classifier but also as a feature extractor [14]. CNNs are booming in image forensics because of their automatic feature extraction, but a CNN requires a large database; nowadays, due to the widespread use of multimedia, huge databases are available, and we exploit the same in audio forensics.
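The estimator in [3] is more elaborate, but a minimal sketch of the underlying idea, flagging frames whose sample kurtosis deviates from the rest of the clip, could look as follows in Python (the frame length and threshold are illustrative assumptions, not values from [3]):

```python
import numpy as np
from scipy.stats import kurtosis

def local_kurtosis_anomalies(audio, frame_len=2048, z_thresh=3.0):
    """Flag frames whose sample kurtosis deviates strongly from the
    clip-wide average; a crude proxy for local-noise-level inconsistency,
    since spliced-in material recorded under different conditions tends
    to shift the frame statistics."""
    n_frames = len(audio) // frame_len
    frames = audio[:n_frames * frame_len].reshape(n_frames, frame_len)
    k = kurtosis(frames, axis=1)                 # per-frame kurtosis
    z = (k - k.mean()) / (k.std() + 1e-12)       # standardize across the clip
    return np.where(np.abs(z) > z_thresh)[0]     # indices of suspicious frames
```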
III. PROPOSED WORK iii) ACTIVATION FUNCTION
A. SPECTROGRAM ANALYSIS AND PREPROCESSING

Since the CNN requires its input in the form of an image, preprocessing of the audio is required: the spectrogram of the audio is computed and applied as input to the CNN. Because the frequency content of audio varies continuously over time, the short-time Fourier transform (STFT) is significant in both the time and frequency domains; it gives time-localized frequency information of the signal, as shown in Fig. 1. The STFT is represented as a matrix whose rows are associated with frequency and whose columns are associated with time. The magnitude of this matrix is the spectrogram, and the spectrogram is treated as a mono-channel image at the input of the CNN.

Figure 1. Spectrogram of original audio
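As a concrete illustration, this preprocessing can be sketched with scipy (a minimal sketch assuming a mono WAV input; the frame durations and overlaps actually compared in the paper are listed in Section IV):

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import stft

def audio_to_spectrogram(path, frame_ms=20, overlap=0.5):
    """Magnitude STFT of a mono WAV file as a 2-D array (rows = frequency,
    columns = time), used as a mono-channel image input to the CNN."""
    fs, x = wavfile.read(path)
    x = x.astype(np.float32)
    nperseg = int(fs * frame_ms / 1000)        # frame size, e.g. 20 ms
    noverlap = int(nperseg * overlap)          # e.g. 50% overlap
    _, _, Z = stft(x, fs=fs, window='hamming',
                   nperseg=nperseg, noverlap=noverlap)
    return np.abs(Z)                           # spectrogram = |STFT|
```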
B. PROPOSED NETWORK ARCHITECTURE

The convolutional neural network is a modified version of the neural network and falls under deep learning. In audio, CNNs can be used for speech recognition, speaker identification, and music information retrieval. A CNN has multiple layers with many neurons that adjust weights and biases among themselves; the sequence of layers transfers values from one layer to the next through an activation function. We use an 11-layer CNN architecture with 2 convolutional layers, 2 batch normalization layers, 2 ReLU layers, and 1 max pooling layer.

i) CONVOLUTIONAL LAYER
   The main building block of the network is the convolutional layer, in which the basic convolution operation is performed on the input: a filter traverses the whole input. Two convolutional layers are used in this network, with 16 and 32 filters respectively.
ii) BATCH NORMALIZATION LAYER
   Because the input to the convolutional layer is an STFT matrix, which can be very large, this layer is used for normalization: the mean is subtracted and the result is divided by the standard deviation. Scaling the activations improves the regularization effect, which reduces overfitting of the kernel outputs. Batch normalization increases network complexity, but it gives better performance, speeds up processing, and achieves the same accuracy with fewer epochs and a higher learning rate.
iii) ACTIVATION FUNCTION
   The activation function acts like the transfer function of the CNN, producing output in the range -1 to 1 or 0 to 1. The Rectified Linear Unit (ReLU) maps input to output by a piecewise linear function: it is the identity for inputs greater than zero and zero for inputs less than zero.
iv) MAX POOLING LAYER
   The pooling layer is used mainly for dimensionality reduction: the input is divided into non-overlapping frames of a region R, and the operation takes the maximum value within each frame. Because of the pre-filtering, the input spectral representation becomes sparse, so the max pooling layer reduces the amount of data and gives robust features.
v) FULLY CONNECTED LAYER
   The function of a fully connected layer is to classify an input image into its respective class using the training database. For classification, the fully connected layer combines the features extracted by the lower layers in an abstracted form; a softmax layer is used as the activation function, producing a probability distribution whose values sum to 1. In this way, the CNN transfers the input from the first layer to the last with adjustment of weights and biases and gives the result as original or forged. A sketch of this architecture appears below.
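The paper gives layer counts but not kernel sizes or other hyperparameters, so the following Keras sketch is only one plausible reading of this description: two convolutional layers with 16 and 32 filters, each followed by batch normalization and ReLU, one max pooling layer, and a fully connected softmax layer deciding original versus forged (kernel size, pool size, input shape, and optimizer are assumptions):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_splicing_cnn(input_shape):
    """CNN along the lines of Section III-B: conv(16) -> BN -> ReLU ->
    conv(32) -> BN -> ReLU -> max pool -> FC softmax over {original, forged}."""
    return models.Sequential([
        layers.Input(shape=input_shape),         # (freq_bins, time_frames, 1)
        layers.Conv2D(16, (3, 3), padding='same'),
        layers.BatchNormalization(),
        layers.ReLU(),
        layers.Conv2D(32, (3, 3), padding='same'),
        layers.BatchNormalization(),
        layers.ReLU(),
        layers.MaxPooling2D(pool_size=(2, 2)),   # non-overlapping regions R
        layers.Flatten(),
        layers.Dense(2, activation='softmax'),   # probabilities sum to 1
    ])

# Input shape depends on the STFT settings chosen in Section III-A.
model = build_splicing_cnn((81, 1000, 1))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```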


Figure 2. CNN Architecture

IV. EXPERIMENTAL SETUP

The complete process is divided into three steps; the proposed procedure is shown in Fig. 3.

• For preprocessing, the audio is divided into frames of different sizes and a Hamming window, which gives less spectral leakage, is applied; after that, the FFT is computed for each frame. The absolute value of each frame after the FFT gives the spectrogram, which is the input to the CNN.
• After the input is applied, the main step is feature extraction using the CNN, and the CNN is trained for further classification using all of its layers.
• The last stage is classification, which is done by the softmax layer.

Figure 3. Block Diagram of Proposed Work

A. DATABASE

In this paper, a database of digit utterances by four speakers having low, medium, and high pitch is used. This four-speaker dataset is available as the free spoken digit dataset (FSDD). The available database consists of the individual digits 0 through 9. For the original database, the digits of an individual speaker are concatenated with some silence between consecutive digits, so each original audio contains the utterances of the digits 0 to 9 by one speaker and is 9 to 10 s long.

Since the four-speaker digit utterance database is available, the splicing dataset is made by inserting digits of one speaker into another speaker's original digit utterances: a random digit from one of the other three speakers is inserted in the middle of the audio, after the fifth digit. First only 1 digit (of almost 1 s) is inserted, then 2 digits, and then 3 digits of another speaker, which creates three types of spliced database; the spliced audio length varies from 10 to 13 s. For the CNN, 4000 original and 4400 spliced audios are selected at random, of which 70% are used for training and 30% for testing. In order to extract the spectrograms, windowing is first performed on the audio with different durations and different overlaps for result comparison. A sketch of the splicing procedure follows.
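A minimal sketch of this splicing procedure (FSDD recordings are 8 kHz mono; the inter-digit silence length is an illustrative assumption, as the paper does not specify it):

```python
import numpy as np

FS = 8000                                        # FSDD sampling rate

def make_spliced(original_digits, foreign_digits, insert_after=5):
    """Concatenate one speaker's digits 0-9 with short silences and insert
    1-3 digits of another speaker after the fifth digit (Section IV-A)."""
    gap = np.zeros(int(0.2 * FS), dtype=np.float32)   # assumed 0.2 s silence
    parts = []
    for i, digit in enumerate(original_digits):       # 10 arrays, digits 0-9
        parts.extend([digit, gap])
        if i + 1 == insert_after:                     # splice point: middle
            for foreign in foreign_digits:            # 1, 2, or 3 digits
                parts.extend([foreign, gap])
    return np.concatenate(parts)
```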
accuracy percentage and error rate, where accuracy percentage
is the ratio of a total number of correctly classified output to
the total number of samples in test dataset. Error rate is the
ratio of the total number of misclassified outputs to the total
number of samples in the test dataset.

Correctly Classif ied Samples


Accuracy = × 100 (1)
T otal N umber of Samples

M isclassif ied Samples


Figure 3. Block Diagram of Proposed Work Error Rate = × 100 (2)
T otal N umber of Samples
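Equations (1) and (2) amount to the following computation on the held-out test split (a trivial sketch):

```python
import numpy as np

def accuracy_and_error(y_true, y_pred):
    """Accuracy (Eq. 1) and error rate (Eq. 2) as percentages."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    accuracy = 100.0 * np.sum(y_true == y_pred) / len(y_true)
    return accuracy, 100.0 - accuracy            # error rate = 100 - accuracy

print(accuracy_and_error([0, 1, 1, 0], [0, 1, 0, 0]))   # (75.0, 25.0)
```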

Figure 4. Accuracy of all slice databases with a 20 ms frame and 50% overlap


Figure 5. Accuracy comparison on the database with varying frame size and overlap

The accuracy percentages of slice_1, slice_2, and slice_3 with a 20 ms frame and 50% overlap are shown in Fig. 4. According to the results, when a larger part is inserted, the CNN is better able to differentiate the original and spliced parts.

C. RESULTS

In the STFT there is a trade-off between time and frequency resolution as the frame duration varies: a narrow window gives better resolution in the time domain, while a broad window gives better resolution in the frequency domain. Therefore, the comparison is performed on the dataset using frame durations of 20 ms, 30 ms, and 50 ms, each with 50% overlap, 25% overlap, and no overlap, to assess the significance of the STFT parameters; the results are shown in Fig. 5.

According to the results for slice_1, slice_2, and slice_3, there is no significant variation in accuracy across the spliced databases when the frame duration is 20 ms with 50% overlap, so the 20 ms frame with 50% overlap is used for the performance checks with noise attack, compression, and silence removal. As the recordings are anechoic, white Gaussian noise is added to the created database at a signal-to-noise ratio of 3 dB; the results show little variation in accuracy.

The presence of silence in an audio clip increases its length. Since this database is anechoic, the silence parts contain no information and processing them is a waste of time, so for the performance check the silence is removed from the audio by applying a threshold: values below the threshold are considered silence. Because of the silence removal, however, accuracy decreases to 81.95% for the slice_3 part. The reason for this decrease is that segments where the pitch is low are sometimes treated as silence, which degrades performance.

In the past, once audio was compressed it was difficult to detect tampering, but with the CNN, compression does not affect accuracy. A dynamic range compressor, which attenuates loud sounds, is used. Both perturbations are sketched below.

Figure 6. Performance of the method with noise attack, compression, and silence removal
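A minimal sketch of these two perturbations (the 3 dB SNR is from the paper; the silence threshold and frame length are illustrative assumptions):

```python
import numpy as np

def add_awgn(x, snr_db=3.0):
    """Add white Gaussian noise at the given signal-to-noise ratio."""
    p_signal = np.mean(x ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))
    return x + np.random.randn(len(x)) * np.sqrt(p_noise)

def remove_silence(x, frame_len=160, thresh=0.01):
    """Drop frames whose mean absolute amplitude is below a threshold.
    Low-pitched speech can also fall below the threshold, which explains
    the accuracy drop to 81.95% for slice_3 after silence removal."""
    frames = [x[i:i + frame_len] for i in range(0, len(x), frame_len)]
    kept = [f for f in frames if np.mean(np.abs(f)) >= thresh]
    return np.concatenate(kept) if kept else x[:0]
```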
V. CONCLUSION

To our knowledge, this is the first paper to employ a convolutional neural network for audio splicing detection. The model is able to extract high-level features from the spectrogram of the audio, which acts as an image at the input of the convolutional neural network, and to classify on its own, so the proposed method can be used effectively for forensic authentication, detecting tampering directly. We are able to demonstrate audio splicing detection with high accuracy and robustness to noise attack and compression. Although training requires a huge database and considerable computational power, this approach is useful for audio forensics. As the database contains four speakers and splicing is performed by inserting a different speaker's digits, a future goal is to detect splicing when the insertion is performed with recaptured audio or with the same speaker's audio recorded in a different environment.


A limitation of this paper is that, because the CNN extracts features directly from the spectrogram of the audio, we are not able to define the significance of those features.

REFERENCES

[1] Daniel Patricio Nicolalde and Jose Antonio Apolinario. "Evaluating digital audio authenticity with spectral distances and ENF phase change". In: 2009 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE. 2009, pp. 1417–1420.
[2] Swati Gupta, Seongho Cho, and C-C Jay Kuo. "Current developments and future trends in audio authentication". In: IEEE MultiMedia 19.1 (2011), pp. 50–59.
[3] Xunyu Pan, Xing Zhang, and Siwei Lyu. "Detecting splicing in digital audios using local noise level estimation". In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE. 2012, pp. 1841–1844.
[4] Hafiz Malik. "Acoustic environment identification and its applications to audio forensics". In: IEEE Transactions on Information Forensics and Security 8.11 (2013), pp. 1827–1837.
[5] Da Luo, Haojun Wu, and Jiwu Huang. "Audio recapture detection using deep learning". In: 2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP). IEEE. 2015, pp. 478–482.
[6] Xiaodan Lin, Jingxian Liu, and Xiangui Kang. "Audio recapture detection with convolutional neural networks". In: IEEE Transactions on Multimedia 18.8 (2016), pp. 1480–1487.
[7] Hong Zhao et al. "Anti-Forensics of Environmental-Signature-Based Audio Splicing Detection and Its Countermeasure via Rich-Features Classification". In: IEEE Transactions on Information Forensics and Security 11.7 (2016), pp. 1603–1617.
[8] Z. Ali, M. Imran, and M. Alsulaiman. "An Automatic Digital Audio Authentication/Forensics System". In: IEEE Access 5 (2017), pp. 2994–3007. ISSN: 2169-3536. DOI: 10.1109/ACCESS.2017.2672681.
[9] M. Imran et al. "Blind Detection of Copy-Move Forgery in Digital Audio Forensics". In: IEEE Access 5 (2017), pp. 12843–12855. ISSN: 2169-3536. DOI: 10.1109/ACCESS.2017.2717842.
[10] Miloš Marković and Jürgen Geiger. "Reverberation-based feature extraction for acoustic scene classification". In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE. 2017, pp. 781–785.
[11] Prateek Murgai, Mark Rau, and Jean-Marc Jot. "Blind estimation of the reverberation fingerprint of unknown acoustic environments". In: Audio Engineering Society Convention 143. Audio Engineering Society. 2017.
[12] Rashmika Patole, Gunda Kore, and Priti Rege. "Reverberation based tampering detection in audio recordings". In: Audio Engineering Society Conference: 2017 AES International Conference on Audio Forensics. Audio Engineering Society. 2017.
[13] Michele Valenti et al. "A convolutional neural network approach for acoustic scene classification". In: 2017 International Joint Conference on Neural Networks (IJCNN). IEEE. 2017, pp. 1547–1554.
[14] Michele Valenti et al. "A convolutional neural network approach for acoustic scene classification". In: 2017 International Joint Conference on Neural Networks (IJCNN). IEEE. 2017, pp. 1547–1554.
[15] Hong Zhao et al. "Audio splicing detection and localization using environmental signature". In: Multimedia Tools and Applications 76.12 (2017), pp. 13897–13927.
