
Water Marking Audio Files With Copyrights

The thesis by Farida Aboelezz addresses the issue of copyright infringement in audio files through the use of digital watermarking as a protective measure. It provides a comprehensive overview of existing audio watermarking techniques and proposes a new scheme that combines cryptography and steganography for enhanced security. The performance of the proposed scheme is evaluated and compared with other established methods in the field.



Watermarking Audio Files with Copyrights

Thesis · June 2022


1 author:

Farida Aboelezz
The German University in Cairo

All content following this page was uploaded by Farida Aboelezz on 11 June 2022.



Faculty of Management Technology
Business Informatics Department
The German University in Cairo

Watermarking Audio Files with Copyrights

Bachelor Thesis

Author: Farida Aboelezz


Supervisor: Dr. Wassim Alexan
Submission date: June 6th, 2022
This is to certify that:

(i) the thesis comprises only my original work,

(ii) due acknowledgement has been made in the text to all other material used.

–––––––––––––––––––––
Farida Aboelezz
June 6th, 2022

Abstract

The internet is a host for billions of pirated audio files, including songs, audio
books, podcasts, and voice recordings. People who illegally upload audio files on
the internet disregard all copyright laws. Digital watermarking is a widely used
technology for copyright protection and content authentication. This thesis tries
to solve the business problem of the need to protect audio files from copyright
infringement through audio watermarking. An overview of audio files and digital
watermarking is presented, and the existing literature on audio watermarking
techniques is discussed and contrasted. An audio watermarking scheme is proposed,
and its performance is measured and compared with other schemes previously
proposed and implemented.

Acknowledgement

I express my sincere gratitude to my supervisor, Dr. Wassim Alexan, for his
continuous help, guidance and support throughout this journey.

–––––––––––––––––––––
Farida Aboelezz
June 6th, 2022

Contents

1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Organization

2 Background and Literature Review
  2.1 Copyright Infringement
  2.2 Audio Files
  2.3 Audio Watermarking
    2.3.1 Definition and History
    2.3.2 Applications
    2.3.3 Attacks
    2.3.4 Properties
  2.4 Performance Evaluation Metrics
    2.4.1 Hearing Test
    2.4.2 Waveform Plots
    2.4.3 Mean Squared Error
    2.4.4 Peak Signal–to–Noise Ratio
    2.4.5 Intensity Properties
    2.4.6 Time–Domain Properties
    2.4.7 Frequency–Domain Properties
    2.4.8 Basic Histogram Properties
    2.4.9 Information Entropy
  2.5 State–of–the–Art
  2.6 Technology Comparison

3 Watermarking Audio Files with Copyrights
  3.1 Proposed Scheme
    3.1.1 Data Encryption and Embedding
    3.1.2 Data Extraction and Decryption
  3.2 Implementation
    3.2.1 Encryption and Embedding
    3.2.2 Extraction and Decryption
  3.3 Numerical Results
  3.4 Conclusions

4 Conclusions

References

List of Figures

2.1 Audio signals conversion [8].
2.2 Process of digital watermarking [13].
2.3 Process of digital audio watermarking [16].

3.1 The proposed watermarking scheme for audio files.
3.2 The resulting sequences from the 2D Tan Logistic Map.
3.3 Waveform plot of the left and right channels of the cover audio file.
3.4 Waveform plot of the left and right channels of the watermarked audio file.

List of Tables

2.1 Key requirements for each watermarking application.

3.1 MSE and PSNR values employing the proposed algorithm.
3.2 Intensity properties.
3.3 Time-domain properties.
3.4 Frequency-domain properties.
3.5 Basic histogram properties.
3.6 Information entropy values.
3.7 A comparison between the PSNR value obtained from the proposed scheme
    with the PSNR values obtained from other schemes in the literature.

List of Acronyms and Initialisms

2-D Two-Dimensional
ADC Analog-to-Digital Converter
BPS Bits Per Second
CC Creative Commons
CPTWG Copy Protection Technical Working Group
CF Crest Factor
DAC Digital-to-Analog Converter
DB Decibel
DCT Discrete Cosine Transform
DFT Discrete Fourier Transform
DFRST Discrete Fractional Sine Transform
DST Discrete Sine Transform
DWT Discrete Wavelet Transform
FFT Fast Fourier Transform
HAS Human Auditory System
IDFT Inverse Discrete Fourier Transform
IMF Intrinsic Mode Function
ISO International Organization for Standardization
KNN K-Nearest Neighbor
LSB Least Significant Bit
LWT Lifting Wavelet Transform
MPEG Moving Picture Experts Group
MSE Mean Squared Error
ODG Objective Difference Grade
PAPR Peak–to–Average Power Ratio
PSNR Peak Signal to Noise Ratio
P2P Peer-to-Peer
QIM Quantization Index Modulation
RMS Root Mean Square
SDMI Secure Digital Music Initiative
SIM Similarity
SNR Signal-to-Noise Ratio
SSIM Structural Similarity Index Measure
SD Standard Deviation
SV Singular Value
SVD Singular Value Decomposition
TC Temporal Centroid
ZC Zero Crossings

Chapter 1

Introduction

In this chapter, I provide a brief introduction to this thesis by explaining the
motivation behind it, the business problem that it attempts to solve, and my
contributions, as well as how the thesis is organized.

1.1 Motivation

The online world has given rise to innovative new ways to share art and create
digital works. The invention of online streaming has created a multi-billion dollar
industry that changed the way people listen to audio files. However, this industry
faces a major problem: online piracy. Online piracy is the illegal copying and
distribution of copyrighted content without the consent of the owners of the work.
Because the online world and audio streaming websites are easily accessible, they
have a massive worldwide user base, which makes protecting audio files from any
form of copyright infringement extremely difficult.

This problem remains on the rise, and very little research has attempted to
address it using digital watermarking. A digital watermark is any sort of marker
embedded in audio, video, text, or an image. Through audio watermarking, creators
of content would be able to prove ownership of their content and protect their
copyrights.

The motivation behind this thesis is to combat the copyright infringement of
audio files in the online world using digital audio watermarking. Through the use
of digital watermarking, a means for creators is provided to prove ownership of
their original works and prevent content theft and unauthorized use. Therefore, I
propose a digital audio watermarking scheme for copyright protection of audio files.

1.2 Contributions

In this thesis, I explain the business problem of copyright infringement and the
concept of digital watermarking in combating this problem. I also provide an ex-
tensive literature review of 30 previously implemented audio watermarking schemes
and compare them with one another. I propose a double-layer message security
scheme, with cryptography in the first layer and steganography in the second layer,
implemented in Wolfram Mathematica®. The proposed scheme’s performance is
evaluated and compared with other similar schemes previously implemented.

1.3 Organization

This thesis consists of four chapters. Chapter 1 provides an introduction to this
thesis, discussing the motivation behind it, its contributions, and how it is
organized. Chapter 2 provides an overview of copyright infringement, digital audio
files, and digital watermarking. Moreover, I extensively review the existing
literature on digital audio steganography and digital audio watermarking and
compare existing schemes and techniques with one another. Finally, I give a brief
overview of the technologies that can be used to implement my own audio
watermarking scheme. Chapter 3 explains the proposed scheme, how it is
implemented, and its performance evaluation results, as well as comparisons with
other schemes in the literature. Chapter 4 concludes this thesis.

Chapter 2

Background and Literature Review

In this chapter, I provide an overview of the business problem I am trying to
solve, which is the infringement of copyrights in the online world. The chapter
also includes a brief definition of audio files and an extensive discussion of audio
watermarking. This encompasses the history of watermarking, its applications, its
properties, and associated attacks. Some of the performance evaluation metrics
of watermarking schemes are discussed and explained. Moreover, I review the
existing literature and compare the performance of different watermarking schemes.
Finally, I give a brief overview of different technologies that can be used in the
implementation of an audio watermarking scheme for copyright protection.

2.1 Copyright Infringement

Copyright infringement occurs on a daily basis, and many people have probably
infringed on someone’s copyrights at some point in their lives, often without
realizing it. Copyright laws exist to protect original creative works, such as art,
literature, and music, that are fixed in some kind of tangible medium. The medium
can be a book, a canvas, an official publication, or even a digital medium. These
laws give creators exclusive rights to their works.

As a creator, one has the right to distribute, modify, perform, and display one’s
works as one prefers. However, if someone else takes a creator’s work or performs
any of these actions without permission, they are committing copyright
infringement. According to the US Copyright Office, “copyright infringement occurs
when a copyrighted work is reproduced, distributed, performed, publicly displayed,
or made into a derivative work without the permission of the copyright owner”.
Moreover, anyone who knowingly induces or helps someone else to carry out these
actions is also guilty of contributory infringement. Copyright laws provide severe
civil and criminal penalties for copyright infringement [1, 2, 3].

The exceptions to copyright infringement laws include fair use, public domain,
Creative Commons, and direct licensing. Fair use is when copyrighted works may be
used without permission or payment for a limited purpose, such as education,
commentary, criticism, or parody. For example, if you write an article that quotes
a paragraph from an author you are criticizing, this is considered fair use since
it is for the purpose of criticism. Copyrights last for the lifetime of their owner
and for 70 years after, so, when the term of a copyrighted work is over, the work
is placed in the public domain and may be used by anyone in any way. Creative
Commons (CC), an internationally active non-profit organization, provides creators
with licenses that allow them to communicate which rights they reserve for
themselves and which rights they grant to others who use their work. Finally, the
best way is to directly seek a license to use a copyrighted work before one
intends to use it, which results in a direct license if the creator agrees
[1, 2, 3].

One type of copyright infringement is online piracy, which continues to evolve
as technology changes. Online piracy is the unauthorized use or reproduction of
copyrighted digital content. Piracy platforms illegally provide digital content to
internet users, either for free or for a fee, after obtaining it without permission
from the owners of the works. Examples of such platforms include streaming
websites, peer-to-peer (P2P) file sharing, stream-ripping software, Napster and
mobile applications. A person may not even be aware that they are taking part in
online piracy by downloading unauthorized copies of copyrighted works or even by
streaming content from a website that is involved in online piracy [4, 5, 6].

The significance of this type of copyright infringement lies in the negative
impact it has on different industries. The music industry, for example, has lost
billions of dollars because of music piracy. While downloading one song may not
feel like a very serious crime, the cumulative impact of downloading and streaming
millions of songs can be devastating. When a digital work is pirated and
distributed illegally online, the creator of the work is not compensated. One
credible study by the Institute for Policy Innovation reports the annual harm of
music piracy to be 12.5 billion dollars in losses to the U.S. economy, as well as
more than 71,000 lost jobs and 2.7 billion dollars in lost wages to American
workers [5, 6, 7].

Nowadays, it is becoming increasingly difficult for people to protect their
original works and their authenticity. There will always be someone who tries to
steal someone else’s work for personal gain. Fortunately, with patents and
copyright protection techniques, one can protect one’s creative works. Protecting
audio, however, is more difficult than protecting other kinds of content, and very
little research has attempted to solve this issue. Audio watermarking has made it
possible to protect audio files from copyright infringement. Therefore, in this
thesis, I explain how digital watermarking can be used and implement an audio
watermarking scheme for this purpose.

In the following sections, I briefly define what an audio file is and give an overall
view of audio watermarking and some performance evaluation metrics of water-
marking schemes. I also review the existing literature for audio watermarking and
audio steganography. Finally, I suggest different technologies that can be used to
implement an audio watermarking scheme.

2.2 Audio Files

Before discussing audio watermarking, this section defines what an audio file is
and explains the basics of digital audio.

Figure 2.1: Audio signals conversion [8].

Audio signals are representations of sound and can take either analog or digital
form. Analog signals are carried as electrical signals, whereas digital signals
are carried as binary representations. Digital audio is a technology used to
record, store, manipulate, generate and reproduce sound using audio signals encoded
in digital form [8].

Figure 2.1 explains how an analog signal can be converted into a digital signal
for digital storage and how a digital signal can be transformed back to an analog
one for output.

A digital audio signal starts with an analog-to-digital converter (ADC) that
converts an analog signal to a digital signal. This is done by taking thousands of
samples per second from an analog audio source to ensure the replication of the
waveform. Each sample represents the intensity of the waveform at that instant.
The more samples taken, the better the representation, and the higher the quality
of the digital audio becomes. The samples are stored in binary (a numbering
scheme in which there are only two possible values for each digit: 0 and 1), like
any digital data. The samples have to be merged into a single data file in a
correct format, and the digital audio signal can then be transmitted or stored.
Digital storage can be on a CD, an MP3 player, a hard drive, a USB flash drive, or
any other digital data storage device. Audio compression techniques exist to reduce
the file size so that digital audio signals can be easily streamed to other
devices [8].
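
The sampling step described above can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: the analog source is modeled as
a pure sine tone, and the function name `sample_and_quantize` and its parameters
are hypothetical names chosen for this example rather than part of any ADC
specification.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, duration_ms, bit_depth):
    """Sample a sine tone and store each sample as a signed integer."""
    # Largest positive value a signed sample of this bit depth can hold,
    # e.g. 127 for 8 bits and 32767 for 16 bits.
    max_level = 2 ** (bit_depth - 1) - 1
    n_samples = sample_rate_hz * duration_ms // 1000
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                            # time of the n-th sample
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # "analog" value in [-1, 1]
        samples.append(round(amplitude * max_level))      # quantized intensity
    return samples

# The same 10 ms of a 440 Hz tone at two different qualities.
coarse = sample_and_quantize(440, 8000, 10, 8)     # 80 samples, 8 bits each
fine = sample_and_quantize(440, 44100, 10, 16)     # 441 samples, 16 bits each
print(len(coarse), len(fine))
```

A higher sampling rate yields more samples for the same stretch of sound, and a
higher bit depth lets each sample take more distinct intensity values, which is
why both improve the fidelity of the digital copy.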

For a digital audio signal to be output and listened to, it needs to be converted
back to an analog signal. This is done through a digital-to-analog converter
(DAC) that, like an ADC, runs at a specific sampling rate and bit resolution [8].

The main advantage of digital audio over analog is that computers can manipulate
numbers without introducing errors, by performing calculations on the long list
of numbers that makes up a recording. Computers can make perfect copies of this
list of numbers, and thus of the audio file. Not only that, but computers can also
combine different recordings by adding the numbers in one list to those in another
list, resulting in a third list of numbers which includes both sounds at the same
time. This is known as digital mixing. Moreover, computers can manipulate the
streams of numbers in digital audio to make it quieter or louder, add effects, play
recordings at different speeds, remove noise or echo, etc. [8].
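
As a rough illustration of these operations, the following Python sketch treats a
recording as a plain list of integer samples. The function names `mix` and
`apply_gain`, the tiny sample lists, and the 16-bit clipping range are assumptions
made for this example only.

```python
def mix(track_a, track_b):
    """Digital mixing: add two recordings sample by sample."""
    length = max(len(track_a), len(track_b))
    # Pad the shorter recording with silence (zeros) so both have equal length.
    padded_a = track_a + [0] * (length - len(track_a))
    padded_b = track_b + [0] * (length - len(track_b))
    return [a + b for a, b in zip(padded_a, padded_b)]

def apply_gain(track, gain, max_level=32767):
    """Make a recording louder (gain > 1) or quieter (gain < 1).

    Results are clipped to the assumed 16-bit sample range.
    """
    return [max(-max_level, min(max_level, round(s * gain))) for s in track]

voice = [100, 200, -300, 400]
music = [10, -20, 30]

perfect_copy = list(voice)           # an exact, error-free copy
mixed = mix(voice, music)            # [110, 180, -270, 400]
quieter = apply_gain(mixed, 0.5)     # [55, 90, -135, 200]
print(perfect_copy, mixed, quieter)
```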

In the following section, I provide a detailed overview of audio watermarking.

2.3 Audio Watermarking

Digital watermarking is the process of embedding a digital watermark into digital
media in order to protect this media. Multimedia that can be watermarked includes
text, images, video, and audio content.

This section provides an overview of digital watermarking: its history,
applications, possible attacks against it, and its properties.

2.3.1 Definition and History

Watermarking falls under the umbrella of information hiding. Information hiding
is the act of protecting information from change, in other words, hiding the
details of an object to protect it from change. The three main categories of
information hiding techniques are steganography, cryptography and watermarking [9].

Figure 2.2: Process of digital watermarking [13].

From the point of view of information hiding, watermarking can be considered
a subdivision of steganography [10]. Steganography is the practice of hiding a
secret message or an object within another object, while a watermark is a form of
text or image impressed onto another text or image to provide evidence of
authenticity [9]. Roughly speaking, if the hidden message or the object’s content
is related to the host, then it is considered a watermark [10]. The main
differences are that the amount of hidden information in watermarking is usually
minimal and that watermarking relies on some redundancy to make the watermark more
robust than in steganography [11]. Cryptography involves changing a message, by
using a secret key, into cipher text. Cipher text cannot be understood or decoded
by anyone except the intended receiver, who uses a secret key to do so [9].
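
The idea of turning a message into cipher text with a secret key can be shown with
a deliberately simple toy in Python. This is not a secure cipher and is not the
encryption used in the scheme proposed in this thesis: the function name
`xor_with_keystream`, the integer key, and the use of Python's general-purpose
`random` generator as a keystream are all assumptions made purely for illustration.

```python
import random

def xor_with_keystream(data: bytes, key: int) -> bytes:
    """XOR every byte with a key-dependent pseudo-random byte.

    Applying the function twice with the same key restores the input,
    so the same routine both encrypts and decrypts. Python's `random`
    module is NOT cryptographically secure; this is a toy illustration.
    """
    rng = random.Random(key)          # the secret key seeds the keystream
    return bytes(b ^ rng.randrange(256) for b in data)

message = b"Copyright 2022, all rights reserved"
cipher_text = xor_with_keystream(message, key=1234)
recovered = xor_with_keystream(cipher_text, key=1234)
print(cipher_text.hex())
print(recovered.decode())
```

Only a receiver who knows the key can regenerate the same keystream and undo the
transformation, which is the property the paragraph above describes.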

Digital watermarking extends the concept of watermarking into the digital
world, where a digital signal is embedded directly into digital content, becoming
an integral part of the data [9]. This embedded digital signal, the watermark, can
be described as an identifying code that may contain information about the host
object, its owner, its buyer, or other copyright information [11, 12]. Essentially,
the goal of a watermark is to always be present in the host data, hiding
proprietary information [9].

The general process of digital watermarking is shown in Fig. 2.2. It starts by
selecting the digital content and the watermark, and then the watermarking scheme
is used to embed the watermark into the digital data, resulting in watermarked
data. It may also include using a secret key. The same algorithm is applied in
reverse to extract the watermark when needed [12, 14, 15].

Figure 2.3: Process of digital audio watermarking [16].

How watermarking audio files works is demonstrated in Fig. 2.3. In the most
general form, audio watermarking hides a user-specified bit-stream (the water-
mark) in digital audio. The original audio and the bit-stream are both inputted
into the encoder with a secret key known only to the person creating the water-
mark, producing the watermarked digital audio. This key is used again with the
watermarked audio in order to extract and recover the watermark [16].
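
The encoder and decoder roles described above can be sketched with least
significant bit (LSB) substitution, a generic textbook technique that is used here
only for illustration and is not the scheme proposed in this thesis. The function
names `embed` and `extract`, the integer key, and the use of a seeded
pseudo-random generator to pick sample positions are all assumptions of this
example.

```python
import random

def _positions(key, n_samples, n_bits):
    """Derive the secret sample positions from the key."""
    rng = random.Random(key)
    return rng.sample(range(n_samples), n_bits)

def embed(samples, bits, key):
    """Encoder: hide the watermark bits in the LSBs of key-selected samples."""
    if len(bits) > len(samples):
        raise ValueError("watermark is longer than the cover audio")
    marked = list(samples)
    for pos, bit in zip(_positions(key, len(samples), len(bits)), bits):
        marked[pos] = (marked[pos] & ~1) | bit   # overwrite the lowest bit
    return marked

def extract(marked, n_bits, key):
    """Decoder: read the LSBs back from the same key-selected positions."""
    return [marked[pos] & 1 for pos in _positions(key, len(marked), n_bits)]

cover = [12, -7, 300, 45, -128, 9, 0, 77, -3, 64]   # original audio samples
watermark = [1, 0, 1, 1, 0]                         # user-specified bit-stream
watermarked = embed(cover, watermark, key=42)
print(extract(watermarked, len(watermark), key=42))  # same key recovers the bits
```

Each sample changes by at most one quantization step, which keeps the change
small, and a decoder without the key does not know which samples carry the
watermark bits.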

Watermarking is not a new phenomenon. In fact, the term is derived from
the history of traditional papermaking, where wet fiber is pressed to expel the
water, and the contrast between the watermarked and non-watermarked areas of
the paper forms a visible pattern [17]. Watermarking originated in the late Middle
Ages, dating all the way back to 1282. The earliest known usage of paper
watermarking was to identify the manufacturer of the paper, evolving later to
record details of the paper itself in addition to the manufacturer [18, 17].

The next application of watermarking was introduced in the eighteenth century
to combat attempts at cash forgery. Later, in 1954, Emil Hembrooke filed his
patent for watermarking musical works, which was the basis for the advancements
that followed in the technology of watermarking. Particularly after 1988, the term
“computerized watermarking” became known [18], although according to [11], the
term “digital watermarking” dates back to 1979.

The popularity of watermarking surged after 1990 around the world, and many
associations began considering implementing it. The Secure Digital Music
Initiative (SDMI) started a watermarking framework for music, the Copy Protection
Technical Working Group (CPTWG) started to consider watermarking for DVDs, and
the International Organization for Standardization (ISO) showed interest in
watermarking for MPEG [18, 17]. By 1998, digital watermarking had become a
well-established technology [11].

2.3.2 Applications

Digital watermarking can be used to achieve various objectives other than copy-
right protection including fingerprinting, content authentication, and broadcast
monitoring [19, 20]. In this section, I discuss the different applications of audio
watermarking.

Copyright protection is one of the primary and most important applications
of digital watermarking. The goal is to embed information, as a watermark, that
identifies the owner of the digital media in order to prevent others from claiming
it [19]. Hence, in cases of copyright disputes, ownership of digital media can be
established through the embedded watermark [14]. If the watermark includes
copyright regulations related to the fair use of content, then it also achieves
the objective of copy control. The objective behind copy control is to prevent
unauthorized parties from modifying, copying, or redistributing the content
without permission [12].

Fingerprinting is another method of copyright protection. The goal of
fingerprinting is to convey information about the recipient of the digital content
rather than the originator, in order to identify each distributed copy of a
digital work. The need for fingerprinting arose when people would obtain a copy of
a product legally and then distribute it illegally [14, 19]. It works by embedding
a different watermark into each distributed copy of the digital media such that
owners of the content can trace any illegal duplication or distribution of their
original work back to each receiver [12, 14].

In addition to copyright protection, watermarking can also be used for digital
content authentication, also known as tamper detection. This application aims to
detect any changes to the content as a sign of invalid authentication [12, 19]. If
the watermark is deteriorated or destroyed, this means that the digital content has
been tampered with and therefore cannot be trusted [14]. For this application, a
fragile or semi-fragile watermark is used rather than a robust one [19, 20].

In the realm of broadcasting, watermarks are used to enable automated monitoring
of content broadcast on different networks. This is referred to as broadcast
monitoring, where an identification code is embedded as a watermark in the work
being broadcast and a computer-based monitoring system detects this watermark,
verifying whether or not the content was broadcast. It is especially useful in
the entertainment industries to check whether content is broadcast according to
contracts with broadcasters [12, 14, 19].

2.3.3 Attacks

Even though watermarking can achieve the objectives of copyright protection and
content authentication, attackers may still attempt to manipulate the protected
media as well as the watermark. In this section, I define what an attack is in the
realm of audio watermarking and overview its different types and categories.

An attack against a watermarking system can be defined as any processing
which finds a way around the intended purpose of the watermarking technique
for a specific application [21]. According to this definition, audio watermarking
attacks include normal processing operations, intentional or unintentional, that
destroy the watermark or degrade its quality beyond an acceptable threshold,
making it unreadable or undetectable [22, 14]. Therefore, a watermarking attack is
successful if it defeats the watermarking technique while preserving the quality
of the attacked data [21].

Watermarking would be rendered useless during detection if the watermarks
cannot be detected, false watermarks are detected, or unauthorized detection of
watermarks takes place. From this we can categorize different types of attacks
that depend on the knowledge of the attacker, the tools they have at their
disposal, and the availability of watermarked versions of the same or different
works [21].

Attacks which produce “no detection” of watermarks can be divided into removal
attacks, which erase the watermark from the watermarked data, and
desynchronization attacks, which misalign the watermark detector and the watermark
without removing the watermark information. Removal attacks can result from normal
signal processing operations, such as noise addition, resampling, filtering, echo
addition and data compression, for which attackers need no special knowledge of
the underlying algorithms or of signal processing [10]. Moreover, in specifically
designed attacks, the attacker knows the watermark embedding mechanism and can
therefore design algorithms that find and exploit its weaknesses. Collusion
attacks are used when the attacker does not know the embedding mechanism but has
access to more than one watermarked work carrying the same watermark, provided
that the added watermark signal is not a function of the original work. On the
other hand, if the attacker has no knowledge of the embedding algorithm and only
one watermarked work, but has access to a watermark detector, they can apply
oracle attacks [21].

Desynchronization attacks, instead of removing the watermark altogether, misalign
the embedded watermark and the corresponding watermark detector in a way that
makes it computationally infeasible to perform synchronization before detection.
As a consequence, the detector will not be able to detect the watermark. These
attacks consist of global and local transformations, as well as scrambling attacks.
Examples of these types of attacks are Random Samples Cropping and Zeros
Inserting, Jittering, Pitch-invariant Time Stretching, and Tempo-preserved Pitch
Shifting [10, 21].
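
The difference between a removal attack and a desynchronization attack can be
illustrated on a deliberately fragile toy watermark. The Python sketch below
assumes a naive sequential LSB watermark (not the scheme proposed in this thesis);
the function names and the randomly generated cover samples are hypothetical and
serve only to show the two effects.

```python
import random

def embed_lsb(samples, bits):
    """Toy embedder: write bit i into the lowest bit of sample i."""
    marked = list(samples)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit
    return marked

def extract_lsb(samples, n_bits):
    """Toy detector: read the lowest bit of the first n_bits samples."""
    return [s & 1 for s in samples[:n_bits]]

def bit_errors(expected, found):
    return sum(e != f for e, f in zip(expected, found))

rng = random.Random(0)
cover = [rng.randint(-1000, 1000) for _ in range(64)]
watermark = [rng.randint(0, 1) for _ in range(64)]
marked = embed_lsb(cover, watermark)

# Removal by noise addition: every sample hit by +1 or -1 has its LSB flipped.
noise = [rng.choice([-1, 0, 1]) for _ in marked]
noisy = [s + n for s, n in zip(marked, noise)]
noisy_bits = extract_lsb(noisy, 64)

# Desynchronization by zero insertion: the bits are still present, but the
# detector now reads every one of them from the wrong position.
shifted = [0] + marked
shifted_bits = extract_lsb(shifted, 64)

print("errors after noise addition:", bit_errors(watermark, noisy_bits))
print("errors after zero insertion:", bit_errors(watermark, shifted_bits))
```

In the noise case the watermark information itself is damaged, whereas in the
zero-insertion case the information survives intact one position later and is
missed only because the detector has lost alignment.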

Attacks which produce “false detection” of watermarks are embedding or ambiguity
attacks, which simulate an embedded watermark. These include the copy attack,
which aims to copy a watermark from one work and embed it in another work that
should not contain this watermark; overmarking, where a second watermark is
embedded in an already watermarked signal; and the deadlock attack [23, 21].

The opposite of unauthorized embedding is the “unauthorized detection” attack,
which simulates the detection of watermarks even if they were not inserted before
[21].

2.3.4 Properties

This section addresses the various requirements of a watermarking system, in light
of the applications and attacks discussed above.

There are multiple properties to look out for in a watermarking scheme. However,
the basic and most important requirements of any watermarking scheme are
imperceptibility, robustness, security, data payload, and computational complexity
[10, 11, 14, 19]. There are trade-offs between the properties since many of them
conflict with one another [10]. The relative importance of each property depends
on the requirements of the system application [19]. Table 2.1 summarizes the key
requirements of each watermarking application.

Table 2.1: Key requirements for each watermarking application.

Watermarking application    Prioritized parameter
Copyright protection        High security and imperceptibility
Content description         High data payload
Content authentication      Low robustness
Real-time watermarking      Low computational complexity

The first property is imperceptibility, also known as fidelity. It refers to the
perceptual similarity between the original and the watermarked versions of the
cover work [19]. The watermark should be transparent, inaudible or imperceptible,
which means it cannot be detected except by an authorized person [10, 14]. This
property is considered the most important requirement in a watermarking system for
content or author validation and/or to achieve copy control [14]. However, in some
watermarking applications, imperceptibility is compromised so that higher
robustness and/or lower computational cost can be achieved, since these properties
conflict with one another. The more imperceptible a watermark is, the less robust
it becomes, and the higher the computational cost [24].

Robustness refers to the ability of the watermark to be detected after normal
processing operations or different kinds of attacks [10, 11, 14, 19]. A robust
watermark should be difficult or impossible to modify or remove by simple
processing techniques [14]. The need for a robust watermark depends on the nature
of the application of a watermarking system [19]. In some instances, a fragile
watermark is favored over a robust one [11].

Closely related to robustness is security. Security refers to the ability of the
watermark to resist hostile attacks that attempt to hinder the watermark’s
purpose [19]. Attacks can be unauthorized removal, unauthorized embedding, and
unauthorized detection [10, 19]. A secure watermark cannot be removed without
full knowledge of the watermark and its embedding system, which makes security
a key requirement for watermarking schemes used for copyright protection [14].

An important distinction between robustness and security is intention. Robustness
refers to the ability of a watermark to withstand unintentional attacks in the
form of normal processing operations such as filtering, scaling, cropping, and
compression. On the other hand, security refers to the ability of a watermark to
withstand intentional attacks that actively try to remove, alter, or detect it. A
watermark that is both robust and secure should remain detectable after any type
of attack [11].

The property of data payload refers to the number of bits that a watermark
embeds per unit of time. In other words, it describes how much data to embed
as a watermark for effective detection of the watermark [10, 14]. There exists a
trade-off between data payload and robustness, where higher payloads result in
lower robustness and vice versa [10]. Different applications require different
data payloads; copy control applications, for example, may need only a few bits
embedded in cover works [19].

The last property that I discuss in this section is the computational
complexity, or cost, of a watermarking scheme. It is described as the effort and
time needed for embedding and detecting the watermark [11, 19]. There is a direct
relation between this complexity and the desired level of security. In other
words, more computational complexity is needed for strong security of a
watermark. However, in some cases where speed is favored over security, such as
in real-time applications, lower computational complexity is required [14].

As per the above discussion, there will always be some properties that are
prioritized and others that are compromised, depending on the desired application
of the watermarking scheme. Any watermarking scheme is designed with consideration
of these requirements, which are optimized to achieve the goal of the scheme [14].

2.4 Performance Evaluation Metrics

Performance evaluation metrics are used to measure the performance, speed and
effectiveness of a watermarking scheme. Moreover, statistical tests are carried out
to make sure a watermarking scheme is robust against attacks. In this section, I
highlight some of the metrics and tests that I will be using to evaluate the perfor-
mance and security of my proposed watermarking scheme.

2.4.1 Hearing Test

The Human Auditory System (HAS) can be used for evaluating differences between
a cover audio and a watermarked audio owing to its sensitivity. Therefore,
the simplest test to recognize if any alterations occurred to an audio file after
watermarking is attentively listening to both files. A watermarking scheme is
effective if there are no audible differences between the two versions of an
audio file.

2.4.2 Waveform Plots

Waveform plots for a cover audio file and a watermarked audio file can be
examined with the human eye to determine if there are any differences. If the
plots appear identical on inspection, this indicates that the watermarking scheme
introduces no visible distortion.

2.4.3 Mean Squared Error

The Mean Squared Error (MSE) is calculated through comparing the samples of
a cover audio and a watermarked audio as follows

\[
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (c_i - s_i)^2 \tag{2.1}
\]

where $N$ is the number of samples, $c_i$ is the sample value of the cover audio
and $s_i$ is the sample value of the watermarked audio. If the original audio file
and the watermarked audio file are totally identical, the MSE would be equal to
zero.
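
As a minimal sketch, Eq. (2.1) can be computed as shown below. The sketch is written in Python rather than the Mathematica used for the implementation in Section 3.2, and the function name and the sample values are hypothetical.

```python
def mean_squared_error(cover, stego):
    """Return the MSE between two equal-length sample sequences (Eq. 2.1)."""
    if len(cover) != len(stego):
        raise ValueError("cover and watermarked audio must have the same length")
    if not cover:
        raise ValueError("at least one sample is required")
    return sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)


# Hypothetical 16-bit sample values; the second list differs by 1 in two samples.
cover = [1000, -2000, 3000, -4000]
stego = [1001, -2000, 3000, -4001]
print(mean_squared_error(cover, stego))  # 0.5
```

Two samples changed by one unit out of four samples give (1 + 1) / 4 = 0.5, and comparing a signal with itself gives zero, as stated above.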

2.4.4 Peak Signal–to–Noise Ratio

The Peak Signal-to-Noise Ratio (PSNR) is calculated in decibels (dB) by squaring
the maximum sample value of a cover audio and dividing it by the MSE as follows

\[
\mathrm{PSNR} = 10 \times \log_{10}\left(\frac{I_{\max}^{2}}{\mathrm{MSE}}\right) \tag{2.2}
\]

where $I_{\max}$ is the maximum value of the samples in the cover audio.
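
The following Python sketch illustrates Eq. (2.2) under the definition given above, taking the peak as the maximum sample value of the cover audio. The function name and the sample values are hypothetical, and returning infinity for identical signals is an assumption, since the equation is undefined when the MSE is zero.

```python
import math


def psnr_db(cover, stego):
    """Return the PSNR in dB (Eq. 2.2); infinite when the signals are identical."""
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf  # assumption: identical signals are reported as infinite PSNR
    peak = max(cover)    # I_max: maximum sample value of the cover audio
    return 10 * math.log10(peak ** 2 / mse)


cover = [1000, -2000, 3000, -4000]   # hypothetical samples
stego = [1001, -2000, 3000, -4001]
print(round(psnr_db(cover, stego), 2))  # 72.55
```

Here the peak is 3000 and the MSE is 0.5, so the ratio is 1.8 × 10^7 and the PSNR is about 72.55 dB.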

2.4.5 Intensity Properties

Intensity properties of an audio file include power, root mean square (RMS) of
values, and loudness. Power is given in terms of the mean of the squared values
and the loudness is computed with Stevens’ power law.

2.4.6 Time–Domain Properties

Time-domain properties of an audio file include the crest factor (CF), the peak to
average power ratio (PAPR), the temporal centroid (TC) values, and the number
of zero crossings (ZC). The CF is calculated as the maximum divided by the RMS
and the PAPR is calculated as the maximum power divided by the average power.
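
A short Python sketch of these quantities is given below. It assumes that "maximum" means the maximum absolute sample value and that a zero crossing is a sign change between consecutive samples; the function name is hypothetical, and the temporal centroid is left out.

```python
import math


def time_domain_properties(samples):
    """Return power, RMS, crest factor, PAPR and zero-crossing count."""
    n = len(samples)
    power = sum(x * x for x in samples) / n       # mean of the squared values
    rms = math.sqrt(power)
    peak = max(abs(x) for x in samples)           # assumption: peak is the max |x|
    crest_factor = peak / rms                     # maximum divided by the RMS
    papr = peak ** 2 / power                      # maximum power over average power
    zero_crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return {
        "power": power,
        "rms": rms,
        "cf": crest_factor,
        "papr": papr,
        "zc": zero_crossings,
    }


props = time_domain_properties([3, -4, 3, -4])    # hypothetical samples
print(props["power"], props["papr"], props["zc"])
```

Under these definitions the PAPR is the square of the crest factor, which gives a quick consistency check when comparing a cover file with its watermarked version.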

2.4.7 Frequency–Domain Properties

Frequency-domain properties of an audio file include the spectral centroid, the
spectral crest, the spectral flatness, the spectral kurtosis, the spectral
roll-off, the spectral skewness, and the spectral spread. The spectral crest is
calculated as the maximum of the power spectrum divided by the mean of the power
spectrum. The spectral flatness is calculated as the geometric mean of the power
spectrum divided by its arithmetic mean. The spectral kurtosis is computed as the
kurtosis of the magnitude spectrum [25]. The spectral roll-off is the frequency
below which most of the energy is concentrated. The spectral spread is a measure
of the bandwidth of the power spectrum.
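
To make the crest and flatness definitions concrete, the Python sketch below computes both from a power spectrum. The naive DFT, the function names, and the small constant `eps` that guards the logarithm are assumptions for illustration; a real evaluation would use an FFT routine.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive O(N^2) DFT power spectrum over the non-negative frequency bins."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)
        )
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_crest(ps):
    """Maximum of the power spectrum divided by its mean."""
    return max(ps) / (sum(ps) / len(ps))


def spectral_flatness(ps, eps=1e-12):
    """Geometric mean of the power spectrum divided by its arithmetic mean."""
    log_mean = sum(math.log(p + eps) for p in ps) / len(ps)
    return math.exp(log_mean) / (sum(ps) / len(ps) + eps)


# A single impulse has a flat spectrum; a pure tone has a peaked one.
impulse = [1.0] + [0.0] * 15
tone = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
print(spectral_flatness(power_spectrum(impulse)))
print(spectral_flatness(power_spectrum(tone)))
```

The impulse yields a flatness close to 1 and the tone a flatness close to 0, which matches the interpretation of flatness as a measure of how noise-like a spectrum is.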

2.4.8 Basic Histogram Properties

The basic histogram properties of an audio file include the maximum value, the
minimum value, and statistical measures. Statistical measures include the mean,
median and the standard deviation (SD) of the values.

2.4.9 Information Entropy

Information entropy is a measure of the amount of information in a file [26].
Therefore, it is often the case that a cover audio file will produce a slightly
lower entropy than a watermarked version of it.
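
As an illustration, the Python sketch below computes the Shannon entropy of a sequence of sample values in bits per symbol. Treating each distinct sample value as a symbol is an assumption; the thesis does not state the alphabet used.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Return the Shannon entropy, in bits per symbol, of a sequence of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


print(shannon_entropy([0, 0, 1, 1]))  # 1.0
print(shannon_entropy([0, 1, 2, 3]))  # 2.0
```

Two equally likely symbols carry one bit each and four equally likely symbols carry two bits each, so a watermark that makes the sample distribution slightly more uniform raises the entropy slightly.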

2.5 State–of–the–Art

This section provides an overview of the latest advancements in the fields of
digital audio watermarking and steganography. The existing literature is reviewed,
and the schemes proposed by different authors are compared. Digital steganography
and audio steganography are included in this discussion. Papers covering digital
watermarking and audio watermarking are also reviewed.

In [27], a methodology for digital watermarking that can be generalized to
images, audio, video and multimedia data is proposed. In this algorithm, a
watermark is constructed as an independent and identically distributed Gaussian
random vector that is imperceptibly inserted in a spread-spectrum-like fashion
into the perceptually most significant components of the data spectrum. The
length of the watermark can be adjusted depending on the data characteristics.

The authors of [28] propose an audio watermarking scheme based on the Fast
Fourier Transform (FFT). The original audio is segmented into frames and
watermarks are embedded into the selected prominent peaks of the magnitude
spectrum of each frame. The scheme is tested against various kinds of attacks and
shows robustness similar to that in [27]. However, this scheme achieves
Signal-to-Noise Ratio (SNR) values ranging from 20 dB to 28 dB, which are superior
to those of the scheme in [27], which range from 14 dB to 28 dB.

Another scheme based on FFT is proposed in [29], which embeds watermarking
information into the phase coefficients of the audio signal after the FFT. The
SNR after the watermark embedding is 43.5 dB, higher than that of the scheme in
[28], which indicates that this algorithm is inaudible and robust against noise.
The experimental results further show that the algorithm is stronger against
Gaussian noise and the low-pass filter attack, while it performs poorly against
the re-sampling attack.

In [30], the authors designed a new watermarking system using the Discrete
Fourier Transform (DFT), where the audio file is segmented into non-overlapping
frames and watermarks are then embedded into the highest peak in the magnitude
spectrum of each frame. For this scheme, the Similarity (SIM) values range from 13
to 20 and the SNR values range from 20 dB to 28 dB for different watermarked
sounds, which is much lower than the SNR reported in [29].

The authors of [31] proposed a watermarking method in which the audio file is
transformed into the Discrete Cosine Transform (DCT) domain. The absolute values
of the DCT coefficients are divided into an arbitrary number of segments and each
segment’s energy is calculated. Then, watermarks are embedded into the selected
peaks of the highest energy segment. Simulation results show that this method is
highly robust against different kinds of attacks and achieves SIM values ranging
from 13 to 32 and SNR values ranging from 13 dB to 24 dB for different
watermarked sounds, which are close to the results in [30].

In [32], the authors present an audio watermarking algorithm based on altering
the DC component of the frame of an audio signal. The DFT for each frame is
calculated and the first element of the vector represents the DC component of
the frame. Then, the mean and power content of each frame are calculated. The
DC component of each frame is modified to represent a watermark bit. Finally, the
Inverse Discrete Fourier Transform (IDFT) of the frame vector gives the modified
frame. The SNR values range from 20 dB to 54.21 dB for instrumental audio files
against different attacks. While the scheme proves to be robust, robustness can be
increased only up to a certain level by selecting longer audio files or by
inserting the watermark signal multiple times in an audio file. Furthermore, upon
interpreting the results, it has been found that the SNR values depend on the
type of audio with loud pitch.

A new audio watermarking technique using S-box transformation is presented
in [33]. In it, the audio file is transformed into a matrix and each pixel value
is converted into an 8-bit binary value. The S-box transformation is applied to
the Least Significant Bits (LSBs) of each pixel and then this modified matrix is
transformed back into an audio file. The scheme yields an MSE of 11.6538, a PSNR
of 86.2689, and a Structural Similarity Index Measure (SSIM) of 0.9124. This
algorithm makes it extremely difficult for third parties to identify the
existence of the watermarks since it encrypts the bits in such a way that there
is no visible pattern.

An audio watermarking algorithm based on LSB is proposed in [34], which
introduces the idea of embedding synchronization signals that are used to search
for the watermark location. First, a Logistic chaotic map is utilized to generate
embedding bits for the security signal and the synchronization signals. Then, the
audio signal is converted to segments and the Discrete Wavelet Transform (DWT) is
performed, where the hiding bit is embedded in the frequency point of the DWT by
replacing the LSB. Results show that when the location of the synchronization
code or the watermark increases, the watermarked audio becomes easier to
distinguish from the original audio, with most SNR values between 24 and 29 dB.

A novel steganography method is introduced in [35], which is based on LSB
manipulation and the inclusion of redundant noise as a secret key in the message.
In this method, the high frequency DCT coefficients of the cover audio file are
replaced with the low frequency DCT coefficients of the watermark audio file. This
method exhibited a very high watermark channel capacity; however, the main
disadvantages are the extremely low robustness of the method and the unlikelihood
that the embedded watermark would survive digital to analogue or analogue to
digital conversions.

In [36], the authors present a new high bit rate LSB audio watermarking
method. The algorithm is a two-step one that embeds watermark bits into higher
LSB layers, resulting in increased robustness. The idea behind this scheme is
watermark embedding that causes minimal embedding distortion of the host audio.
Results of both objective and subjective tests demonstrate that this algorithm
outperforms the standard LSB insertion algorithms, as presented in [35] and [34],
with higher SNR values and higher perceptual quality of the watermarked audio.

A multi-layer steganography scheme is presented in [37] which applies both
cryptography and audio steganography. First, a secret message is encrypted using
the Blowfish algorithm, and then LSB coding is used for steganography, which
is done by alternating between the right and left channels of an audio file. The
proposed algorithm, applied to classical music, resulted in an MSE value of
0.49969 and a PSNR value of 89.0105, which show that it has very good performance.

Another multi-layer steganography scheme is proposed in [38]. The secret message
is encrypted with AES-128 and is then embedded in the LSBs of the audio file
using a sequence generated by a Tan Logistic Map. Applied to classical music, this
scheme has a better MSE (0.06822) and PSNR (101.971) than the scheme in [37],
indicating that it is an effective audio steganographic technique.

Similar to the previous schemes, the one in [39] starts with AES-128 encryption
of the secret data. The steganographic layer is carried out by LSB embedding
with an algorithm that utilizes an audio file as a cover, where the audio data is
represented as samples with an immense range; the range itself is manipulated and
the secret data is then added to it. This algorithm makes it basically impossible
to extract the secret data without manipulating the range itself. The obtained
PSNR values of the tested music files range from 74.1758 dB to 74.2525 dB, which
is reasonably high.

In [26], the authors proposed a comparative double-layer steganography scheme
where AES-256, Blowfish, or a Logistic map is used for cryptography and the LSB
substitution technique is utilized for the steganography layer to embed a secret
message inside a cover audio. The LSB substitution is done by alternating
between the left and the right channels of the audio cover. This algorithm,
applied to classical music, yields MSE values of 0.453775, 0.453479, and
0.45462 and PSNR values of 91.0097, 91.0126, and 91.0016 using AES-256, Blowfish,
and a Logistic map, respectively. This algorithm outperforms [39] and [37] in
terms of PSNR.

A new adaptive audio watermarking algorithm based on Empirical Mode
Decomposition (EMD) is proposed in [40]. The audio signal is divided into frames
where each frame is decomposed adaptively, by EMD, into intrinsic oscillatory
components called Intrinsic Mode Functions (IMFs). The watermark and the
synchronization codes are then embedded into the extrema of the last IMF of the
audio signal, a very low frequency mode. Simulations are performed on different
audio signals sampled at 44.1 kHz, resulting in SNR values above 20 dB and
Objective Difference Grade (ODG) values between -1 and 0. These results
demonstrate the good quality of the watermarked signals.

The authors in [41] propose an audio signal watermarking algorithm based on
the DWT where the spectrum of the host audio signal is decomposed to locate the
most appropriate regions for embedding watermark bits. The algorithm results in
an SNR value of 28.5525 for instrumental music and 25.0314 for pop music,
demonstrating its inaudibility and robustness.

In [42], the Singular Value Decomposition (SVD) mathematical technique is
utilized for audio watermarking in both the time domain and an appropriate
transform domain. The audio signal is transformed into a 2-D format and the SVD
algorithm is applied on this 2-D matrix of singular values (SVs) with a small
weight. The proposed algorithm can be implemented on the audio signal as a whole
or on a segment-by-segment basis, of which the segment-by-segment implementation
achieves a higher detectability. Results favor the time domain for watermark
embedding because it is associated with the lowest distortion level.

The authors in [43] present a blind adaptive audio watermarking algorithm
based on SVD in the DWT domain using a synchronization code. A watermark is
embedded through the application of a quantization-index-modulation process on
the singular values in the SVD of the wavelet domain blocks. The SNR values of
the selected audio files on which the algorithm was applied range from 22.11 dB to
26.84 dB, with a payload value of 45.9 bps. The scheme exhibits better performance
against MP3 compression compared to other earlier audio watermarking schemes.

In [44], the authors propose a blind audio watermarking scheme combining
SVD and DCT with the synchronization code technique. A binary watermark
is embedded blindly into the high-frequency band of the SVD-DCT block, and a
chaotic sequence is utilized as the synchronization code and inserted into the
host signal. This scheme results in a payload value of 43 bps and outperforms the
schemes proposed in [43] and [42] in terms of SNR with a value of 32.53 dB.

Another blind audio watermarking scheme is proposed in [45] which combines
SVD and DWT. A rearranged audio signal is first partitioned into blocks on which
SVD is performed, generating a vector by selecting the biggest singular values.
Finally, the watermark is embedded into the approximate components obtained
from the DWT decomposition of the vector through a quantization process. This
scheme yields an average SNR of 25 and a much higher payload (86 bps) than the
schemes proposed in both [43] and [44].

A different watermarking algorithm that is also based on DWT and SVD techniques
is proposed in [46]. The authors proposed new signal framing, DWT matrix
formation and embedding methods and implemented them successfully to enhance the
quality of the watermarked audio. The SNR values for different audio files, under
different attacks, range between 38.9659 dB and 47.3899 dB, which is higher than
the results yielded by the scheme in [45].

The proposed algorithm in [47] is constructed by selecting a robust Barker code
as the synchronization code, embedding it into the mean value of audio samples and
embedding the watermark into DWT and DCT coefficients by adaptive quantization.
The results illustrated the robust and inaudible nature of the watermarking
scheme. However, the scheme lacks robustness against pitch-invariant time scale
modification.

In [48], an audio watermarking algorithm using two different functions in the DWT
domain based on quantization of mean values of low frequency coefficients is
presented. A binary image is used as a watermark and is encrypted with chaotic
encryption using a secret key. The encrypted watermark is embedded in the low
frequency components using two wavelet functions with adaptation to the frame
size. SNR results range from 46.96 dB to 65 dB and PSNR results range from
19.49 dB to 26.02 dB for different clips, maintaining high audio quality. Moreover,
simulation results show that this approach ensures robustness against common and
malicious attacks.

An innovative watermarking scheme based on double insertion of the watermark
in the DWT-DST domain of the host signal is proposed in [49]. A gray scale
logo image is used as a watermark. The method utilizes the DWT variety of
time-frequency decomposition for audio signals, and modifies its Discrete Sine
Transform (DST) coefficients with the watermark image before reconstituting the
signal. SNR results for 8 different watermarked audio signals range from 25.15
dB to 31.07 dB. Moreover, it is a secure scheme due to the usage of secret keys
generated during the watermark insertion process. This proposed scheme showed
robustness and imperceptibility; however, its main drawback is that the
original signal is required for extracting the watermark, unlike in [46].

In [50], the proposed algorithm utilizes DWT, SVD and a secret sharing method
to watermark a given audio. The algorithm was tested on different audio files with
different sampling rates and a scaling factor value of 0.04. Different attacks were
simulated and the accuracy of the algorithm was measured using the accuracy rate.
Results show that this technique is robust against different attacks and that
increasing the scaling factor improves accuracy, however, at the cost of
audibility of the watermark.

In [46], a wavelet-based watermarking scheme is proposed. A single bit is
embedded in two subsets of indexes that are randomly generated using two keys. The
algorithm consists of two parts: the embedding part and the detection part. The
embedding process is done according to the mean of each approximation part and
the detection part does not depend on the original audio. Experimental tests have
been conducted to illustrate the robustness and efficiency of the scheme. The MSE
for a watermark of length N = 20 is 0.0023, and for N = 30 it is 0.0058,
which is better than the results of the schemes in [37] and [38].

A novel robust audio watermarking scheme based on the Lifting Wavelet Transform
(LWT) and SVD is proposed in [51]. The watermark data is inserted into the LWT
coefficients of the low frequency sub-band, taking advantage of SVD, Quantization
Index Modulation (QIM) and the synchronization code technique. The utilization
of QIM makes this scheme blind in nature. Experimental results show that this
scheme is inaudible and robust to general signal processing and desynchronization
attacks, with SNR values all above 20 dB. Moreover, the scheme outperforms the
scheme proposed in [27].

In [52], a blind digital audio watermarking scheme based on the Discrete
Fractional Sine Transform (DFRST) is proposed. The characteristics of the DFRST
render construction of a secure watermark possible; hence, the proposed scheme
extensively adopts chaotic sequences to enhance security. The scheme proved
inaudible and robust; however, its drawback is that it is not robust against
random cropping and time scale modification.

An intelligent audio watermarking scheme based on both DWT and K-Nearest
Neighbor (KNN) techniques is proposed in [53]. The scheme modifies the energy
level in the wavelet domain and embeds the watermark based on that. Furthermore,
an intelligent KNN learning machine is trained to capture the correlation
between the modified frequency coefficients and the watermark sequence. The
scheme inserts a chaotic sequence to preserve data synchronization. The SNR
results average 41.17 dB, exhibiting superiority to the scheme in [52].

The authors in [54] propose a novel robust, transparent and high-capacity blind
audio watermarking scheme. Watermark embedding is performed through modulating
the vectors in the DCT domain subject to an auditory masking constraint,
implemented on a frame basis. The resulting payload capacity is as high as 848
bps, which is much higher than in [43], [44] and [45]. However, the SNR is as low
as 17.51 dB, which is a poor result.

2.6 Technology Comparison

In this section, I briefly mention five different programming languages that can
be utilized for implementing an audio watermarking scheme. These include MAT-
LAB, Python, Java, Wolfram Mathematica, and Maple.

MATLAB is a high-level programming language and numeric computing platform
utilized in analyzing data, developing algorithms and creating models. The
name MATLAB stands for matrix laboratory. It was written to provide access to
matrix software developed by the LINPACK and EISPACK projects. It can be
used in every facet of computational mathematics. MATLAB requires an expensive
license to download and utilize.

Python is an interpreted high-level general-purpose programming language
with an object-oriented approach aiming to help programmers write code for small
and large-scale projects. Python’s syntax is simple and easy to learn, which
emphasizes readability and reduces the cost of program maintenance. The Python
interpreter and the extensive standard library are available without charge for all
major platforms and can be freely distributed.

Java is a general-purpose, class-based, object-oriented programming language
and computing platform released in 1995 by Sun Microsystems. In comparison
with other programming languages, it is fairly easy to learn. Moreover, it provides
a reliable platform to build services, products and applications. It is free to down-
load for personal use as well as for development.

Wolfram Mathematica is a software system with built-in libraries for technical
computing which allows machine learning, statistics, symbolic computation,
manipulating matrices, plotting various types of data, implementing algorithms,
creating user interfaces, and interfacing with programs written in other
programming languages. It was released in 1988 by Wolfram Research and requires an
expensive license to download and utilize.

Maple is a mathematics software package released in 1982. It combines a powerful
math engine with a user-friendly interface that permits easy analysis,
exploration, visualization and solving of mathematical problems. It also requires
an expensive license to download and utilize, although free trials exist.

Python is a popular and simple language designed to be readable and general
purpose. It is the easiest and simplest to learn; however, it is much slower than
the other programming languages. Because it is a general-purpose language rather
than one specific to mathematics, it is weak in the sense that symbolic
computation is not one of its native features. Compared to Java, MATLAB has
much more support for high-level mathematics operations, but it runs much slower
than Java. Mathematica is better at handling numerical work than all the other
mentioned languages. Symbolic manipulation is easier in Mathematica, and its
user interface is simpler and easier to build than in MATLAB. Also, Mathematica
is better for handling calculus and differential equations. On the other hand,
MATLAB is more data-oriented and better in design functions than Mathematica.
Furthermore, Mathematica's language is universal and close to natural language,
and it can be used for any programming structure, whereas Maple is a software
tool utilized to perform mathematical calculations only.

Wolfram Research, in partnership with the Egyptian government, offers all
Egyptian citizens free educational access to Mathematica software under the
Egyptian Knowledge Bank program. Therefore, owing to the accessibility of the
software, its natural language and all of the above-mentioned benefits over the
other programming languages, I chose to utilize Wolfram Mathematica to implement
my own audio watermarking scheme.

Chapter 3

Watermarking Audio Files with Copyrights

In this chapter, I discuss my approach to solve the business problem of this the-
sis: watermarking audio files with copyrights for copyright protection. Countless
websites have been illegally uploading audio files disregarding ethics, morals and
copyright laws. The solution proposed in this thesis is digitally watermarking
audio files with copyrights so that owners of such files can be protected and ap-
propriate measures taken against anyone who illegally abuses the owners’ rights.
Furthermore, the performance of the scheme is evaluated and compared with other
schemes from the literature.

3.1 Proposed Scheme

In this section, I propose a digital watermarking scheme for copyright protection
of audio files.

The proposed scheme is a double-layer message security scheme with cryptography
in the first layer and steganography in the second layer. The scheme starts
with a secret descriptive text, a cover audio file and a 2D Tan Logistic Map.
First of all, the audio cover is split into left and right channels. In the
cryptography layer, the secret text is encrypted using a chaotic function, a
Varied Tan Logistic Map, at the sender side and decrypted in the same way at the
receiver side. The steganography layer involves generating two sequences from the
2D Tan Logistic Map, which are used to embed the secret message, using LSB
substitution, in the cover audio’s left and right channels. The scheme is
explained in further detail in the following sections.

The proposed scheme is summarized in Fig. 3.1 and Fig. 3.2.

Figure 3.1: The proposed watermarking scheme for audio files.

Figure 3.2: The resulting sequences from the 2D Tan Logistic Map.

3.1.1 Data Encryption and Embedding

First of all, the cover audio is converted into signed 16-bit integers with a
range between $-2^{15}$ and $2^{15} - 1$. Because this range includes negative
values, a value of $2^{15}$ is added to shift the range to 0 to
$(2^{15} - 1) + 2^{15} = 2^{16} - 1$. This addition is done to be able to convert
the samples of the cover audio into binary. Then, the cover audio file is split
into left and right channels.
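
The offset step can be illustrated with the short Python sketch below (the thesis implementation itself uses Mathematica, see Section 3.2). The function names and the sample values are hypothetical.

```python
OFFSET = 2 ** 15  # shifts the signed range [-2^15, 2^15 - 1] to [0, 2^16 - 1]


def to_bits(samples):
    """Offset signed 16-bit samples and return one 16-character bit string each."""
    return [format(s + OFFSET, "016b") for s in samples]


def from_bits(bit_strings):
    """Inverse of to_bits: parse the bit strings and remove the offset."""
    return [int(b, 2) - OFFSET for b in bit_strings]


samples = [-32768, -1, 0, 32767]  # hypothetical channel samples
bits = to_bits(samples)
print(bits[0], bits[3])  # 0000000000000000 1111111111111111
```

After the shift every sample is a non-negative integer that fits in exactly 16 bits, so the last character of each bit string is the LSB used later for embedding.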

Next, the message intended to be embedded in the cover audio is encrypted
using a chaotic function, namely a Varied Tan Logistic Map, with the following
iterative equation

\[
x_{n+1} = 3.6 \times \tan(x_n) \times (1 - x_n) \tag{3.1}
\]

where the initial value $x_0$ is 0.5 for the proposed scheme. This results in a
pseudo-random sequence of real numbers. A threshold is chosen, with a value of 0.6
for the proposed scheme, which is compared to the generated real numbers to
produce the secret key bits as follows

\[
b_i =
\begin{cases}
0, & \text{if } x_{n+1} < \text{threshold} \\
1, & \text{if } x_{n+1} > \text{threshold}
\end{cases} \tag{3.2}
\]

This process iterates until a key with the same length of the binary stream of
the secret message used is generated. Once the key is produced, it is XORed with
the binary stream of the secret message, generating the encrypted secret message.
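
A minimal Python sketch of this encryption layer is shown below, using the initial value 0.5 and the threshold 0.6 stated above. The helper names, the UTF-8 bit encoding of the message, the example message, and the handling of a value exactly equal to the threshold are assumptions; the Mathematica code in Section 3.2 remains the reference implementation.

```python
import math


def keystream(n_bits, x0=0.5, threshold=0.6):
    """Iterate Eq. (3.1) and threshold each value as in Eq. (3.2)."""
    x, bits = x0, []
    for _ in range(n_bits):
        x = 3.6 * math.tan(x) * (1 - x)
        # Assumption: a value exactly equal to the threshold maps to 0.
        bits.append(1 if x > threshold else 0)
    return bits


def text_to_bits(text):
    """Return 8 bits per byte of the UTF-8 encoded text."""
    return [int(b) for byte in text.encode("utf-8") for b in format(byte, "08b")]


def bits_to_text(bits):
    """Inverse of text_to_bits."""
    chunks = (bits[i:i + 8] for i in range(0, len(bits), 8))
    return bytes(int("".join(map(str, c)), 2) for c in chunks).decode("utf-8")


def xor_bits(bits, key):
    """Bitwise XOR of two equal-length bit lists."""
    return [b ^ k for b, k in zip(bits, key)]


message = "Copyright 2022"           # hypothetical watermark text
plain = text_to_bits(message)
key = keystream(len(plain))          # key is as long as the message bit stream
cipher = xor_bits(plain, key)
print(bits_to_text(xor_bits(cipher, key)))
```

Because XOR is its own inverse, applying the same key to the ciphertext restores the message, which is why the receiver only needs the map parameters and the threshold.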

Before the encrypted secret message is embedded in the cover audio file, two
chaotic sequences are generated from the 2D Tan Logistic Map, which can be seen
as two equations as follows

\[
x_{n+1} = \tan\left[(\pi r y_n + 3)\, x_n (1 - x_n)\right] \tag{3.3}
\]

and

\[
y_{n+1} = \tan\left[(\pi r x_n + 3)\, y_n (1 - y_n)\right] \tag{3.4}
\]
where $r$ is a control parameter, $r \in [0, 1]$, and $x_n$, $y_n$ are the output
chaotic sequence values. These sequences are floating-point values, which are
converted into integers by multiplying by $10^7$ and taking the whole part of the
number, to the left of the decimal point. This results in a sequence of output
integers to be used as the key for embedding the secret message in the audio
file. Fig. 3.2 shows the output from the 2D Tan Logistic Map sequence. The
encrypted secret message is embedded in the LSB, the 16th bit, of the two
channels, where $x_n$ is the sequence used for the left channel embedding and
$y_n$ is the sequence used for the right channel.
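
The sequence generation can be sketched in Python as follows. The control parameter and the initial values used here (r = 0.5, x0 = 0.3, y0 = 0.7) are placeholders chosen only for illustration, since this section does not state them, and the function name is hypothetical. How the resulting integers select embedding positions is not reproduced here.

```python
import math


def tan_logistic_2d(n, r=0.5, x0=0.3, y0=0.7):
    """Iterate Eqs. (3.3) and (3.4) n times and return two integer sequences."""
    x, y = x0, y0
    xs, ys = [], []
    for _ in range(n):
        # Both updates use the previous pair (x_n, y_n).
        x, y = (
            math.tan((math.pi * r * y + 3) * x * (1 - x)),
            math.tan((math.pi * r * x + 3) * y * (1 - y)),
        )
        xs.append(int(x * 10 ** 7))  # int() truncates, keeping the whole part
        ys.append(int(y * 10 ** 7))
    return xs, ys


left_seq, right_seq = tan_logistic_2d(10)
print(left_seq[:3], right_seq[:3])
```

Since the tangent is unbounded, the integers can be negative or large; the sketch keeps them as produced and leaves any further mapping to sample indices open.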

Finally, the previously added $2^{15}$ is subtracted from the sample values,
forming the steganographed audio file.

3.1.2 Data Extraction and Decryption

For extraction of the secret message, $2^{15}$ is first added to the
steganographed audio samples, which are then converted to binary values. Then, the
LSBs are extracted using the 2D Tan Logistic Map sequences in order to retrieve
the binary form of the encrypted secret message.
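
The LSB round trip can be illustrated with the Python sketch below. It is a simplified stand-in rather than the scheme itself: bits are written to consecutive samples while alternating between the two channels, whereas the proposed scheme drives the embedding with the 2D Tan Logistic Map sequences. The function names and sample values are hypothetical.

```python
OFFSET = 2 ** 15  # same offset as in the embedding stage


def embed_lsb(left, right, bits):
    """Write bits into sample LSBs, alternating left and right channels."""
    if (len(bits) + 1) // 2 > len(left) or len(bits) // 2 > len(right):
        raise ValueError("message is too long for this cover")
    left = [s + OFFSET for s in left]
    right = [s + OFFSET for s in right]
    for i, bit in enumerate(bits):
        channel = left if i % 2 == 0 else right
        pos = i // 2  # assumption: sequential positions instead of chaotic ones
        channel[pos] = (channel[pos] & ~1) | bit
    return [s - OFFSET for s in left], [s - OFFSET for s in right]


def extract_lsb(left, right, n_bits):
    """Read n_bits back from the LSBs in the same alternating order."""
    bits = []
    for i in range(n_bits):
        channel = left if i % 2 == 0 else right
        bits.append((channel[i // 2] + OFFSET) & 1)
    return bits


left = [120, -45, 300, -7]        # hypothetical cover samples
right = [-120, 46, -301, 8]
payload = [1, 0, 1, 1, 0, 0, 1]
stego_left, stego_right = embed_lsb(left, right, payload)
print(extract_lsb(stego_left, stego_right, len(payload)))  # [1, 0, 1, 1, 0, 0, 1]
```

Each sample changes by at most one quantization step, which is why LSB substitution keeps the MSE low and the PSNR high.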

Finally, the encrypted secret message is decrypted using the chaotic function,
the Varied Tan Logistic Map, used in encryption to recover the secret message. The
key for decryption is generated using the same equation used for encryption with
the same parameters; in other words, the encryption and decryption keys are
identical and can be easily generated. Then, the key is XORed with the encrypted
secret message, producing the secret message itself.

3.2 Implementation

This section presents the code used for the proposed watermarking scheme. The
scheme is implemented on Wolfram Mathematica® 13.0.

3.2.1 Encryption and Embedding

First, we import our cover audio.

In[1]:= CoverAudio = Import[
         "C:\\Users\\Laila\\Desktop\\Bachelor\\ClassicCover.wav", "Audio"]

Out[1]=

Then, the cover audio is converted into 16-bit signed integers.

In[2]:= a = AudioData[CoverAudio, "SignedInteger16"];

In[3]:= Length[a]

Out[3]= 2

In[4]:= a[[1]];

In[5]:= a[[2]];

Since the sample values include negative numbers, $2^{15}$ is added to make them non-negative.

In[6]:= aP = a + 2^15;

In[7]:= Length[aP]

Out[7]= 2

The left and right channels are each converted to a stream of 16-bit binary values.

In[8]:= binLeft = Flatten[IntegerDigits[Flatten[aP[[1]]], 2, 16]];

In[9]:= binRight = Flatten[IntegerDigits[Flatten[aP[[2]]], 2, 16]];

In[10]:= binLeft == binRight

Out[10]= False

Then, we generate our secret message.

In[11]:= message =
StringTake[ExampleData[{"Text", "AliceInWonderland"}],30000]

Out[11]= I--DOWN THE RABBIT-HOLE Alice was beginning
to get very tired of sitting by her sister
on the bank, and of having nothing to do.
Once or twice she had peeped into the book
her sister was reading, but it had no pictures
or conversations in it, "and what is the use
of a book," thought Alice, "without pictures
or conversations?" So she was considering in
her own mind (as well as she could, for the day
made her feel very sleepy and stupid), whether
the pleasure of making a daisy-chain would be
worth the trouble of getting up and picking the
daisies, when suddenly a White Rabbit with pink
eyes ran close by her . . .

We get the stream of bits of the message.

In[12]:= MsgASCII = ToCharacterCode[message];

In[13]:= MsgBits = Flatten[IntegerDigits[MsgASCII, 2, 8]];
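
As a small illustration of this conversion and its inverse, the sketch below uses a two-character message. The string "Hi" and the names `codes`, `bitStream`, and `back` are illustrative only, and 8 bits per character assumes that every character code is below 256.

```wolfram
(* Short illustrative message *)
text = "Hi";

(* Character codes, then 8 bits per character, flattened into one stream *)
codes = ToCharacterCode[text];
bitStream = Flatten[IntegerDigits[codes, 2, 8]];

(* Inverse: regroup into bytes, convert each byte to an integer, then to characters *)
back = FromCharacterCode[FromDigits[#, 2] & /@ Partition[bitStream, 8]];

{bitStream, back}
```

The codes are 72 and 105, so `bitStream` is {0,1,0,0,1,0,0,0,0,1,1,0,1,0,0,1} and `back` is "Hi".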

We prepare the key, which is used to encrypt the message, using the Varied Tan
Logistic Map.

In[14]:= MsgLength = Length[MsgBits]

Out[14]= 240000

In[15]:= sol = RecurrenceTable[{x[n + 1] == 3.6*Tan[x[n]]*(1 - x[n]),
          x[0] == 0.5}, x, {n, 1, MsgLength}];

We generate a key of the same length as the message bits, using a threshold
value of 0.6.

In[16]:= solLength = Length[sol];
         key = ConstantArray[0, solLength];
         For[i = 1, i <= solLength, i++,
          If[sol[[i]] < 0.6, key[[i]] = 0, key[[i]] = 1]];
         key

Out[16]= 1, 0, 0, 1, 1, 1, 1, 1, 1,
0, 0, 1, 0, 1, 1, 0, 1, 0,
1, 0, 0, 1, 1, 0, 0, 1, 0,
0, 1, 1, 0, 1, 0, 1, 1, . . .

The generated key is used to encrypt the message.

In[17]:= encSecMsg = BitXor[key, MsgBits];

Next, we import the two chaotic sequences, stored in spreadsheet files, that are
used for embedding the secret message into the left and right channels of the
cover audio.

In[18]:= (* Import FinalSequenceA.xlsx through FinalSequenceZ.xlsx *)
         s = Table[
           Import["C:\\Users\\Laila\\Desktop\\Bachelor\\Audio\\FinalSequence"
             <> letter <> ".xlsx"],
           {letter, CharacterRange["A", "Z"]}];

In[19]:= (* Flatten each imported file into a flat list *)
         ss = Flatten /@ s;

In[20]:= (* Concatenate the 26 lists into the left-channel sequence *)
         seqL = Join @@ ss;

In[21]:= (* Import FinalSequenceAR.xlsx through FinalSequenceZR.xlsx *)
         sR = Table[
           Import["C:\\Users\\Laila\\Desktop\\Bachelor\\Audio\\FinalSequence"
             <> letter <> "R.xlsx"],
           {letter, CharacterRange["A", "Z"]}];

In[22]:= (* Flatten each imported file into a flat list *)
         ssR = Flatten /@ sR;

In[23]:= (* Concatenate the 26 lists into the right-channel sequence *)
         seqR = Join @@ ssR;

We embed the secret message in the LSB of the left and right channels of the
cover audio using the two chaotic sequences obtained above.

In[24]:= binLeftP = Partition[binLeft , 16];

In[25]:= binRightP = Partition[binRight, 16];

In[26]:= For[j = 1, j <= Length[encSecMsg], j++,
          binLeftP[[seqL[[j]], 16]] = encSecMsg[[j]]]

In[27]:= For[y = 1, y <= Length[encSecMsg], y++,
          binRightP[[seqR[[y]], 16]] = encSecMsg[[y]]]
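
These loops rely on the embedding sequences holding valid, distinct sample positions: a repeated position would let a later bit overwrite an earlier one, and a non-integer or out-of-range value cannot be used as a list index. The following check is a sketch of a test one could run before embedding. It is not part of the original implementation, and the toy values and names are hypothetical.

```wolfram
(* Hypothetical values: 6 samples per channel, 3 payload bits *)
numSamples = 6;
idx = {5, 2, 6};
payloadLength = 3;

(* Enough positions for the whole payload *)
enough = Length[idx] >= payloadLength;

(* Every position is an integer that indexes an existing sample *)
inRange = AllTrue[idx, IntegerQ[#] && 1 <= # <= numSamples &];

(* No position among those actually used occurs twice *)
distinct = DuplicateFreeQ[Take[idx, payloadLength]];

{enough, inRange, distinct}
```

For these toy values the result is {True, True, True}.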

Finally, we subtract the previously added $2^{15}$, forming the steganograph-ied
audio file.

In[28]:= binLeftNP = Flatten[binLeftP];

In[29]:= binRightNP = Flatten[binRightP];

In[30]:= bin = Join[binLeftNP, binRightNP];

In[31]:= c = Partition[bin, 16];

In[32]:= f = (FromDigits[#, 2] & /@ c) - 2^15;

In[33]:= SamplesPerChannel = Length[a[[1]]]

Out[33]= 2646000

In[34]:= AudioStego = Partition[f, SamplesPerChannel];

In[35]:= AudioStego == a

Out[35]= False

After finishing the embedding process, we get the steganograph-ied audio file.

In[36]:= StegoAudio = Audio[AudioStego, "SignedInteger16"]

Out[36]=

In[37]:= Export["StegoAudio.wav", StegoAudio];

3.2.2 Extraction and Decryption

The extraction process is the inverse of the embedding process. First, $2^{15}$
is added to the steganograph-ied audio samples so that they can be converted
back to binary.
In[38]:= aEnc = AudioData[StegoAudio, "SignedInteger16"];

In[39]:= aPEnc = aEnc + 2^15;

In[40]:= binLeftEnc = Flatten[IntegerDigits[Flatten[aPEnc[[1]]], 2, 16]];

In[41]:= binRightEnc = Flatten[IntegerDigits[Flatten[aPEnc[[2]]], 2, 16]];

In[42]:= binLeftEncP = Partition[binLeftEnc, 16];

In[43]:= binRightEncP = Partition[binRightEnc, 16];

In[44]:= binLeftEncP == binLeftP

Out[44]= True

In[45]:= binRightEncP == binRightP

Out[45]= True

The LSBs are extracted using the chaotic sequences previously generated to
retrieve the binary form of the encrypted secret message.

In[46]:= secMessageLeft = ConstantArray[0, Length[encSecMsg]];

In[47]:= For[k = 1, k <= Length[encSecMsg], k++,
          secMessageLeft[[k]] = binLeftEncP[[seqL[[k]], 16]]]

In[48]:= encSecMsg == secMessageLeft

Out[48]= True

In[49]:= secMessageRight = ConstantArray[0, Length[encSecMsg]];

In[50]:= For[r = 1, r <= Length[encSecMsg], r++,
secMessageRight[[r]] = binRightEncP[[seqR[[r]], 16]]]

In[51]:= encSecMsg == secMessageRight

Out[51]= True

Then, we decrypt the encrypted secret message with the generated key.

In[52]:= DecryptedMessage = BitXor[encSecMsg, key];

We convert the decrypted message into ASCII.

In[53]:= Groupping = Partition[DecryptedMessage, 8];

In[54]:= GrouppingLength = Length[Groupping];

In[55]:= MessageBackAscii = ConstantArray[0, GrouppingLength];
         For[j = 1, j <= GrouppingLength, j++,
          MessageBackAscii[[j]] = FromDigits[Groupping[[j]], 2]];

Finally, we convert the ASCII codes into our secret message.

In[56]:= MessageBack = FromCharacterCode[MessageBackAscii]

Out[56]= I--DOWN THE RABBIT-HOLE Alice was beginning
to get very tired of sitting by her sister
on the bank, and of having nothing to do.
Once or twice she had peeped into the book
her sister was reading, but it had no pictures
or conversations in it, "and what is the use

of a book," thought Alice, "without pictures
or conversations?" So she was considering in
her own mind (as well as she could, for the day
made her feel very sleepy and stupid), whether
the pleasure of making a daisy-chain would be
worth the trouble of getting up and picking the
daisies, when suddenly a White Rabbit with pink
eyes ran close by her . . .

3.3 Numerical Results

The proposed scheme was implemented in Wolfram Mathematica® 13.0 on a machine
running Microsoft Windows® 10 Pro with an Intel® Core™ i7-5500U CPU @ 2.40 GHz
and 16 GB of RAM. The cover audio file is a 1-minute classical-music .wav file
with a sampling rate of 44.1 kHz. The secret message embedded as a watermark
inside the cover audio is the first 30,000 characters of the novel Alice in
Wonderland.

The performance metrics described in Chapter 2 are applied to the cover audio
file and the steganograph-ied audio file. This section presents the results and
compares the performance of the proposed audio watermarking scheme with that of
similar schemes from the literature.

The following links contain the audio file before and after watermarking with
the secret message using the proposed scheme.
Cover audio: shorturl.at/dfuTV
Watermarked audio: shorturl.at/etB39
A listening test reveals no audible differences between the two files, which
supports the effectiveness of the proposed scheme.

The waveform plots of the left and right channels of the cover audio file and
the steganograph-ied audio file are shown in Fig. 3.3 and Fig. 3.4. On visual
inspection, the plots show no observable differences, which indicates that the
embedding is visually imperceptible.

Figure 3.3: Waveform plot of the left and right channels of the cover audio file.

Figure 3.4: Waveform plot of the left and right channels of the watermarked audio
file.

Table 3.1: MSE and PSNR values employing the proposed algorithm.

                   MSE          PSNR
Proposed scheme    0.0453658    101.011

Table 3.2: Intensity properties.

File     Power         RMS Amplitude    Loudness
Cover    0.00379432    0.061598         0.0238792
Stego    0.00379432    0.061598         0.0238792

Table 3.1 shows the values of MSE and PSNR obtained from (2.1) and (2.2)
respectively. The values show very good performance of the proposed scheme with
a low value for MSE and a high value for PSNR.
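
Equations (2.1) and (2.2) are not reproduced in this chapter, so the following sketch assumes the usual definitions: MSE as the mean squared sample difference and PSNR as 10 log10(peak^2 / MSE). The peak value, the toy samples, and the names `mse` and `psnr` are assumptions made for illustration and may differ from the definitions used in Chapter 2.

```wolfram
(* Mean squared error between two equally sized sample arrays *)
mse[cover_, stego_] := Mean[(Flatten[cover] - Flatten[stego])^2];

(* PSNR in dB for a given peak value; the peak is an assumption of this sketch *)
psnr[cover_, stego_, peak_] := 10*Log10[peak^2/mse[cover, stego]];

(* Made-up integer samples differing in a single LSB *)
cover = {{100, -200, 300, 400}, {5, 6, 7, 8}};
stego = {{100, -199, 300, 400}, {5, 6, 7, 8}};

N[mse[cover, stego]]
N[psnr[cover, stego, 2^16 - 1]]

(* Rough expected MSE when 240000 random bits replace the LSBs of distinct
   samples in a channel of 2646000 samples: about half of the embedded bits
   flip an LSB, and each flip contributes a squared error of 1 *)
N[(240000/2)/2646000]
```

The toy MSE is 0.125, one unit of squared error over eight samples. The last line gives about 0.0454, which is close to the MSE reported in Table 3.1 under the stated assumptions.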

Table 3.2 compares the intensity properties of the cover audio and the steganograph-
ied audio files where the results are identical for both files across the 3 metrics.

Table 3.3 compares the time–domain properties of the cover audio and the
steganograph-ied audio files where the results are identical for both files across the
metrics except for the ZC value.

Table 3.4 compares the frequency–domain properties of the cover audio and
the steganograph-ied audio files where the results are near-identical for both files
across all 7 metrics.

Table 3.5 compares the basic histogram properties of the cover audio and the
steganograph-ied audio files. The results are identical for both files, except
for the mean value.

Table 3.3: Time-domain properties.

File     CF         PAPR       TC          ZC
Cover    9.28116    86.1399    0.635737    98502.0
Stego    9.28116    86.1399    0.635737    98514.0

Table 3.4: Frequency-domain properties.

File     Centroid    Crest      Flatness    Kurtosis    Roll off    Skewness    Spread
Cover    648.907     2832.89    0.000078    226.405     1124.92     11.0805     628.509
Stego    648.907     2832.89    0.000081    226.406     1124.92     11.0806     628.51

Table 3.6 compares the information entropy of the cover audio and the
steganograph-ied audio files. The cover audio file shows a slightly lower
entropy value, which is consistent with it containing less information than its
watermarked version.
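
As a hedged illustration, and assuming that Table 3.6 reports the base-2 Shannon entropy of the sample values, such a figure could be computed as follows. The sample list is made up.

```wolfram
(* Made-up sample values; real input would be the flattened 16-bit samples *)
samples = {0, 0, 1, 1, 2, 2, 3, 3};

(* Base-2 Shannon entropy of the empirical distribution of sample values *)
N[Entropy[2, samples]]
```

Four equally frequent values give 2 bits. Under the same assumption, 16-bit samples cannot exceed 16 bits of entropy, so the values in Table 3.6 lie near that upper bound.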

To conclude the evaluation of the proposed scheme, a comparison is conducted
with other schemes from the literature in terms of PSNR. The schemes chosen for
the comparison are those in [26] and [38]. Table 3.7 shows the PSNR values of
all the schemes. The scheme proposed in this thesis performs better than both
schemes proposed in [26] and very similarly to the scheme proposed in [38]. One
explanation for these results is that the proposed scheme and the scheme in
[38] share the same secret message length of 30,000 characters, while the
schemes proposed in [26] use a secret message of 600,000 characters.

Table 3.5: Basic histogram properties.

File     Max         Min          Mean             Median    SD
Cover    0.571701    -0.529785    -0.0000308313    0.0       0.061598
Stego    0.571701    -0.529785    -0.000030731     0.0       0.061598

Table 3.6: Information entropy values.

File     Information Entropy
Cover    14.3687
Stego    14.3879

Overall, the scheme proposed in this thesis exhibits very good performance.

Table 3.7: A comparison between the PSNR value obtained from the proposed
scheme and the PSNR values obtained from other schemes in the literature.

Scheme                                 PSNR
Scheme in [38] using AES-128           101.971
Scheme in [26] using AES-256           91.0097
Scheme in [26] using a Logistic map    91.0016
Proposed scheme                        101.011

3.4 Conclusions

In this chapter, I have proposed and discussed the mechanism I used to address
the business problem of this thesis: copyright protection of audio files. I used
a 1-minute classical-music .wav audio file with a sampling rate of 44.1 kHz as
the cover audio for a double-layer message security scheme, with cryptography in
the first layer and steganography in the second layer. The secret message
embedded in the cover audio, acting as the watermark, was the first 30,000
characters of the novel Alice in Wonderland. The results show almost no
differences between the cover audio file and the steganograph-ied audio file,
supporting the effectiveness of my proposed scheme.

Chapter 4

Conclusions

In this thesis, I have introduced the business problem of copyright infringement
of audio files. After introducing the problem and giving an overview of the
related concepts and background, I discussed the relevant literature. To address
the business problem of this thesis, I proposed watermarking audio files with
copyrights. My proposed audio watermarking scheme is a double-layer message
security scheme, with cryptography in the first layer and steganography in the
second layer. Cryptography involved using a chaotic function, namely a Varied
Tan Logistic Map, to encrypt the secret message, in other words the watermark,
for added security. Steganography involved generating two sequences from a 2D
Tan Logistic Map that are used to embed the secret message into the chosen cover
audio file. Embedding was done using the LSB substitution technique in the left
and right channels of the cover audio. In addition, I performed data extraction
and decryption to recover the secret message using the same techniques. The
proposed watermarking scheme was implemented in Wolfram Mathematica® 13.0.
To evaluate the performance of my proposed scheme, I used a number of visual and
statistical metrics. The results indicate that the scheme performs very well,
with barely any differences between the audio files before and after
watermarking. Furthermore, I compared my results with those of similar schemes
from the literature and found them to be comparable.

References

[1] I. T. Hardy, “Criminal copyright infringement,” Wm. & Mary Bill Rts. J.,
vol. 11, p. 305, 2002.

[2] A. B. Froblich, “Copyright infringement in the internet age-primetime for
harmonized conflict-of-laws rules,” Berkeley Tech. LJ, vol. 24, p. 851, 2009.

[3] J. C. Ginsburg, “International copyright: From a bundle of national copyright
laws to a supranational code,” J. Copyright Soc’y USA, vol. 47, p. 265, 2000.

[4] M. N. Sadiku, T. J. Ashaolu, A. Ajayi-Majebi, and S. M. Musa, “Digital
piracy,”

[5] R. K. Sinha and N. Mandel, “Preventing digital music piracy: The carrot or
the stick?” Journal of Marketing, vol. 72, no. 1, pp. 1–15, 2008.

[6] C. Koester, “Combating music piracy: The recording industry’s legal pursuit
of online copyright infringers,” 2009.

[7] S. E. Siwek, The true cost of sound recording piracy to the us economy, 2007.

[8] B. Fries and M. Fries, Digital audio essentials. O’Reilly Media, Inc., 2005.

[9] S. K. Dubey and V. Chandra, “Steganography, cryptography and watermarking:
A review,” International Journal of Innovative Research in Science,
Engineering and Technology, vol. 6, no. 2, pp. 2595–2599, 2017.

[10] Y. Lin and W. H. Abdulla, “Audio watermarking for copyright protection,”
University of Auckland, Auckland, New Zealand, Tech. Rep, 2007.

[11] M. Pal, “A survey on digital watermarking and its application,”
International Journal of Advanced Computer Science and Applications, vol. 7,
no. 1, pp. 153–156, 2016.

[12] D. Shukla and N. Tiwari, “Survey on digital watermarking techniques,”
International Journal of Signal Processing, Image Processing and Pattern
Recognition, vol. 8, no. 9, pp. 121–126, 2015.

[13] A. F. Qasim, F. Meziane, and R. Aspin, “Digital watermarking: Applicability
for developing trust in medical imaging workflows state of the art review,”
Computer Science Review, vol. 27, pp. 45–60, 2018.

[14] R. Patel and P. Bhatt, “A review paper on digital watermarking and its
techniques,” International Journal of Computer Applications, vol. 110, no. 1,
pp. 10–13, 2015.

[15] N. R. Zhou, W. M. X. Hou, R. H. Wen, and W. P. Zou, “Imperceptible
digital watermarking scheme in multiple transform domains,” Multimedia
Tools and Applications, vol. 77, no. 23, pp. 30251–30267, 2018.

[16] S. Czerwinski, R. Fromm, and T. Hodes, “Digital music distribution and
audio watermarking,” UCB IS, vol. 219, 2007.

[17] F. Y. Shih, Digital watermarking and steganography: fundamentals and
techniques. CRC press, 2017.

[18] A. Mohanarathinam, S. Kamalraj, G. P. Venkatesan, R. V. Ravi, and C.
Manikandababu, “Digital watermarking techniques for image security: A
review,” Journal of Ambient Intelligence and Humanized Computing, pp. 1–9,
2019.

[19] M. A. Alsalami and M. M. Al-Akaidi, “Digital audio watermarking: Survey,”
School of Engineering and Technology, De Montfort University, UK, p. 19,
2003.

[20] M. Arnold, “Audio watermarking: Features, applications and algorithms,”
in 2000 IEEE International conference on multimedia and expo. ICME2000.
Proceedings. Latest advances in the fast changing world of multimedia (cat.
no. 00TH8532), IEEE, vol. 2, 2000, pp. 1013–1016.

[21] ——, “Analysis of risks and attacks on digital audio watermarks,” Journal
of New Music Research, vol. 34, no. 2, pp. 197–208, 2005.

[22] S. P. Vaidya and P. C. Mouli, “Adaptive digital watermarking for copyright
protection of digital images in wavelet domain,” Procedia Computer Science,
vol. 58, pp. 233–240, 2015.

[23] M Agbaje and A. Adebayo, “Robustness and security issues in digital au-
dio watermarking,” International Journal of Engineering and Information
Systems (IJEAIS), vol. 1, pp. 1–10, 2017.

[24] A. Dixit and R. Dixit, “A review on digital image watermarking techniques,”
International Journal of Image, Graphics and Signal Processing, vol. 9, no. 4,
p. 56, 2017.

[25] G. M. Nita, “Spectral kurtosis statistics of transient signals,” Monthly
Notices of the Royal Astronomical Society, vol. 458, no. 3, pp. 2530–2540, 2016.

[26] F. Hemeida, W. Alexan, and S. Mamdouh, “A comparative study of audio
steganography schemes,” International Journal of Computing and Digital
Systems, vol. 10, pp. 555–562, 2021.

[27] I. J. Cox, J. Kilian, F. T. Leighton, and T. Shamoon, “Secure spread
spectrum watermarking for multimedia,” IEEE transactions on image processing,
vol. 6, no. 12, pp. 1673–1687, 1997.

[28] P. K. Dhar and J.-M. Kim, “Digital watermarking scheme based on fast
fourier transformation for audio copyright protection,” International Journal
of Security and Its Applications, vol. 5, no. 2, pp. 33–48, 2011.

[29] X. Wen, X. Ding, J. Li, L. Gao, and H. Sun, “An audio watermarking al-
gorithm based on fast fourier transform,” in 2009 International Conference
on Information Management, Innovation Management and Industrial Engi-
neering, IEEE, vol. 1, 2009, pp. 363–366.

[30] P. K. Dhar, M. I. Khan, and J Kim, “A new audio watermarking system using
discrete fourier transform for copyright protection,” International journal of
computer science and network security, vol. 10, no. 6, pp. 35–40, 2010.

[31] P. K. Dhar, M. I. Khan, and S. Ahmad, “A new dct-based watermarking
method for copyright protection of digital audio,” International Journal of
Computer Science & Information Technology (IJCSIT), vol. 2, no. 5,
pp. 91–97, 2010.

[32] M. Patil and J. Chitode, “Audio watermarking: A way to copyright
protection,” International Journal of Engineering, vol. 1, no. 6, pp. 1–6, 2012.

[33] I. Hussain, “A novel approach of audio watermarking based on s-box trans-
formation,” Mathematical and Computer Modelling, vol. 57, no. 3-4, pp. 963–
969, 2013.

[34] Y. Xiong et al., “Covert communication audio watermarking algorithm based
on lsb,” in 2006 International Conference on Communication Technology,
IEEE, 2006, pp. 1–4.

[35] A. Chadha and N. Satam, “An efficient method for image and audio steganog-
raphy using least significant bit (lsb) substitution,” arXiv preprint arXiv:1311.1083,
2013.

[36] N. Cvejic and T. Seppanen, “Increasing robustness of lsb audio
steganography using a novel embedding method,” in International Conference on
Information Technology: Coding and Computing, 2004. Proceedings. ITCC
2004., IEEE, vol. 2, 2004, pp. 533–537.

[37] F. Hemeida, W. Alexan, and S. Mamdouh, “Blowfish–secured audio
steganography,” in 2019 Novel Intelligent and Leading Emerging Sciences
Conference (NILES), IEEE, vol. 1, 2019, pp. 17–20.

[38] M. T. Elkandoz and W. Alexan, “Logistic tan map based audio steganogra-
phy,” in 2019 international conference on electrical and computing technolo-
gies and applications (ICECTA), IEEE, 2019, pp. 1–5.

[39] R. Hussein and W. Alexan, “Secure message embedding in audio,” in 2019
2nd International Conference on Computer Applications & Information
Security (ICCAIS), IEEE, 2019, pp. 1–6.

[40] K. Khaldi and A.-O. Boudraa, “Audio watermarking via emd,” IEEE trans-
actions on audio, speech, and language processing, vol. 21, no. 3, pp. 675–
680, 2012.

[41] A. Al-Haj, A. A. Mohammad, and L. Bata, “Dwt-based audio watermarking,”
Int. Arab J. Inf. Technol., vol. 8, no. 3, pp. 326–333, 2011.

[42] F. E. Abd El-Samie, “An efficient singular value decomposition algorithm
for digital audio watermarking,” International Journal of Speech Technology,
vol. 12, no. 1, p. 27, 2009.

[43] V. Bhat, I. Sengupta, and A. Das, “An adaptive audio watermarking based
on the singular value decomposition in the wavelet domain,” Digital Signal
Processing, vol. 20, no. 6, pp. 1547–1558, 2010.

[44] B. Y. Lei, Y. Soon, and Z. Li, “Blind and robust audio watermarking scheme
based on svd–dct,” Signal Processing, vol. 91, no. 8, pp. 1973–1984, 2011.

[45] H. Zhao, F. Wang, Z. Chen, and J. Liu, “A robust audio watermarking
algorithm based on svd-dwt,” Elektronika Ir Elektrotechnika, vol. 20, no. 1,
pp. 75–80, 2014.

[46] J. Mishra, M. Patil, and J. Chitode, “An effective audio watermarking using
dwt-svd,” International Journal of Computer Applications, vol. 70, no. 8,
2013.

[47] X.-Y. Wang and H. Zhao, “A novel synchronization invariant audio wa-
termarking scheme based on dwt and dct,” IEEE Transactions on signal
processing, vol. 54, no. 12, pp. 4835–4840, 2006.

[48] A. Elshazly, M. Fouad, and M. Nasr, “Secure and robust high quality dwt do-
main audio watermarking algorithm with binary image,” in 2012 seventh in-
ternational conference on computer engineering & systems (ICCES), IEEE,
2012, pp. 207–212.

[49] H. Yassine, B. Bachir, and K. Aziz, “A secure and high robust audio water-
marking system for copyright protection,” International journal of computer
applications, vol. 53, no. 17, 2012.

[50] K. R. Kakkirala and S. R. Chalamala, “Digital audio watermarking using
dwt-svd and secret sharing,” International Journal of Signal Processing
Systems, vol. 1, no. 1, 2013.

[51] B. Lei, Y. Soon, F. Zhou, Z. Li, and H. Lei, “A robust audio watermarking
scheme based on lifting wavelet transform and singular value decomposition,”
Signal processing, vol. 92, no. 9, pp. 1985–2001, 2012.

[52] M. Fan and H. Wang, “Chaos-based discrete fractional sine transform do-
main audio watermarking scheme,” Computers & Electrical Engineering,
vol. 35, no. 3, pp. 506–516, 2009.

[53] H. Latifpour, M. Mosleh, and M. Kheyrandish, “An intelligent audio
watermarking based on knn learning algorithm,” International Journal of Speech
Technology, vol. 18, no. 4, pp. 697–706, 2015.

[54] H.-T. Hu and L.-Y. Hsu, “Robust, transparent and high-capacity audio wa-
termarking in dct domain,” Signal Processing, vol. 109, pp. 226–235, 2015.
