Steganography ASCII
Article info
Article history: Received 16 May 2017; Revised 31 July 2017; Accepted 17 August 2017; Available online 19 August 2017
Keywords: Character to binary mapping; Cryptography; Steganography; Matrix

Abstract
In this paper, a flexible dot-pattern based mapping technique for steganographic encoding of messages is presented. The aim is that statistical analysis should not reveal the secret message even if the steganographic system is attacked and the hidden bits are recovered from the cover media (text, image, etc.) used in the steganography technique. To retrieve the message, the attacker needs to know the specific mapping scheme used to encode the characters as dot patterns. The method is therefore more secure than traditional steganography schemes that use standard character-to-bit mappings. The binary data is generated from a specific and personal Dot Pattern Character Encoding Scheme (DPCES), embedded in the cover media, and sent through the public channel to the intended receiver, who has prior knowledge of the communication-specific encoding scheme. We have implemented the proposed DPCES using Visual Studio 2015 and tested it with image covers; the tests show that the stego media can maintain the desired secrecy.
© 2017 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
http://dx.doi.org/10.1016/j.jksuci.2017.08.003
1319-1578
198 S. Mahato et al. / Journal of King Saud University – Computer and Information Sciences 32 (2020) 197–207
embedding algorithm, the secret message is encoded in bit form. The bits can be embedded in a text, image, audio or video cover. Several standard encoding schemes for steganography are used in the literature. For text covers, the popular methods of embedding message bits are format-based steganography (Mahato et al., 2014) and linguistic steganography (Desoky, 2010); for image covers (Cheddad et al., 2010), the bits are usually embedded in the least significant bits of a pixel; and for audio and video covers (Sadek et al., 2015), similar embedding techniques are used.
This paper presents a flexible, case-specific dot-pattern based mapping technique for steganographic encoding of messages, which conceals secret information in binary notation to increase the security of the steganographic system. The rest of the paper is organized as follows: Section 2 discusses related works; Section 3 presents the proposed method and demonstrates how the approach works; Section 4 shows how the bit string length for each character is shortened by representing each character with 6 dots; Section 5 gives a performance comparison; Section 6 gives a security analysis, followed by the conclusion in Section 7.

2. Related works

Each language has its own script with a large number of characters, which demands a systematic approach to character encoding. In the literature, different encoding schemes have been developed with different bit sizes.
In 1623, Francis Bacon (Math 135, 1623; Gaines, 1956) created a cipher system using the techniques of substitution and steganography, where a message was concealed in the presentation of text rather than its content. To encode a message, each letter of the plain text was replaced by a group of five binary symbols. In 1821, Louis Braille (Louis, 1829) developed a system that enables blind and visually impaired people to read and write using finger touch. It consisted of raised dots arranged in "cells". A cell is made up of six dots that fit under the fingertips, arranged in two columns of three dots each. Each cell represented a letter, a word, a combination of letters, a numeral or a punctuation mark. Emile Baudot in 1870 created a 5-bit encoding scheme named the Baudot code (Ralston and Reilly, 1993), which was further improved by Donald Murray in 1901 (William, 1901; Beauchamp, 2001) and standardized by CCITT (Smith, 2001) as International Telegraph Alphabet No. 2 (ITA2) in 1930.
A 6-bit encoding scheme named Binary Coded Decimal (BCD) (CharacterEncoding) was used by IBM in 1959 in its 1401 and 1620 computers, and in its 7000 Series. BCD was extended to EBCDIC to include alphanumeric and special characters as well. EBCDIC was an eight-bit character encoding used mainly on IBM mainframe and IBM midrange computer operating systems (EBCDIC). Later, the 7-bit American Standard Code for Information Interchange (ASCII) (Fischer, 1874) was developed in 1963 to encode letters, numerals, symbols, and device control codes as fixed-length integer codes. Different machines using different codes led to problems in exchanging information. Therefore, the UNICODE (Crippen, 2010) standardization scheme was developed in the late 1980s to solve this problem, using 16-bit characters instead of 8-bit characters. This allowed for 2^16 (= 65,536) distinct values, enabling it to represent many different characters from different scripts.
Research is ongoing to design encoding schemes with better embedding efficiency that introduce less noise into the cover media of a steganography system. Moulin and Koetter (2005) presented a review of the theory and design of codes for hiding information in signals such as images, video, audio, graphics, and text. They developed watermarking codes to be used in applications like copyright protection for digital media, content authentication, media forensics, data binding, and covert communications.
Kim et al. (2006) proposed an efficient information hiding algorithm which embeds the message in the least significant bits of the JPEG coefficients of images. They used a modified matrix encoding technique that embeds information by modifying the coefficients so that the introduced distortion is minimized. They derived the expected value of the introduced distortion as a function of the message length and the probability distribution of the JPEG quantization errors of cover images.
Fridrich et al. (2006) proposed an approach to wet paper codes using random linear codes of small co-dimension that improves embedding efficiency. The proposed coding method can be modularly combined with most steganographic schemes to allow them to use non-shared selection channels and, at the same time, improve their security by decreasing the number of embedding changes. They combined wet paper codes with matrix embedding. Later, Fridrich and Filler (2007) proposed a general framework and practical coding methods for constructing steganographic schemes that minimize the statistical impact of embedding. The method was based on syndrome codes with low-density generator matrices (LDGM). They used the binary quantizer based on low-density generator matrices and the survey propagation algorithm proposed by Wainwright and Maneva (2005).
Zhang et al. (2007) proposed a method to improve the embedding efficiency of binary covering functions by fully exploiting the information contained in the choice of addition or subtraction in the embedding. This scheme performed well with ternary covering functions and without ternary conversion of the message.
Bierbrauer and Fridrich (2008) described several families of covering codes constructed using the block-wise direct sum of code factorizations. They showed that non-linear constructions offer better performance than the simple linear covering codes currently used by steganographers. They implemented a selected code family, which showed that certain families of non-linear codes can achieve higher embedding efficiency for applications in steganography than the simple linear codes currently in use.
Zhang and Zhu (2009) proposed an approach to wet paper codes that folds the cover into several layers and applies basic wet paper coding methods with low computational complexity to each layer. This method uses the changes introduced in the first layer to embed messages into every layer and therefore achieves high embedding efficiency. Filler et al. (2009) gave a formal proof of laws for imperfect stego systems, assuming that the cover source is a stationary Markov chain and the embedding changes are mutually independent. Later, Filler et al. (2010) proposed a practical approach to minimizing embedding impact in steganography based on syndrome coding and trellis-coded quantization, and contrasted its performance with appropriate rate-distortion bounds. They assumed that each cover element can be assigned a positive scalar expressing the impact of making an embedding change at that element.
Supporting more writing systems for different languages requires support for a far larger number of characters and demands a systematic approach to character encoding. Data-hiding codes have a valuable potential role to play in applications requiring higher security. To the best of the authors' knowledge, no character-to-bit encoding scheme has been designed solely for the purpose of steganography, to improve the security
aspect of a steganography system. The present work addresses this issue.

3. Proposed technique

Character encoding techniques (Character_encoding) use internationally accepted standards, which permit worldwide interchange of information. In steganography, character-to-bit encoding is not used for transmitting data through telecommunication networks or for data storage; instead, its purpose is to convert the data into bit form so that it can be embedded in the cover medium. Thus, it is not mandatory to use a standard character-to-bit mapping scheme in steganography. The system can be made more robust by introducing a flexible character-to-binary mapping technique for each communication, which makes the steganography system more secure because only the sender and receiver share the mapping.

3.1. Dot Pattern Character Encoding Scheme (DPCES)

In this section we present a novel personal, case-specific character-to-bit mapping technique, abbreviated as DPCES, that maps secret data into bit form which can be hidden in the cover media used in steganography. To conceal the message, it is first represented in dotted form using DPCES template 1, as given in Fig. 1, which closely resembles the normal alphabet. This encoding scheme is said to be personal because one is free to use one's own script and make one's own unique pattern that fits in nine dots, which can then be used to encode the message. The number of rows/columns is obtained by counting the number of dots present in the rows/columns, respectively, and acts as a key. The number of rows/columns may be predefined by the communicating parties. To generate the binary data, the relation between two consecutive dots in the same row (or the same column) is taken: if two dots in the same row/column are joined by a line, the pair represents 1, otherwise 0. Finally, the row-wise and column-wise binary data are taken together to make a meaningful combination of 0s and 1s. This combined data may now be embedded in the cover media with any suitable steganography method before being sent over the public channel. The number of rows or columns may be mutually agreed before communication or sent separately.
At the receiving end, the secret message may be extracted by dividing the received combined data using the rows/columns information decided earlier (or received by some other means) and matching each pair of consecutive dots as described earlier. This has been programmed and verified with an implementation in Visual Studio 2015. The proposed technique has the following constituents:
1) Secret message
2) Method to convert the message into dotted form
3) Key file holding the number of dots present row-wise/column-wise
4) Cover file (here we consider an image cover)
5) Embedding algorithm (here we use MATLAB code to embed the secret bits in the image cover using the Least Significant Bit steganography technique)
6) Extraction algorithm (MATLAB code is used to extract the secret bits from the stego image)
7) Method to convert the bits into the message
[Fig. 1: dotted representation of A to Z and a to z (DPCES template 1)]
3.1.1. Mapping using Dot Pattern Character Encoding Scheme
In this section we formally describe the proposed algorithm. The algorithm has two parts: 1) hiding the secret message in binary notation and 2) extracting the secret message from the binary text file.
Algorithm for hiding secret message in binary notation:
Step 1: Select a secret message.
Step 2: Agree on a value of columns (rows) for dots.
Step 3: Represent each character in dotted form (as described earlier) as shown in Fig. 1.
Step 4: Count the number of rows required to represent the data in dotted form.
Step 5: Check the dot layout row-wise (and column-wise) and write 1 if two consecutive dots are connected, 0 otherwise.
Step 6: Combine the data collected in Step 5. (The text message is now encoded as binary data.)
Step 7: Hide this binary data in a suitable cover using any standard embedding algorithm.
Step 8: Send the stego file (and/or key) to the receiver.
Algorithm for extracting secret message from binary text file:
Step 1: Extract the bit data by applying the extraction algorithm to the stego file.
Step 2: Make a dotted matrix with the agreed number of rows and columns.
Step 3: Separate the row data and the column data.
Step 4: Check the row data: where a 1 is present, connect the dots row-wise, otherwise do not.
Step 5: Check the column data: where a 1 is present, connect the dots column-wise, otherwise do not.
Step 6: The message is now recovered in dotted form.
Step 7: Convert the dotted-form message to the actual message.

3.2. Working of our approach
This section describes the working of our proposed technique.

3.2.1. Hiding process
3.2.1.1. Dotted pattern for our algorithm. We have generated our own dotted representation for A to Z, a to z, 0–9 and the special characters @, underscore and space as follows:

3.2.1.2. An example. This section shows an example of encoding the message ‘ABC’ as a bit string using the algorithm proposed in Section 3.1.1. First the message is represented in dotted form as given in Fig. 1 (following DPCES template 1).
Thus the key will be 9 (the number of columns); the number of rows is assumed to be fixed at 3. The binary coded data will be constructed as:
Row wise: 110100111101100000011011
Column wise: 110011110110110000
After merging we get
110100111101100000011011 110011110110110000
This is a representation of the message ‘ABC’ in binary form using DPCES template 1.

3.2.2. Revealing process
This section explains the revealing process for the hidden message. We assume here that the encoded message sent in Section 3.2.1.2 has been received by the receiver without any noise/disruption.
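The hiding and revealing steps can be illustrated with a minimal Python sketch, offered here as an illustration rather than as the authors' Visual Studio implementation. The names `TEMPLATE`, `encode` and `decode` are hypothetical, and since Fig. 1 is not reproduced in this text, the link patterns for A, B and C are an assumption inferred from the ‘ABC’ bit strings of Section 3.2.1.2. The sketch further assumes each character occupies a 3 × 3 dot cell, that adjacent cells are never linked, and that the merged string is the row-wise data followed by the column-wise data.

```python
# Link patterns for a 3x3 dot cell (hypothetical stand-in for Fig. 1,
# inferred from the 'ABC' example; only A, B and C are covered).
# "h": one 2-bit string per dot row (horizontal links, left to right).
# "v": one 2-bit string per dot column (vertical links, top to bottom).
TEMPLATE = {
    "A": {"h": ("11", "11", "00"), "v": ("11", "00", "11")},
    "B": {"h": ("10", "11", "11"), "v": ("11", "01", "10")},
    "C": {"h": ("11", "00", "11"), "v": ("11", "00", "00")},
}
ROWS, CELL_COLS = 3, 3


def encode(message):
    """Return (row_bits, column_bits, key); key is the number of dot columns."""
    cells = [TEMPLATE[ch] for ch in message]
    # Row-wise: walk each dot row left to right; the link between two
    # neighbouring cells is always absent, hence the "0" separator.
    row_bits = "".join("0".join(c["h"][r] for c in cells) for r in range(ROWS))
    # Column-wise: walk each dot column top to bottom, cell by cell.
    col_bits = "".join("".join(c["v"]) for c in cells)
    return row_bits, col_bits, CELL_COLS * len(message)


def decode(bits, key):
    """Split the merged bit string using the key, then look up each cell."""
    n_row_bits = ROWS * (key - 1)
    row_bits, col_bits = bits[:n_row_bits], bits[n_row_bits:]
    rows = [row_bits[r * (key - 1):(r + 1) * (key - 1)] for r in range(ROWS)]
    lookup = {(v["h"], v["v"]): ch for ch, v in TEMPLATE.items()}
    message = []
    for i in range(key // CELL_COLS):
        # Each cell owns 2 horizontal bits per row; the 3rd bit is the gap.
        h = tuple(row[3 * i:3 * i + 2] for row in rows)
        v = tuple(col_bits[6 * i + 2 * j:6 * i + 2 * j + 2]
                  for j in range(CELL_COLS))
        message.append(lookup[(h, v)])
    return "".join(message)
```

Under these assumptions, `encode("ABC")` yields the row-wise string 110100111101100000011011, the column-wise string 110011110110110000 and the key 9, and `decode` applied to their concatenation with key 9 returns ‘ABC’. A receiver without the template (the `lookup` table) can still split the bits but cannot map the dot patterns back to characters.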
3.3. Experimental results

3.3.1. Implementation of DPCES in image steganography
To simulate the proposed technique, we have developed software using Visual Studio 2015. The process proceeds in phases. In the first phase, the secret message is converted to its binary representation using the DPCES scheme, as depicted in Fig. 2. In the second phase, this binary string is embedded in a cover (here, an image). We used MATLAB code for message embedding to perform Least Significant Bit (LSB) steganography and hide the secret bit strings in the Lena image. The Lena image before and after embedding is shown in Fig. 3(a) and (b). The generated stego image is sent to the receiver through the public channel. The receiver extracts the bit string using the MATLAB code for message extraction. The bit strings are then mapped back to the secret message using the same mapping tool that was used in the embedding process, as depicted in Fig. 4.
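The LSB embedding and extraction step can be sketched as follows. The MATLAB code used in the experiments is not reproduced in this text, so this is a Python stand-in under simplifying assumptions: the cover is a flat list of 8-bit sample values, bits are written sequentially starting at the first sample, and the receiver knows the number of embedded bits. The function names `embed_lsb` and `extract_lsb` are hypothetical.

```python
def embed_lsb(pixels, bits):
    """Return a copy of `pixels` (ints in 0..255) whose least significant
    bits carry `bits` (a string of '0'/'1'), one bit per sample."""
    if len(bits) > len(pixels):
        raise ValueError("cover is too small for the message")
    stego = list(pixels)
    for i, b in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | int(b)
    return stego


def extract_lsb(pixels, n_bits):
    """Read back the first `n_bits` least significant bits as a string."""
    return "".join(str(p & 1) for p in pixels[:n_bits])


cover = [200, 201, 54, 55, 128, 255]
stego = embed_lsb(cover, "101100")
recovered = extract_lsb(stego, 6)
```

Each sample changes by at most 1, which is why the distortion is hard to see. The extracted bit string would then be handed to the DPCES decoder together with the key.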
4. Shortening of bit string length for each character
5. Performance comparison
Fig. 5. Proposed dotted representation to shorten bit string length per character (DPCES template 2), with panels for A to Z and a to z.
stego-images are shown in Fig. 7(a)–(d), which show that the distortions resulting from embedding are imperceptible to human perception. Table 1 shows the noise comparison between the original and stego images, measured by PSNR, for the three sample images in Fig. 7(a). We note that the PSNR values using our DPCES templates 1 and 2 are either the same as or slightly lower than those for the standard ASCII encoding scheme. Further, the personalized templates used in our technique do not produce significantly different images compared with standard encoding systems, as confirmed visually as well as through the quantified PSNR measure in dB. We have also analyzed the intensity histograms of the original and stego images after encoding the message using different encoding schemes, which are given in Fig. 8(a)–(d). They show that the disparities between the constructed histograms are barely perceptible.
The PSNR graph for different test images before and after message embedding using different encoding schemes is depicted in Fig. 9.
6. Security analysis

This section analyses the security aspects of the proposed DPCES encoding schemes in three parts:

6.1. Security is improved
If a traditional steganography scheme using a common encoding scheme is attacked and the secret bits are extracted by the intruder, the message can be revealed using tools publicly available online. If, instead, we use our own personal character-to-bit encoding scheme, the secret message is unlikely to be revealed in an attack, because the probability of the attacker or intruder knowing the specific encoding scheme is extremely low. This makes the steganography system considerably more secure.

6.2. Ability to encode a large number of characters for any script
In traditional encoding schemes each character is represented separately, so by observing the bit patterns one can extract the secret message. The proposed DPCES template 1 scheme uses 9 dots, where each character is represented by a distinct bit string of 12 bits (due to the presence of twelve edges). This gives 2^12 = 4096 distinct patterns, making it possible to represent many different characters from many different scripts. The DPCES template 2 encoding scheme uses 6 dots, represented by 7 bits each (due to the presence of seven edges); it thus has 2^7 = 128 distinct patterns, which is enough to represent the characters of many common scripts, such as English and the scripts of Indian languages.
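The edge counts behind these figures follow from the dot grid: a grid of r rows and c columns has r(c − 1) horizontal links and c(r − 1) vertical links. The short Python sketch below checks the two templates; the function names are hypothetical, and the 6-dot cell is assumed to be laid out as 3 rows by 2 columns (a 2 × 3 layout gives the same count).

```python
def edge_count(rows, cols):
    """Number of possible links between neighbouring dots in a grid."""
    horizontal = rows * (cols - 1)
    vertical = cols * (rows - 1)
    return horizontal + vertical


def pattern_capacity(rows, cols):
    """Number of distinct link patterns, one bit per possible link."""
    return 2 ** edge_count(rows, cols)


template1 = (edge_count(3, 3), pattern_capacity(3, 3))  # 9-dot cell
template2 = (edge_count(3, 2), pattern_capacity(3, 2))  # 6-dot cell
```

Here `template1` evaluates to (12, 4096) and `template2` to (7, 128), matching the counts stated in Section 6.2.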
6.3. Secret message is not revealable by attacker
The number of rows (or columns) works as a security key. Each character has a specific dot pattern, but the important point is that the complete message bit string is formed by combining the row-wise and column-wise links across the dotted form of the whole message, rather than by concatenating the bits of the individual characters in the message. Therefore, no one can extract the secret message by pattern resemblance of the bits, and its transmission will not raise suspicion in the mind of an intermediary. The encoding scheme is said to be personal as it does not require a fixed standard code for representing any character. It is set by the sender and
Table 1
PSNR calculation for comparing the original with the corresponding stego images for the message "My password is Susmita@2512" using LSB technique in MATLAB.

Stego image (color, 512 × 512)   Encoding scheme       PSNR (in dB)
Lena                             ASCII                 107.1214
Baboon                           ASCII                 107.1214
Pepper                           ASCII                 107.1214
Lena                             DPCES (template 1)    102.3502
Baboon                           DPCES (template 1)    102.3502
Pepper                           DPCES (template 1)    102.3502
Lena                             DPCES (template 2)    107.1214
Baboon                           DPCES (template 2)    107.1214
Pepper                           DPCES (template 2)    107.1214
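For reference, the PSNR measure reported in Table 1 is commonly defined as 10·log10(peak² / MSE), where MSE is the mean squared error between the original and stego samples. The Python sketch below uses that standard definition with a peak value of 255 for 8-bit samples. It is an assumption that the MATLAB computation behind Table 1 follows the same formula, and the sketch is not claimed to reproduce the table's values; the function name `psnr` is hypothetical.

```python
import math


def psnr(original, stego, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length
    sequences of sample values (standard definition, assumed here)."""
    if len(original) != len(stego):
        raise ValueError("images must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(original, stego)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10 * math.log10(peak ** 2 / mse)


# One sample out of four differs by 1, so MSE = 0.25.
example = psnr([0, 0, 0, 0], [0, 0, 0, 1])
```

Because LSB embedding changes each affected sample by at most 1, a longer bit string (more changed samples) raises the MSE and lowers the PSNR, which is consistent with the 12-bit template 1 scoring slightly below the shorter encodings.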
Fig. 8. Histograms of different test images before and after message embedding using different encoding schemes, with test images (i) Lena.png, (ii) Baboon.png and (iii) Pepper.png.