
Hindawi Publishing Corporation

EURASIP Journal on Information Security


Volume 2010, Article ID 621521, 9 pages
doi:10.1155/2010/621521

Research Article
Secure Arithmetic Coding with Error Detection Capability

Mahnaz Sinaie and Vahid Tabataba Vakili


Department of Electrical Engineering, Iran University of Science and Technology, Narmak, Tehran 1684613114, Iran

Correspondence should be addressed to Mahnaz Sinaie, [email protected]

Received 9 February 2010; Revised 23 May 2010; Accepted 7 September 2010

Academic Editor: Enrico Magli

Copyright © 2010 M. Sinaie and V. T. Vakili. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.

Arithmetic coding has recently attracted the attention of many scholars because of its high compression capability. Accordingly, this paper proposes a Joint Source-Cryptographic-Channel Coding (JSCC) scheme based on Arithmetic Coding (AC). For this purpose, arithmetic coding with embedded error detection, known as continuous error detection (CED), is used. In the proposed method, a forbidden symbol of random length, produced with a key, is used in each recursion. This dummy symbol is divided into two dummy symbols with a key, and these are then placed in random positions in order to provide security. In addition to producing secure codes, the suggested method reduces the added redundancy to half of the total redundancy added by CED. It has less complexity than cascaded source coding, channel coding, and encryption, while its key space is larger than that of other joint methods. Moreover, the coder provides a flexible switch between a standard compression model and a joint model.

1. Introduction

The increasing demand for the use of computer networks, the wide availability of digital multimedia content, and the accelerated growth of wired and wireless communications have resulted in new research areas in joint coders.

The design of modern multimedia communication systems is very challenging, as the system must satisfy several contrasting requirements [1]. Data compression is needed because it provides a mechanism to increase the effective bandwidth in a network and to serve the highest possible number of users. Data compression optimizes the required storage space and reduces transmission time in the network. On the one hand, compression typically makes the transmission very sensitive to errors or packet losses and can thus decrease the quality of the data received by the final users, so channel coding is required for error detection and correction [2]. On the other hand, source coding decreases redundancy in the plaintext, which makes the data more resistant to statistical methods of cryptanalysis [3]; additionally, the accessibility of data makes it possible for unauthorized users to reach the data easily. Therefore, to be transmitted reliably and confidentially, the data must be encrypted [4].

Many data compression techniques are available for efficient source coding [5, 6]. Strong error control codes have been developed for channel coding. In addition, some encryption algorithms have been developed for secure data transmission. Recent source coding, channel coding, and encryption algorithms require computational power for encoding and decoding. This is particularly unfavorable in certain applications such as mobile communications, embedded systems, and real-time communication, where devices (e.g., portable equipment) are resource constrained due to size limitations and power consumption considerations [2].

In real-time or satellite communication, delay and complexity are not desirable. Therefore, low-complexity JSCC is preferable for such situations. Techniques for joint source-channel coding, which have been proposed in this research, use the duality of source encoding and channel decoding and are aimed at decoding noisy compressed data as reliably as possible. The development of these joint algorithms has closely followed the development of source and channel coding algorithms.

Most of the early works on joint source-channel coding used different forms of Huffman codes. But nowadays, with the increasing interest in arithmetic coding in the multimedia

applications [7], for example, JPEG2000 and H.264, many researchers have been attracted to it. In 1997, Boyd et al. [8] introduced a forbidden symbol in the source alphabet and used it at the decoder side as an error detection device. Sayir [9] considered the arithmetic coder as a channel coder and added redundancy to the transmitted bit stream by introducing gaps in the coding space and shrinking the probability of symbols by a factor [10]. These joint coders embed error detection in the compressed data without providing essentially any security in the face of a chosen plaintext attack, in which an attacker has the ability to specify a sequence of input symbols, to observe the corresponding output, and to repeat this process an arbitrary number of times.

Some schemes for joint AC and encryption have also been proposed. Wen et al. [11] modified the traditional AC by removing the constraint that the interval corresponding to each symbol is continuous, so that the intervals associated with each symbol can be split according to a key which is known to both the encoder and the decoder. Grangetto et al. [12] proposed a method in which the traditional arithmetic coder is modified by randomly permuting the intervals in accordance with a key.

Magli et al. [1] developed a JSCC. It used the arithmetic coding proposed by Sayir and, to provide security, randomly permuted the intervals in accordance with a key-generated shuffling sequence, as introduced by Grangetto. Although this system is a JSCC, an attacker can break it by comparing N pairs of outputs with the corresponding inputs, which differ from each other in exactly one symbol. Teekaput and Chokchaitam [13] introduced a scheme for JSCC in which security was provided by changing the location of the forbidden symbol. This system resembles the system introduced by Magli et al., so it suffers from the same limitations.

In this paper, we present a method for joint source-cryptographic-channel coding based on arithmetic coding. This is very important in light of simplifying the design of the system. We use binary arithmetic coding with the forbidden symbol, which was introduced in [14] for error detection. Security is provided by using forbidden symbols of random length and by randomly placing these dummy symbols in the probability table. The compression ratio is improved in comparison with the systems in [1, 13]. Also, the actual key space is enlarged. The method can be used for arithmetic coding with multiple symbols; however, to simplify the presentation, we use binary AC.

The rest of this paper is organized as follows. In Section 2, we discuss arithmetic coding and arithmetic coding with a forbidden symbol in more detail. In Section 3, our proposed method for JSCC is described. In Section 4, the results obtained from the simulation and the performance of the system are explained. In Section 5, we draw some conclusions.

2. Arithmetic Coding and CED

This section provides a brief introduction to arithmetic coding and to AC with a forbidden symbol, which is named CED in [14]. Until AC was developed in the 1970s, Huffman coding was considered to be almost optimal. Huffman coding uses a tree for encoding a sequence, whereas AC uses a one-dimensional table of probabilities instead of a tree. AC always encodes the whole message at once and allows the allocation of a fractional number of bits to each source symbol. It generates a code sequence which is uniquely decodable, such that the probability distribution of the code sequence approaches the uniform distribution over the code alphabet [6].

AC works by recursively subdividing the coding interval in proportion to the probability estimates of the symbols, as generated by a given model, and retaining the subinterval of the encoded symbol as the new interval for the next encoding step of the recursion [5]. This is best illustrated with an example. Consider a source alphabet with three symbols [14], a, b, and c, with p(a) = 0.375, p(b) = 0.375, and p(c) = 0.25, and suppose we want to encode the sequence abc. After encoding a, the new interval is [0, 0.375), and the transmitted sequence will lie in this interval. The next symbol is b, and according to the intervals associated with each symbol, the next interval is [0.141, 0.281). This recursion continues to the end of the sequence. At the end, a number in the last interval, which is a fractional number between zero and one, is sent as the code of the sequence. This example is illustrated in Figure 1.

Figure 1: An example of arithmetic coding; the source symbols are a, b, c with p(a) = 0.011, p(b) = 0.011, p(c) = 0.010 in binary notation [10]. (Interval bounds shown: [0, 1) before encoding any symbol, [0, 0.375) after encoding a, [0.141, 0.281) after encoding b, and [0.246, 0.281) after encoding c.)

AC is a powerful source coding technique and has higher compression efficiency than other entropy coders. However, arithmetic coding has two major drawbacks: error sensitivity and error propagation. Error propagation, caused by loss of synchronization, can damage all the data that follow an error in the compressed stream. This loss of synchronization can be exploited for error detection. Anand et al. [14] introduced a forbidden symbol which does not belong to the source alphabet and is never encoded. To insert this dummy symbol into the probability table, the probabilities of the source symbols must be shrunk by a factor. The forbidden symbol has a small, finite probability assigned to it: if its probability is ε, the probabilities of the source symbols must be shrunk by the factor (1 − ε). The introduction of the forbidden symbol produces an amount of artificial coding redundancy per encoded bit equal to

Figure 2: Encoding with a forbidden symbol for probability ε. (Panels: the original probability space with symbols a1, a2, a3; reserving a probability space ε for the forbidden symbol f1, which leaves 1 − ε for the source symbols; arithmetic encoding with a forbidden symbol, where every subinterval containing f1 belongs to the reserved code space and the remaining subintervals form the valid code space.)
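The interval narrowing of the abc example in Section 2, and the effect of reserving a share ε of every interval for a forbidden symbol, can be sketched in a few lines of Python. This is only an illustrative floating-point sketch, not the authors' implementation: the function name narrow_interval is hypothetical, and placing the reserved region at the upper end of each interval is an arbitrary assumption made here.

```python
import math


def narrow_interval(sequence, probs, eps=0.0):
    """Return the final [low, high) interval after encoding `sequence`.

    `probs` maps each source symbol to its probability (in table order).
    `eps` is the probability reserved for the forbidden symbol; in this
    sketch the reserved region sits at the upper end of every interval
    (an assumption) and is never selected by the encoder.
    """
    low, high = 0.0, 1.0
    for symbol in sequence:
        width = high - low
        cumulative = 0.0
        for candidate, p in probs.items():
            shrunk = p * (1.0 - eps)  # source probabilities shrink by (1 - eps)
            if candidate == symbol:
                high = low + width * (cumulative + shrunk)
                low = low + width * cumulative
                break
            cumulative += shrunk
        else:
            raise ValueError(f"symbol {symbol!r} is not in the probability table")
    return low, high


probs = {"a": 0.375, "b": 0.375, "c": 0.25}
plain = narrow_interval("abc", probs)            # conventional AC
ced = narrow_interval("abc", probs, eps=0.03)    # AC with a forbidden symbol

# Extra bits spent because the final interval became narrower
extra_bits = math.log2((plain[1] - plain[0]) / (ced[1] - ced[0]))
print(plain, ced, extra_bits)
```

With eps = 0 the final interval is [0.24609375, 0.28125), which rounds to the bounds 0.246 and 0.281 quoted for the example. With eps = 0.03 the final interval is narrower by the factor (1 − ε)^3, which costs about 0.13 additional bits for the three symbols.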

−log2(1 − ε), at the expense of compression efficiency [14]. The decoder gains an error detection capability, which enhances its robustness against noise: if an error occurs, the forbidden symbol is very likely to be decoded eventually. Figure 2 illustrates a sample of binary AC subinterval separation obtained by inserting a forbidden symbol in the current interval.

This forbidden symbol can be placed anywhere in the probability table, and there can also be more than one forbidden symbol, placed at more than one location in the probability table. In conventional CED, the probability of the forbidden symbol is fixed, and the forbidden symbol also stays at the same location for the whole encoding process. Before transmission, the encoder and decoder should negotiate the location and size of the forbidden symbol [13]. If its probability is fixed for the whole encoding process, then the bit rate of the code is fixed, and the amount of added redundancy is fixed at −log2(1 − ε) bits per symbol.

If we take the maximum bit rate needed into account, and require only that the bit rate in each recursion not exceed the maximum bit rate, we can change the bit rate while encoding. This adds less redundancy to the bit stream and gives higher security. We describe this in more detail in Section 3.

3. Scheme of the Proposed Model

The present paper aims to provide an arithmetic coding system which is secure and has an error detection capability. Our scheme is based on CED, in which a forbidden region with probability ε is added to the probability table to provide some redundancy, so that a synchronized decoder can detect the occurrence of errors and conceal wrong decision bits. The combined data encryption and AC uses the error propagation property of AC to provide security. Our proposed technique uses forbidden symbols with random lengths and places them in random locations. The flowchart of the proposed technique is shown in Figure 3. While the concept of this scheme can be applied to a source alphabet of any size, for simplicity, the remainder of the discussion focuses on the binary case.

3.1. Inserting Forbidden Symbols. In conventional CED, the probability of the forbidden symbol is fixed at the beginning of the encoding process; this probability, named ε, is determined by (1). It depends on the maximum bit rate, R, and the entropy of the source, H(A) [9]:

\[ \varepsilon = 1 - \left(\frac{1}{2}\right)^{(1/R - 1)\,H(A)}. \tag{1} \]

Adding the forbidden symbol adds redundancy to the output, which can be used as a means of error detection. This method does not provide enough security against attacks; therefore, our scheme uses a random-length forbidden symbol in each recursion instead of a fixed-length one. In each recursion, a random generator produces a forbidden symbol in the range [0, ε), in which ε is determined from the maximum bit rate, R, using (1). The probability of the forbidden symbol generated in each recursion is named γ. By using this random forbidden symbol in every recursion, we shrink the probabilities of the symbols by the factor (1 − γ). This adds a random amount of redundancy while encoding each input symbol. In addition, we can claim that we have a semiadaptive arithmetic coder, because in each recursion a different length of the forbidden symbol is produced. It leads to a different shrinking factor

in each recursion. Therefore, the probabilities of the source symbols are shrunk by varying factors.

In the previous section we noted that there can be more than one forbidden symbol; therefore, we use two forbidden symbols in this method. Since the sum of the probabilities of the two forbidden symbols must be equal to γ, we can either divide the generated forbidden symbol, μ, equally in each recursion, or generate another forbidden symbol in the range [0, γ) and thereby divide the forbidden symbol, uniformly at random, into two forbidden symbols μ1 and μ2 with probabilities γ1 and γ2.

γ1 and γ2 represent the encryption key, which is also referred to as K in the following sections, and are adjusted with a proper precision in an acceptable range depending on the requirements of different applications. At the decoder side, if a synchronized decoder is applied, that is, one adding the same γ1 and γ2 at each coding step, the data will be reconstructed accurately. Otherwise, whether a standard AC decoder or a decoder of the proposed scheme with different γ1 and γ2 is used, the encoded code stream cannot be correctly decoded.

3.2. Establishing and Selecting the Probability Table. In Section 2 we noted that the forbidden symbol can be placed anywhere in the probability table. In binary AC, it can be placed at the beginning, in the middle, or at the end of the probability table. We use a Pseudorandom Number Generator (PRNG) to control the placement of the forbidden symbols. A seed value, S, which also represents another encryption key, is used to initialize the PRNG. The bits of the generated random sequence are used as an encryption key in each recursion. In practice, the random sequence takes the values 0 and 1 with probability 0.5 each, and it serves as the controlling bit sequence.

If we divide the forbidden symbol μ unequally, μ1 and μ2 have different ranges, so a binary memoryless source, X, with probabilities P0 and P1 is encoded by means of a quaternary AC with the alphabet a, b, μ1, μ2. There are different possibilities for allocating these symbols, which are listed in Table 1. In this situation, for encoding each input symbol, we use 3 bits of the random sequence generated by the PRNG as a key to control the locations of the forbidden symbols. If, however, we divide μ equally, we have a ternary AC, and Table 2 is its look up table. This look up table uses 2 bits for determining the locations of the forbidden symbols.

To conclude, we do not encrypt the code string, which would produce a totally different value, but only secretly add subintervals and secretly place them. The proposed encoder works with a key K = (γ1, γ2, S), which represents the final encryption key. Given the same K, both the encoder and the decoder generate the same pseudorandom number sequence for the decision bits and add exactly the same γ1 and γ2 to the corresponding code string in order to stay synchronized with each other. On the other hand, no matter which parameter of K is unknown or incorrectly given, the decoder cannot decode the compressed data properly, and the decompressed data is almost meaningless. Furthermore, when γ1 and γ2 are set to 0, our scheme achieves a simple switch from the joint compression, error detection, and encryption model to a standard compression model. Also, by setting the sum of γ1 and γ2 equal to ε, this JSCC is transformed into joint compression and error detection. Thus, this scheme can be used for selective encryption and applied to the portions of data which need more security. Nevertheless, an efficient and

Table 1: Mapping function of binary arithmetic codes with two different lengths of forbidden symbols (look up table).

Situations        Situations
(a b μ1 μ2)       (μ1 μ2 a b)
(a μ1 b μ2)       (μ2 a μ1 b)
(μ1 a b μ2)       (a μ2 b μ1)
(μ1 a μ2 b)       (μ2 a b μ1)

Table 2: Mapping function of binary arithmetic codes with two equal lengths of forbidden symbol (look up table).

Situations
(a b μ1 μ2)
(a μ1 b μ2)
(μ1 a b μ2)
(μ1 a μ2 b)

Figure 3: Flowchart of the proposed scheme. (Steps shown: start; initialize PRNGs; read a new input symbol; if end of file, stop; otherwise produce μ2, μ1 by random generator; establish look up Table 1 or 2; determine the probability table for encoding the current symbol; perform the arithmetic coding; send the encoded bits.)
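The per-recursion steps of Sections 3.1 and 3.2 (draw a random forbidden probability γ below ε, split it into γ1 and γ2, and let key bits select one arrangement from Table 1 or Table 2) can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, a single Python random.Random instance stands in for the keyed PRNGs (it is not cryptographically secure), and the assignment of key-bit values to table rows is an assumption.

```python
import random

# Orderings of the source symbols (a, b) and forbidden symbols (mu1, mu2)
# as listed in Table 1 (unequal split) and Table 2 (equal split).
TABLE_1 = [
    ("a", "b", "mu1", "mu2"), ("a", "mu1", "b", "mu2"),
    ("mu1", "a", "b", "mu2"), ("mu1", "a", "mu2", "b"),
    ("mu1", "mu2", "a", "b"), ("mu2", "a", "mu1", "b"),
    ("a", "mu2", "b", "mu1"), ("mu2", "a", "b", "mu1"),
]
TABLE_2 = TABLE_1[:4]


def max_forbidden_probability(rate, entropy):
    """Equation (1): eps = 1 - (1/2) ** ((1/R - 1) * H(A))."""
    return 1.0 - 0.5 ** ((1.0 / rate - 1.0) * entropy)


def probability_table(p_a, eps, rng, equal_split=False):
    """Build the probability table for one recursion.

    Returns a dict mapping each symbol name to its (start, end) share of
    the unit interval. The mapping of key bits to table rows is an
    assumption of this sketch.
    """
    gamma = rng.uniform(0.0, eps)  # random forbidden probability, at most eps
    gamma1 = gamma / 2 if equal_split else rng.uniform(0.0, gamma)
    gamma2 = gamma - gamma1
    rows = TABLE_2 if equal_split else TABLE_1
    order = rows[rng.getrandbits(2 if equal_split else 3)]  # 2 or 3 key bits
    widths = {
        "a": p_a * (1.0 - gamma),  # source symbols shrink by (1 - gamma)
        "b": (1.0 - p_a) * (1.0 - gamma),
        "mu1": gamma1,
        "mu2": gamma2,
    }
    table, start = {}, 0.0
    for name in order:
        table[name] = (start, start + widths[name])
        start += widths[name]
    return table


eps = max_forbidden_probability(rate=0.9, entropy=0.9183)
encoder_rng = random.Random(2010)  # the shared seed stands in for the key S
decoder_rng = random.Random(2010)
encoder_table = probability_table(1 / 3, eps, encoder_rng)
decoder_table = probability_table(1 / 3, eps, decoder_rng)
print(encoder_table == decoder_table)
```

Two generators started from the same seed produce identical tables, which is the synchronization property the scheme relies on. Calling probability_table with eps = 0 gives zero-width forbidden symbols, which corresponds to the switch back to a standard compression model.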

secure key distribution protocol is one of the challenging issues and is beyond the scope of this paper.

4. Simulation Results

Our proposed scheme has been implemented in Matlab on a personal computer with 2 GB of RAM and an Intel Centrino Core 2 Duo 2.2 GHz CPU. Due to unstable processes in computer systems, we ran 20 trials and selected the most frequently occurring results as the final values. Input symbols, upper and lower bounds, and the forbidden symbols produced in each recursion are set with a precision of 10^{−6}, which is equivalent to a 16-bit implementation. It is worth noting that this precision is not fixed and can be flexibly adjusted depending on the requirements of the target application.

4.1. Compression Ratio. The Joint Source-Cryptographic-Channel model should be used under the precondition that modifying the standard coding engine does not generate a large amount of redundancy. Table 3 shows the results of applying the proposed method to input sequences with lengths of 100, 1000, and 10000 symbols and allows for comparison with traditional arithmetic coding in absolute as well as relative terms. The upper half of the table considers the case where p(a) = 1/3, and the lower half of the table considers the case where p(b) = 5/6. The exact length of the output depends not only on the input data but also on the specific sequence of forbidden symbols, as well as on their locations and lengths in each recursion. Therefore, the code lengths shown in the table are averages based on simulations using 1000 random sequence realizations. The column labeled "proposed method" gives the mean of the code lengths based on a large number of simulations using random seeds for locations and lengths. These results show that, in order to limit coding redundancy, ε should be confined to a limited range, which can be flexibly controlled according to the requirements of various application systems. Table 3 shows that the redundancy added to the bit stream by our model is half of the redundancy which the conventional method adds to the bit stream. For ε = 0.03, the redundancy is 0.0439 bits per symbol. If ε is fixed, then for a sequence of, for example, N = 100 symbols, the added redundancy is 4.39 bits in total, whereas our proposed method adds 2.1 bits.

Using the forbidden symbol in the source alphabet actually aims at simply detecting errors, not correcting them. Although randomizing the forbidden symbol reduces the amount of added redundancy to half, this does not interfere with the capability of error detection. Anand et al. [14] gave an empirical model to estimate the number of bits necessary to detect an error after it has occurred:

\[ P_y(k) = (1 - \varepsilon)^{k-1}\,\varepsilon, \quad k = 1, 2, \ldots, \infty. \tag{2} \]

The probability of not detecting an error after n bits is

\[ P(n) = (1 - \varepsilon)^{n}. \tag{3} \]

Based on extensive simulations, it is concluded that if n bits are needed for detecting an error after it has occurred in the CED method, (6 ∗ n)/5 bits are needed for error detection in our proposed method. We can compensate for this shortcoming by assuming greater lengths for the input blocks in the proposed encoder. However, adding security and error detection capability to a compression encoder often leads to a compromise between the amount of compression achieved and the amount of security and robustness against channel errors incorporated.

The encoded stream can be reconstructed perfectly by providing the same K and reversing the encoding operations. With the same K, both encoder and decoder generate the same pseudorandom number sequence for the decision bits and add exactly the same γ1 and γ2 to the corresponding code string in order to stay synchronized with each other. As soon as the forbidden symbol is decoded, the occurrence of an error in the received sequence is detected. However, this method of decoding is not capable of correcting the errors, although the redundancy of the encoder's output can be used for correcting errors.

Arithmetic codes can be viewed as tree codes. Sequential decoding is a general decoding algorithm for tree codes. It was introduced by Wozencraft and Reiffen to decode convolutional codes in [15]. Fano [16] presented an improved sequential algorithm in 1963, which is now known as the Fano algorithm. Pettijohn et al. [17, 18] proposed two sequential decoding algorithms, depth first and breadth first, for decoding arithmetic codes in the presence of channel errors. We can use these decoding algorithms with the same key for decoding the output of our proposed scheme.

4.2. Complexity. Sayir [10] showed that an arithmetic coder can be an entropy source encoder when the model is matched to the source, and can be a channel encoder, acting as a convolutional code, when the probability space is properly reserved for error protection. After inserting the forbidden symbol into a source with an alphabet of M symbols, we have an arithmetic coder with an alphabet of M + 1 symbols in which one of the symbols never appears. Therefore, parity is added during compression without adding many additional operations to conventional arithmetic coding. If the source has an alphabet of M symbols, this method adds just M multiplications and one addition to the complexity of the conventional arithmetic encoder. If, instead, a convolutional encoder is placed after the arithmetic encoder, then, depending on the amount of redundancy, it needs some shift and XOR operations and more memory. For example, if the bit rate is 1/2 and the code generator polynomial is p(x) = x^2 + x, it would need at least three shift register and XOR operations for each input symbol.

Also, because a traditional arithmetic coder needs to work sequentially, arithmetic coding and convolutional coding cannot be parallelized. A comparison of the time required for arithmetic coding and for arithmetic coding followed by a rate-1/2 feedforward convolutional encoder is shown in Figure 4.

Placing the forbidden symbol in different locations and assigning random lengths of the forbidden symbols

Table 3: Comparison of code lengths as a function of sequence length N.

Symbol probability      N        N∗H      AC          AC with fixed length   Proposed method with
                                                      ε = 0.03               maximum ε = 0.03
P(a) = 1/3,             10       9.183    9.1890      9.7390                 9.4470
H = entropy = .9183     100      91.83    91.9100     96.3610                94.0330
                        1000     918.3    918.6810    962.7090               941.2350
                        10000    9183     9183.700    9622.410               9404.46
P(a) = 5/6,             10       5.917    6.0410      6.4450                 6.0100
H = entropy = .5917     100      59.17    65.7390     69.6020                67.3620
                        1000     591.7    650.3070    694.9340               672.7320
                        10000    5917     6500.250    6940.01                6717.720
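The redundancy figures discussed in Section 4.1 can be checked against the tabulated code lengths with a short script. This is a sketch for illustration only: the helper names are hypothetical, and the rows are the P(a) = 1/3 entries copied from Table 3.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a binary source with symbol probability p."""
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def forbidden_redundancy(eps):
    """Redundancy in bits per symbol added by a fixed forbidden symbol."""
    return -math.log2(1.0 - eps)


# Rows for P(a) = 1/3 copied from Table 3:
# N -> (AC, AC with fixed-length eps = 0.03, proposed method)
TABLE_3_P13 = {
    10: (9.1890, 9.7390, 9.4470),
    100: (91.9100, 96.3610, 94.0330),
    1000: (918.6810, 962.7090, 941.2350),
    10000: (9183.700, 9622.410, 9404.46),
}

per_symbol = forbidden_redundancy(0.03)
ratios = {}
for n, (ac, fixed, proposed) in TABLE_3_P13.items():
    fixed_overhead = fixed - ac        # bits added by conventional CED
    proposed_overhead = proposed - ac  # bits added by the proposed method
    ratios[n] = proposed_overhead / fixed_overhead
    print(n, round(n * per_symbol, 3), round(fixed_overhead, 3),
          round(proposed_overhead, 3), round(ratios[n], 3))
```

For these rows the proposed method adds roughly half of the overhead of the fixed forbidden symbol, in line with the claim in the text. Evaluating binary_entropy(5/6) gives about 0.650 bits, which is in line with the AC column of the lower half of the table (for example, 650.3 bits for N = 1000).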

Figure 4: Comparison of AC with forbidden symbol and cascaded arithmetic coding with convolutional coder. (Plot of time duration in seconds versus file size in bytes ×10^3, for file sizes from 8 to 100; curves: disjoint source, channel coder and joint source, channel coder.)

Figure 5: Comparison of proposed system and cascaded arithmetic coding with AES and convolutional coder. (Plot of time duration in seconds versus file size in bytes ×10^3, for file sizes from 8 to 100; curves: proposed method and cascaded AC, AES and convolutional coder.)
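The detection-delay model of equations (2) and (3) in Section 4.1 is a geometric distribution with parameter ε, so it can be evaluated numerically. The sketch below is illustrative only: the function names and the 1% miss target are assumptions, and the factor 6/5 is simply the empirical rule reported in the text for the proposed method.

```python
import math


def delay_pmf(eps, k):
    """Equation (2): probability that detection takes exactly k bits."""
    return (1.0 - eps) ** (k - 1) * eps


def miss_probability(eps, n):
    """Equation (3): probability that an error is still undetected after n bits."""
    return (1.0 - eps) ** n


def bits_needed(eps, target):
    """Smallest n for which the miss probability drops to `target` or below."""
    return math.ceil(math.log(target) / math.log(1.0 - eps))


eps = 0.03
n_ced = bits_needed(eps, 0.01)     # conventional CED, 1% miss target
n_proposed = 6 * n_ced / 5         # 6n/5 rule reported for the proposed method
print(n_ced, n_proposed, 1 / eps)  # 1/eps is the mean of the geometric delay
```

For ε = 0.03 and a 1% miss target this gives n = 152 bits for conventional CED and about 182 bits under the 6n/5 rule, while the mean detection delay of the model is 1/ε, roughly 33 bits.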

increase computational complexity. This extra computational complexity of joint AC and channel coding is very small in comparison with the complexity of three disjoint coders.

It is relevant to consider a system consisting of a traditional arithmetic encoder followed by AES, which, of course, would also deliver security and compression. Since AES was designed for efficient hardware implementation, it is extremely fast when it is fully pipelined in hardware [19]. However, because a traditional arithmetic coder needs to work sequentially, the AC cannot easily be parallelized and becomes a bottleneck in a combined AC/AES system [7]. AES consists of 40 sequential transformation steps composed of simple and basic operations such as table lookups, shifts, and XORs. For a block size of 128, these steps require a total of 19 shifts, the use of 336 bytes of memory, and the XORing of approximately 608 bytes of data (the exact requirement is data dependent). In contrast, our proposed technique adds a maximum of 20 bytes of memory, no XOR, and no shift operations to conventional AC. For a block size of 128 and a source with a binary alphabet, our proposed method adds as many as 128 × 2 × 2 operations to the conventional arithmetic coding operations. This number is much smaller than the number of operations added to the AC by disjoint coders. Figure 5 compares the time required by binary arithmetic coding for p(a) = 1/3 followed by AES with a block size of N = 128 and a rate-1/2 feedforward convolutional encoder with that of our proposed method. We can see that our system takes much less time than the cascaded system.

Our proposed technique can be implemented using techniques similar to those used in traditional arithmetic coding and can benefit from the same optimizations for speed, finite precision, and so forth. Inserting the forbidden symbol into the probability table adds no complexity to the arithmetic coder; only establishing the probability table and searching the look up table increase the amount of memory, which is needed to store the look up table and the probabilities of the forbidden symbols. In addition, dividing the forbidden symbol and updating the probabilities of the symbols by the factor (1 − γ) in each recursion introduce an additional multiplication, though, as with traditional arithmetic coding, faster

algorithms that replace the multiplications with simpler operations can be introduced [20].

4.3. Security Analysis. A good encryption procedure should be robust against all kinds of cryptanalytic, statistical, and brute-force attacks. In this section, we discuss the security analysis of the proposed encryption scheme. This includes statistical analysis, key space analysis, and sensitivity analysis of the proposed encryption scheme with respect to the key and plaintext, and so forth, to show that the proposed cryptosystem is secure against the most common attacks.

4.3.1. Key Space. For a secure encryption algorithm, the key space should be large enough to make a brute-force attack infeasible. The main private information in our proposed scheme is the keys used in the PRNGs, each of which is 128 bits long. These PRNGs generate random sequences which are used by the proposed technique as a secret key in each recursion.

The proposed cipher has 2^{128×3} different combinations of the secret key, and the key space of our proposed method is larger than that of the methods introduced in [1, 12]. A cipher with such a large key space is sufficient for reliable practical use in multimedia communications.

As mentioned above, the proposed encoder uses generated random sequences as its secret key in each recursion. In [13] there are only two possible choices in one recursion: at the beginning of the probability table or at the end. Even though the swapping probability is also used as a key parameter in that method, here there are other keys, γ1 and γ2, and an attacker must decode the received sequence using all possible seeds, S, or all possible γ1 and γ2 to access the correct data.

If the precision of γ is set to 16 bits, one should try 2^{32} trials for estimating each forbidden symbol in one recursion and 2^{3} trials for finding the arrangement of the probability table in each recursion; therefore, the actual key space in each recursion can be 2^{34} times larger than the key space in [13]. In this proposed method, even if we suppose that the key S is known to the attacker, he cannot find out what random value is added at which positions, and as long as the attacker is not aware of the values of the forbidden symbols, he cannot access the status of the probability table in each recursion.

4.3.2. NIST SP 800-22 Test for Cipher. In this study, NIST SP 800-22 [21] tests are used for testing the randomness of the cipher. The NIST Test Suite is a statistical package consisting of 16 tests that were developed to test the randomness of binary sequences, with arbitrary lengths, produced by either hardware or software-based cryptographic random or pseudorandom number generators. These tests focus on a variety of different types of nonrandomness that could exist in a sequence. Hence, in this test the cipher sequence, whose length is 10^6, is examined. The results of testing the randomness of the cipher are shown in Table 4. We can conclude from Table 4 that the cipher which was encrypted from this encoder is stochastic and it has robustness against

4.3.3. Sensitivity Analysis. An ideal procedure of data encryption should be sensitive to both the secret key and the plaintext. The change of a single bit in either the secret key or the plaintext should produce completely different encrypted data. To prove the robustness of the proposed scheme, we performed sensitivity analysis with respect to both the secret key and the plaintext.

(A) Sensitivity Analysis of the Cipher to Key. For testing the key sensitivity of the proposed coder, we performed the following steps:

(a) changing one bit of S1, which determines the forbidden symbol length in each recursion,
(b) changing one bit of S2, which divides the forbidden symbol into two different forbidden symbols in each recursion,
(c) changing one bit of S3, which determines the probability table in each recursion,
(d) changing just one bit of the three main keys.

It is not easy to compare the encrypted outputs by simply observing them. Thus, for the comparison, we calculated the correlation between the corresponding bits of the four encrypted data by (4) [22]:

\[ C_r = \frac{N \sum_{j=1}^{N} x_j y_j - \sum_{j=1}^{N} x_j \sum_{j=1}^{N} y_j}{\sqrt{\left(N \sum_{j=1}^{N} x_j^2 - \left(\sum_{j=1}^{N} x_j\right)^2\right) \left(N \sum_{j=1}^{N} y_j^2 - \left(\sum_{j=1}^{N} y_j\right)^2\right)}}, \tag{4} \]

where x_j and y_j are the values of corresponding bits in the two encrypted outputs to be compared and N is the total number of output bits.

We performed the above-mentioned steps for several different keys. Then, we calculated the correlation coefficient for the encoded sequences by using (4). In all the cases, very small correlation coefficients of the corresponding outputs were obtained. For instance, Table 5 shows the correlation coefficients between encoded sequences with S1, S2, and S3 keys for the outputs from the steps (a) to (d) based on changing the first bits of the keys.

As Table 5 shows, no correlation exists among the three encrypted outputs even though these have been produced by using only slightly different secret keys. Also, based on the comparison of outputs of the proposed scheme for a large number of inputs, it was found that changing one symbol in the plaintext will result in a completely different output by more than 99%. This shows that inputs differing in even one symbol will result in different outputs.

It can also be concluded from this table that all the ambiguities of the proposed coder are independent of each other. Therefore, even if the attacker finds access to one of the keys, no information about the other keys is released
known cipher-text attack. by that one.
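The key-sensitivity check based on the correlation coefficient of (4) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' simulation code: the function name `bit_correlation` is hypothetical, the square root in the denominator assumes that (4) is the standard Pearson form, and the inputs are assumed to be two equal-length lists of 0/1 values taken from two encoded outputs.

```python
import math


def bit_correlation(x, y):
    """Correlation coefficient between two equal-length bit sequences,
    assuming eq. (4) is the standard Pearson correlation."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # A constant sequence has zero variance, so the ratio is undefined.
        raise ValueError("correlation is undefined for a constant sequence")
    return numerator / denominator
```

Identical sequences give a coefficient of 1 and bitwise-complementary sequences give −1, so values close to 0, such as those reported in Table 5, indicate that the compared outputs are nearly uncorrelated.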
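To illustrate the kind of check summarized in Table 4, the frequency (monobit) test of NIST SP 800-22 can be sketched as follows. This is a minimal sketch and not the NIST reference implementation or the authors' test setup: the function names are hypothetical, the input is assumed to be a list of 0/1 values, and the 0.01 significance level is the default suggested in the NIST document.

```python
import math


def monobit_p_value(bits):
    """P-value of the frequency (monobit) test for a sequence of 0/1 values."""
    n = len(bits)
    if n == 0:
        raise ValueError("the bit sequence must not be empty")
    # Map 0 -> -1 and 1 -> +1, then sum the mapped values.
    s_n = sum(1 if b else -1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


def passes_monobit(bits, alpha=0.01):
    """The sequence is accepted as random when the P-value is at least alpha."""
    return monobit_p_value(bits) >= alpha
```

A balanced sequence yields a P-value close to 1, while a sequence dominated by one bit value yields a P-value near 0 and fails the test. The other tests in the suite follow the same pattern of computing a statistic and converting it to a P-value.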

Table 4: SP 800-22 test results of the cipher.

Statistical test                                        P-value   Result
Monobit                                                 1.1       Success
Block frequency (m = 128)                               0.2251    Success
Runs                                                    0.8513    Success
Rank                                                    0.2335    Success
Spectral DFT                                            0.8513    Success
Nonoverlapping templates (M = 1032, B = 110101010)      0.1799    Success
Overlapping templates (m = 9, M = 933, B = 110101010)   1         Success
Serial, P-value 1                                       0.9640    Success
Serial, P-value 2                                       0.8729    Success
Cumulative sums, forward                                0.7573    Success
Cumulative sums, reverse                                0.6686    Success
Random excursions (state x), X = −4                     0.9343    Success
Random excursions (state x), X = −3                     0.8757    Success
Random excursions (state x), X = −2                     0.9008    Success
Random excursions (state x), X = −1                     0.8787    Success
Random excursions (state x), X = 1                      0.2435    Success
Random excursions (state x), X = 2                      0.8922    Success
Random excursions (state x), X = 3                      0.6260    Success
Random excursions (state x), X = 4                      0.8816    Success
Random excursions variant (state x), X = −9             0.8319    Success
Random excursions variant (state x), X = −8             0.9067    Success
Random excursions variant (state x), X = −7             0.8503    Success
Random excursions variant (state x), X = −6             0.9922    Success
Random excursions variant (state x), X = −5             0.6578    Success
Random excursions variant (state x), X = −4             0.6592    Success
Random excursions variant (state x), X = −3             0.9653    Success
Random excursions variant (state x), X = −2             0.7504    Success
Random excursions variant (state x), X = −1             0.6973    Success
Random excursions variant (state x), X = 1              0.6269    Success
Random excursions variant (state x), X = 2              0.7790    Success
Random excursions variant (state x), X = 3              0.4957    Success
Random excursions variant (state x), X = 4              0.4850    Success
Random excursions variant (state x), X = 5              0.8543    Success
Random excursions variant (state x), X = 6              0.2173    Success
Random excursions variant (state x), X = 7              0.6084    Success
Random excursions variant (state x), X = 8              0.5193    Success
Random excursions variant (state x), X = 9              0.5294    Success

Table 5: Correlation coefficients of different outputs.

Output                                   Correlation coefficient
Changing first bit of S1                 0.0228
Changing first bit of S2                 0.0170
Changing first bit of S3                 −0.0045
Changing first bit of S1, S2, S3         −0.0350

(B) Sensitivity Analysis of Cipher to Plaintext. Generally, an attacker may make a slight change in the plaintext. In order to test the influence of changing a single bit in the original data, the correlation coefficients between the corresponding output sequences were calculated for the changes in the input sequence. As expected, the correlation coefficients were very small.
Since the proposed coder is simulated for binary inputs and the output is also binary, we can calculate the bit change rate of the cipher instead of correlation coefficients. A change of one bit in the plaintext should theoretically produce a 50% difference [22] in the bits of the cipher. We also developed a test for the change rate of the cipher bits. The change rate was 49.41%. For all these reasons, the proposed scheme proves to be sensitive to changes in the input and hence an ideal coder.

4.3.4. Different Attacks. According to both the above analyses and the following reasons, the proposed algorithm is resistant to chosen plaintext attacks.
(i) The model dynamically reorders the frequency of the input symbols according to the length of the random forbidden symbols in each recursion.
(ii) The output from the engine is in the form of words with variable sizes, so the individual bits of the output corresponding to the inserted symbols cannot be determined.
The entropy, H(S), of a message source, S, can be calculated by (5):

$$H(S) = \sum_{s_i} P(s_i) \log_2 \frac{1}{P(s_i)} \ \text{bits}, \qquad (5)$$

where $P(s_i)$ represents the probability of symbol $s_i$. The entropy is expressed in bits. If the source emits 2 symbols with equal probability, that is, S = {s1, s2}, then the entropy is H(S) = 1, corresponding to a true random sequence. The entropy measured in the system test is 0.9974, so the system can resist entropy attacks.
Another large class of attacks is based on the analysis of statistical properties of the output bit stream $B = B_1 B_2 \cdots B_{N_c}$, where $N_c$ is the output length. It is thus important to investigate the statistics of B. Various simulations showed that the output of the proposed coder had $P(B_i = 0) = P(B_i = 1) = 1/2$ for any i. Therefore, from the first-order statistics, the attacker cannot find any information regarding the secret key.
Alternatively, the attacker may wish to recover the key stream which is used in the proposed method. Suppose that the input symbol sequence length is N. The length of the key stream used in the method is then $L_N = (2 \times 16 \times N) + 3 \times N$. Assume that the generated bit stream is of length $N_c$. Then, the total complexity of breaking the key stream is $2^{N_c + L_N}$. In the case that the input symbol sequence length is sufficiently
large, so that $2^{N_c + L_N} > 2^{128 \times 3}$, the attacker would rather use a brute-force attack to break the secret key utilized in the PRNGs.
A pseudorandom sequence is vulnerable to known plaintext attacks; since there is a given known input sequence, the attacker can compare the sequences coded by the joint source-channel coder and by the proposed coder and attempt to find the added subintervals and their locations. To increase the security, an efficient key distribution protocol could also be explored in our algorithm to provide sufficient encryption.

5. Conclusion
In this paper, a scheme has been presented which combines compression, error detection, and data encryption. The proposed technique provides security by adding a little complexity to CED. It adds two random subintervals μ1 and μ2 to the probability interval in each iterative coding step and controls the locations of the forbidden symbol by a PRNG with a seed, S, while the key is K = (S, γ1, γ2) in each recursion. Moreover, it easily switches to standard arithmetic coding by setting γ1 and γ2 equal to zero when the data do not need to be protected. This coder causes the added redundancy to be almost halved without any notable effect on error detection capability. The proposed technique is less complicated and faster than cascaded systems; therefore, it is more suitable for real-time applications. The technique can also be extended to selectively encrypting data and images. The proposed method can be used in ARQ systems for error detection and error correction.

Acknowledgment
The authors would like to thank ITRC (Iran Telecommunication Research Center) for the invaluable assistance and funding for this paper.

References
[1] E. Magli, M. Grangetto, and G. Olmo, “Joint source, channel coding, and secrecy,” EURASIP Journal on Information Security, vol. 2007, Article ID 79048, 7 pages, 2007.
[2] H. Kaneko and E. Fujiwara, “Joint source-cryptographic-channel coding based on linear block codes,” in Applicable Algebra in Engineering, Communication and Computing, vol. 4851 of Lecture Notes in Computer Science, pp. 158–167, 2007.
[3] R. Bose and S. Pathak, “A novel compression and encryption scheme using variable model arithmetic coding and coupled chaotic system,” IEEE Transactions on Circuits and Systems, vol. 53, no. 4, pp. 848–857, 2006.
[4] D. Xie and C.-C. J. Kuo, “Multimedia encryption with joint randomized entropy coding and rotation in partitioned bitstream,” EURASIP Journal on Information Security, vol. 2007, Article ID 35262, 12 pages, 2007.
[5] A. Moffat, R. M. Neal, and I. H. Witten, “Arithmetic coding revisited,” ACM Transactions on Information Systems, vol. 16, no. 3, pp. 256–294, 1998.
[6] T. Cover and J. Thomas, Elements of Information Theory, John Wiley & Sons, New York, NY, USA, 1991.
[7] H. Kim, J. Wen, and J. D. Villasenor, “Secure arithmetic coding,” IEEE Transactions on Signal Processing, vol. 55, no. 5, pp. 2263–2272, 2007.
[8] C. Boyd, J. G. Cleary, S. A. Irvine, I. Rinsma-Melchert, and I. H. Witten, “Integrating error detection into arithmetic coding,” IEEE Transactions on Communications, vol. 45, no. 1, pp. 1–3, 1997.
[9] J. Sayir, On Coding By Probability Transformation, Hartung-Gorre, Konstanz, Germany, 1999.
[10] J. Sayir, “Arithmetic coding for noisy channels,” in Proceedings of the Information Theory and Communication Workshop, pp. 69–71, IEEE, 1999.
[11] J. G. Wen, H. Kim, and J. D. Villasenor, “Binary arithmetic coding with key-based interval splitting,” IEEE Signal Processing Letters, vol. 13, no. 2, pp. 69–72, 2006.
[12] M. Grangetto, E. Magli, and G. Olmo, “Multimedia selective encryption by means of randomized arithmetic coding,” IEEE Transactions on Multimedia, vol. 8, no. 5, Article ID 1703505, pp. 905–917, 2006.
[13] P. Teekaput and S. Chokchaitam, “Secure embedded error detection arithmetic coding,” in Proceedings of the 3rd International Conference on Information Technology and Applications (ICITA ’05), pp. 568–571, IEEE, July 2005.
[14] R. Anand, K. Ramchandran, and I. V. Kozintsev, “Continuous error detection (CED) for reliable communication,” IEEE Transactions on Communications, vol. 49, no. 9, pp. 1540–1549, 2001.
[15] J. M. Wozencraft and B. Reiffen, Sequential Decoding, MIT Press, Cambridge, Mass, USA, 1961.
[16] R. M. Fano, “A heuristic discussion of probabilistic decoding,” IEEE Transactions on Information Theory, pp. 64–74, 1963.
[17] B. D. Pettijohn, K. Sayood, and M. W. Hoffman, “Joint source/channel coding using arithmetic codes,” in Proceedings of the Data Compression Conference (DCC ’00), pp. 73–82, Snowbird, Utah, USA, March 2000.
[18] B. D. Pettijohn, M. W. Hoffman, and K. Sayood, “Joint source/channel coding using arithmetic codes,” IEEE Transactions on Communications, vol. 49, no. 5, pp. 826–835, 2001.
[19] A. Hodjat and I. Verbauwhede, “Area-throughput trade-offs for fully pipelined 30 to 70 Gbits/s AES processors,” IEEE Transactions on Computers, vol. 55, no. 4, pp. 366–372, 2006.
[20] M. Grangetto, E. Magli, and G. Olmo, “Multimedia selective encryption by means of randomized arithmetic coding,” IEEE Transactions on Multimedia, vol. 8, no. 5, pp. 905–917, 2006.
[21] A. Rukhin, J. Soto, J. Nechvatal et al., “A statistical test suite for random and pseudorandom number generators for cryptographic applications,” NIST Special Publication 800-22, May 2001.
[22] X. Tong, M. Cui, and Z. Wang, “A new feedback image encryption scheme based on perturbation with dynamical compound chaotic sequence cipher generator,” Optics Communications, vol. 282, no. 14, pp. 2722–2728, 2009.
