Communications in Computer and Information Science 122
Security Technology,
Disaster Recovery
and Business Continuity
International Conferences, SecTech and DRBC 2010
Held as Part of the Future Generation
Information Technology Conference, FGIT 2010
Jeju Island, Korea, December 13-15, 2010
Proceedings
Volume Editors
Tai-hoon Kim
Hannam University, Daejeon, South Korea
E-mail: taihoonn@hnu.kr
Wai-chi Fang
National Chiao Tung University, Hsinchu, Taiwan
E-mail: wfang@mail.nctu.edu.tw
Muhammad Khurram Khan
King Saud University, Riyadh, Saudi Arabia
E-mail: mkhurram@ksu.edu.sa
Kirk P. Arnett
Mississippi State University, Mississippi State, MS, USA
E-mail: kpa1@msstate.edu
Heau-jo Kang
Mokwon University, Daejeon, South Korea
E-mail: hjkang@mokwon.ac.kr
Dominik Ślęzak
University of Warsaw & Infobright, Poland
E-mail: dominik.slezak@infobright.com
ISSN 1865-0929
ISBN-10 3-642-17609-7 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-17609-8 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
springer.com
© Springer-Verlag Berlin Heidelberg 2010
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Preface
Welcome to the proceedings of the 2010 International Conferences on Security Technology (SecTech 2010) and Disaster Recovery and Business Continuity (DRBC 2010), two of the partnering events of the Second International Mega-Conference on Future Generation Information Technology (FGIT 2010).
SecTech and DRBC bring together researchers from academia and industry as well
as practitioners to share ideas, problems and solutions relating to the multifaceted
aspects of security and disaster recovery methodologies, including their links to computational sciences, mathematics and information technology.
In total, 1,630 papers were submitted to FGIT 2010 from 30 countries, including 250 papers submitted to SecTech/DRBC 2010. The submitted papers went through a rigorous reviewing process: 395 of the 1,630 papers were accepted for FGIT 2010, while 57 papers were accepted for SecTech/DRBC 2010. Of the 250 papers, 10 were selected for the special FGIT 2010 volume published by Springer in the LNCS series, 34 papers are published in this volume, and 13 papers were withdrawn for technical reasons.
We would like to acknowledge the great effort of the SecTech/DRBC 2010 International Advisory Boards and members of the International Program Committees, as
well as all the organizations and individuals who supported the idea of publishing this
volume of proceedings, including SERSC and Springer. Also, the success of these
two conferences would not have been possible without the huge support from our
sponsors and the work of the Chairs and Organizing Committee.
We are grateful to the following keynote speakers who kindly accepted our invitation: Hojjat Adeli (Ohio State University), Ruay-Shiung Chang (National Dong Hwa
University), and Andrzej Skowron (University of Warsaw). We would also like to
thank all plenary and tutorial speakers for their valuable contributions.
We would like to express our greatest gratitude to the authors and reviewers of all
paper submissions, as well as to all attendees, for their input and participation.
Last but not least, we give special thanks to Rosslin John Robles and Maricel Balitanas, graduate students of Hannam University, who contributed to the editing process of this volume with great passion.
December 2010
Tai-hoon Kim
Wai-chi Fang
Muhammad Khurram Khan
Kirk P. Arnett
Heau-jo Kang
Dominik Ślęzak
Program Committee
A. Hamou-Lhadj
Ahmet Koltuksuz
Albert Levi
A.L. Sandoval-Orozco
ByungRae Cha
Ch. Chantrapornchai
Costas Lambrinoudakis
Dieter Gollmann
E. Konstantinou
Eduardo B. Fernandez
Fangguo Zhang
Filip Orsag
Gerald Schaefer
Han-Chieh Chao
Hiroaki Kikuchi
Hironori Washizaki
Hongji Yang
Hsiang-Cheh Huang
Hyun-Sung Kim
J.H. Abbawajy
Jan deMeer
Robert Seacord
Rodrigo Mello
Rolf Oppliger
Rui Zhang
S.K. Barai
Serge Chaumette
Sheng-Wei Chen
Silvia Abrahao
Stan Kurkovsky
Stefanos Gritzalis
Swee-Huay Heng
Tony Shan
Wen-Shenq Juang
Willy Susilo
Yannis Stamatiou
Yi Mu
Yijun Yu
Yingjiu Li
Yong Man Ro
Yoshiaki Hori
Young Ik Eom
Organizing Committee
General Chair Heau-jo Kang (Mokwon University, Korea)
Program Co-chairs
Tai-hoon Kim (Hannam University, Korea)
Byeong-Ho Kang (University of Tasmania, Australia)
Publicity Co-chairs
Martin Drahansky (Brno University of Technology, Czech Republic)
Aboul Ella Hassanien (Cairo University, Egypt)
Publication Co-chairs
Rosslin John Robles (Hannam University, Korea)
Maricel Balitanas (Hannam University, Korea)
International Advisory Board
Wai-chi Fang (National Chiao Tung University, Taiwan)
Young-whan Jeong (Korea Business Continuity Planning Society,
Korea)
Adrian Stoica (NASA Jet Propulsion Laboratory, USA)
Samir Kumar Bandyopadhyay (University of Calcutta, India)
Program Committee
Fabrizio Baiardi
Erol Gelenbe
Rüdiger Klein
Simin Nadjm-Tehrani
Sokratis Katsikas
Teodora Ivanusa
Sandro Bologna
Snjezana Knezic
Emiliano Casalicchio
Stefan Brem
Stefan Wrobel
1 Introduction
Nowadays the fingerprint verification system is the most widespread and accepted biometric technology, and its market share is expected to grow even further [1]. Fingerprint recognition technology exploits various features of the human fingers for this purpose. Many fingerprint verification algorithms are based on ridge features called minutiae, i.e., ridge terminations and ridge bifurcations [2]. In general, the number of minutiae points in a finger correlates with the size of the finger, and a person normally has five fingers of different sizes on each hand. Although using all ten fingers (at the same time) for verification can improve performance and robustness, most current systems do not employ all ten fingers because of usability aspects, if only a single-finger scanner is available, or because of the higher cost of multi-finger scanners. Indeed, the process of presenting all ten fingers separately for authentication can be inconvenient. Thus, one needs to decide which fingers to use or not to use. It is claimed that recognition performance with little fingers can be less accurate compared to other finger types. However, to the best of our knowledge, this claim has not yet been confirmed by research.
This paper presents our investigation of the influence of finger type on fingerprint recognition performance. We evaluate performance using the ten fingers separately and then compare the individual performances. For the analysis we used two publicly available fingerprint verification software packages (one is free and the other is for purchase). We conducted tests on the GUC100 multi-sensor fingerprint database, which contains fingerprint images of all 10 fingers from 100 subjects. The rest of the paper is structured as follows. Section 2 provides an overview of the fingerprint database and verification software packages. Section 3 presents the performance evaluation with respect to finger types, and Section 4 concludes the paper.
2 Data Set
In order to investigate the impact of finger type on fingerprint verification performance, we chose two different verification algorithms which are publicly available. The free publicly available fingerprint verification programs are NIST's mindtct and bozorth3 [3]. The second package is Neurotechnology's VeriFinger, which is commercially available at [4].
As a test database we used the GUC100 fingerprint data set, which consists of over 70,000 fingerprint images from 100 subjects. For our study the main advantages of this database are the following:
- It has an equal number of images from all 1,000 fingers (with very few exceptions), which minimizes bias. The number of images per finger per scanner is 12.
- Each fingerprint image of a finger was collected on a separate day (usually one week in between) over several months, which accounts for the natural variability of the finger's condition.
- The database was collected using five different fingerprint sensors, namely L-1 DFR2100, Cross Match LSCAN100, Precise 250MC, Lumidigm V100 and Sagem MorphoSmart.
More information on the GUC100 fingerprint database and its availability to other researchers for testing can be found at [5].
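To make the NIST toolchain concrete, the following is a minimal sketch of a single verification trial using the free NBIS tools mentioned above; it assumes the mindtct and bozorth3 binaries are installed and on the PATH, and the file names and acceptance threshold are illustrative assumptions rather than settings from our experiments.

```python
# Minimal sketch: one fingerprint verification trial with NIST NBIS tools.
# Assumes `mindtct` and `bozorth3` are on the PATH; image file names and
# the decision threshold below are illustrative, not from the paper.
import subprocess

def extract_minutiae(image_path: str, out_root: str) -> str:
    """Run mindtct to extract minutiae; returns the .xyt template path."""
    subprocess.run(["mindtct", image_path, out_root], check=True)
    return out_root + ".xyt"

def match_score(probe_xyt: str, gallery_xyt: str) -> int:
    """Run bozorth3 on two minutiae templates; returns its integer score."""
    out = subprocess.run(["bozorth3", probe_xyt, gallery_xyt],
                         check=True, capture_output=True, text=True)
    return int(out.stdout.split()[0])

probe = extract_minutiae("probe.png", "probe")
gallery = extract_minutiae("gallery.png", "gallery")
print("accept" if match_score(probe, gallery) >= 40 else "reject")
```

mindtct writes several files per image, including the `.xyt` minutiae template consumed by bozorth3, and bozorth3 prints a similarity score whose scale is application-dependent, so the decision threshold is tuned per deployment.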
(Figures: DET curves plotting FNMR (%) against FMR (%) for each finger on each scanner, with the EER points marked.)
From all figures it can be observed that the two little fingers (5 and 10) have the highest EER values. Likewise, in all of these plots the best performance (i.e., smallest EER) is obtained with either a thumb (1 or 6) or an index finger (2 or 7), mostly of the right hand. We define the performance deterioration (in terms of EER) between fingers $F_i$ and $F_j$ ($i, j \in \{1, \ldots, 10\}$) according to the formula

$$\Delta_{ij} = \frac{EER_{F_i} - EER_{F_j}}{EER_{F_j}} \times 100\%$$

where $EER_{F_i}$ and $EER_{F_j}$ are the EERs of fingers $F_i$ and $F_j$, respectively. Tables 1 and 2 show this performance decrease of the little fingers (i.e., $F_5$ and $F_{10}$) with respect to the other fingers with Neurotechnology and NIST, respectively. As can be seen from Tables 1 and 2, the performance deterioration of the little fingers with respect to the best finger can be in the range 260%-1351.9% with Neurotechnology (minimum and maximum of the bolded numbers in Table 1), and 184.5%-461.9% with NIST (minimum and maximum of the bolded numbers in Table 2).
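As a quick sanity check of the formula, here is a tiny sketch that computes the deterioration for a pair of hypothetical EER values (the numbers are made up for illustration and are not taken from Tables 1 or 2):

```python
# Performance deterioration Delta_ij between fingers i and j, per the formula above.
def deterioration(eer_i: float, eer_j: float) -> float:
    """Relative EER increase of finger i over finger j, in percent."""
    return (eer_i - eer_j) / eer_j * 100.0

# Hypothetical EERs: little finger 5.0%, index finger 1.2%.
print(f"{deterioration(5.0, 1.2):.1f}%")  # prints 316.7%
```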
Table 1. Relation of the small fingers to other fingers (Neurotechnology). Numbers are given in %. Bold is with respect to the best finger.

| Finger type | Scanner 1 F5 | Scanner 1 F10 | Scanner 2 F5 | Scanner 2 F10 | Scanner 3 F5 | Scanner 3 F10 | Scanner 4 F5 | Scanner 4 F10 | Scanner 5 F5 | Scanner 5 F10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Finger 1 | 1118.2 | 1115.2 | 75.2 | 77.1 | 525.9 | 385.2 | 81.4 | 92.4 | 407.6 | 1351.9 |
| Finger 2 | 1082.4 | 1079.4 | 256.1 | 260 | 397.1 | 285.3 | 411.5 | 442.6 | 383.1 | 1281.9 |
| Finger 3 | 351.7 | 350.6 | 122.6 | 125 | 445.2 | 322.6 | 119.7 | 133.1 | 163.8 | 654.6 |
| Finger 4 | 142.2 | 141.6 | 36 | 37.4 | 133.6 | 81.1 | 13 | 19.9 | 60.4 | 358.8 |
| Finger 6 | 500 | 498.5 | 45.6 | 47.2 | 267.4 | 184.8 | 50 | 59.1 | 140.1 | 586.8 |
| Finger 7 | 209.2 | 208.5 | 245 | 248.8 | 360.9 | 257.3 | 345.7 | 372.9 | 188.5 | 725.2 |
| Finger 8 | 282.9 | 281.9 | 106 | 108.2 | 193.1 | 127.2 | 155.7 | 171.3 | 131.8 | 563 |
| Finger 9 | 46.7 | 46.4 | 26.6 | 28 | 70.7 | 32.3 | 23.3 | 30.8 | 10.8 | 216.9 |
It is worth mentioning that the influence of finger type on performance has been noted previously as well [7]. The significant difference of our work is that we focus on all 10 fingers, while in [7] only 8 fingers were considered (the little fingers were omitted). Furthermore, we carried out our analysis using fingerprint images from various scanner technologies.
Table 2. Relation of the small fingers to other fingers (NIST). Numbers are given in %. Bold is with respect to the best finger.

| Finger type | Scanner 1 F5 | Scanner 1 F10 | Scanner 2 F5 | Scanner 2 F10 | Scanner 3 F5 | Scanner 3 F10 | Scanner 4 F5 | Scanner 4 F10 | Scanner 5 F5 | Scanner 5 F10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Finger 1 | 324.5 | 366.8 | 57 | 65.2 | 292.5 | 274.6 | 135.8 | 168.8 | 180 | 329.1 |
| Finger 2 | 299.6 | 339.5 | 183.4 | 198.2 | 199.7 | 186 | 275.8 | 328.5 | 266.7 | 461.9 |
| Finger 3 | 144.2 | 168.5 | 82.9 | 92.4 | 130.7 | 120.2 | 81.4 | 106.8 | 120 | 237.1 |
| Finger 4 | 104.2 | 124.6 | 32.8 | 39.7 | 103.1 | 93.8 | 36.7 | 55.9 | 62.8 | 149.5 |
| Finger 6 | 401.5 | 451.5 | 40.3 | 47.6 | 203.2 | 189.3 | 82.2 | 107.7 | 132.8 | 256.8 |
| Finger 7 | 301.2 | 341.2 | 184.5 | 199.4 | 215.9 | 201.5 | 217.6 | 262 | 236.9 | 416.2 |
| Finger 8 | 158.3 | 184.1 | 79.3 | 88.7 | 158.5 | 146.7 | 102.1 | 130.4 | 106.5 | 216.5 |
| Finger 9 | 36.8 | 50.4 | 4.5 | 9.9 | 30.8 | 24.9 | 17.9 | 34.4 | 36.3 | 108.8 |
The two factors that can influence finger performance are the finger size and the ergonomics associated with its presentation. The low performance of the little fingers can be attributed to their relatively small size, and the thumbs' good performance can be related to their larger surface area. The good accuracy of the index fingers can be related to their ergonomics: we believe that among the finger types the index fingers are the easiest (least effort) to present to the sensor. This may be because the index finger commonly has a well-established muscular structure, such that presentation to a fingerprint sensor provides the suitable pressure that is needed, especially on optical fingerprint sensors. Furthermore, this finger has the greatest range of extension/flexion [8]. Another factor that has been shown to influence performance is the finger force applied on the sensor [9]. It is also worth noting that all scanners in the GUC100 database were single-finger scanners, and consequently the performances of individual fingers may be different with multi-finger scanners.
Another observation from these plots is that performances using the Neurotechnology software are better than those using NIST's software.
4 Conclusion
In this paper we presented our analysis of the influence of finger type on fingerprint verification performance. Two publicly available fingerprint verification software packages, one from NIST (available for free) and one from Neurotechnology (available for purchase), were applied to a fingerprint database consisting of over 70,000 fingerprint images from 100 individuals. This database contains images of all 10 fingers of each person, obtained on separate days over several months using five different fingerprint scanners. Our study indicates that performance with the little fingers is less accurate than performance with the other finger types. In addition, the best performance is observed with either the thumbs or the index fingers. On average, the performance deterioration from the worst fingers (i.e., little) to the best fingers (i.e., thumb/index) in terms of EER was in the range 184%-1352% using the two fingerprint verification software packages. Such information can be useful for decision-makers in selecting a relevant finger for their system.
Acknowledgment
This work is supported by funding under the Seventh Research Framework Programme of the European Union, Project TURBINE (ICT-2007-216339). This document has been created in the context of the TURBINE project. All information is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose. The user thereof uses the information at its sole risk and liability. The European Commission has no liability in respect of this document, which merely represents the authors' view.
References
1. International Biometric Group: Biometrics market and industry report 2009-2014 (2008), http://www.biometricgroup.com/reports/public/market_report.php (last visited: 17.11.2008)
2. Maltoni, D., Maio, D., Jain, A.K., Prabhakar, S.: Handbook of Fingerprint Recognition. Springer, Heidelberg (2009)
3. NIST's fingerprint verification software, http://fingerprint.nist.gov/NBIS/nbis_non_export_control.pdf (last visited: 14.10.2009)
4. Neurotechnology's VeriFinger 6.0, http://www.neurotechnology.com/ (last visited: 14.10.2009)
5. Gafurov, D., Bours, P., Yang, B., Busch, C.: GUC100 multi-scanner fingerprint database for in-house (semi-public) performance and interoperability evaluation. In: International Conference on Computational Science and Its Applications (ICCSA) (2010), http://www.nislab.no/guc100
6. ISO/IEC 19794-2:2005, Information technology - Biometric data interchange formats - part 2: Finger minutiae data (2005)
7. Wayman, J.L.: Multi-finger penetration rate and ROC variability for automatic fingerprint identification systems. In: National Biometric Test Center (Collected Works 1997-2000), pp. 177-188 (2000)
8. Bundhoo, V., Park, E.J.: Design of an artificial muscle actuated finger towards biomimetic prosthetic hands. In: 12th International Conference on Advanced Robotics (2005)
9. Kukula, E.P., Blomeke, C.R., Modi, S.K., Elliott, S.J.: Effect of human-biometric sensor interaction on fingerprint matching performance, image quality and minutiae count. International Journal of Computer Applications in Technology (2009)
1 Center of Excellence in Information Assurance (CoEIA), King Saud University, Saudi Arabia
2 Department of Information Systems, College of Computer and Information Sciences, King Saud University, Saudi Arabia
bilalkhan@ksu.edu.sa, kalghathbar@ksu.edu.sa, mkhurram@ksu.edu.sa, amk.sa@me.com, azajaji@gmail.com
1 Introduction
Bots are automated programs that crawl through web sites and make automatic registrations. Automated bots target free email services, blogs and other online membership websites. They cause various problems by signing up for multiple accounts, sending junk emails, causing denial of service, etc. CAPTCHA, which stands for Completely Automated Public Turing test to tell Computers and Humans Apart, is used to prevent automated bots from abusing online services.
Gimpy and EZ-Gimpy [7] are two well-known text-based CAPTCHAs. Gimpy uses an image of seven words from the dictionary and generates a distorted image of the words to make it hard for OCR to recognize (see figure 1a) [11]. The image is then presented to the user, who, in order to succeed, has to read three words from the image and enter them.
In contrast to Gimpy, an EZ-Gimpy CAPTCHA is an image consisting of a single word [11]. The image is distorted with different techniques (figure 1b) to confuse OCR.
In this paper, an Arabic CAPTCHA is proposed which uses Arabic text images for securing online services against automated bots. Arabic text is more resistant to optical character recognition due to its natural complexity [10].
(Fig. 1. (a) A Gimpy CAPTCHA sample; (b) an EZ-Gimpy CAPTCHA sample.)
It is expected that the proposed CAPTCHA will be useful in protecting internet resources from being wasted by spam in Arabic-speaking countries, where internet use is growing fast. There are almost two dozen countries where Arabic is spoken as a first language. These countries are listed in Table 1, including Iran, where Persian is spoken, a language that uses Arabic script. The table shows internet usage in these countries in 2000 and in 2009-2010 [9]. It is evident that the number of internet users increased over this period. As shown in Figure 2, Iran is at the top, with more than 32.2 million internet users in 2010, followed by Egypt and Morocco, where the figures reach 12.5 million and 10.3 million, respectively.
Apart from Arabic-speaking regions, Arabic script is used in other languages spoken in different parts of the world, such as Persian, Urdu, Pashto, Sindhi and Punjabi, which are spoken in different parts of Asia. Many of these speakers are internet users.
In addition to Yahoo, Google, MSN and instant messengers, there are thousands of websites that provide services in the Arabic language to Arabic-speaking users. These include websites of educational institutions, government organizations, online shopping, non/semi-government organizations and social networking sites. On many of these websites, users have to register before they can use the provided services.
Many online service providers may take an interest in offering services with our proposed security mechanism, thereby realizing more of the internet's potential in these countries. The proposed scheme not only prevents automated bots but also supports the user throughout the authentication process.
The rest of the paper is organized as follows: Section 2 reviews previous work. Section 3 explains Arabic script and the weaknesses of Arabic OCR, and Section 4 presents the proposed Arabic CAPTCHA scheme. Section 5 concludes the paper.
2 Previous Work
AltaVista first practically implemented the idea of a CAPTCHA to prevent automated bots from automatically registering on web sites [1]. The mechanism behind the idea was to generate slightly distorted letters and present them to the user.
Thomas et al. showed that OCR is unable to recognize text that is handwritten or close to human handwriting [4] [3]. Therefore they proposed synthetic handwritten CAPTCHAs.
Table 1. Internet usage in Arabic-speaking countries

| Serial # | Country | Internet Usage, Dec/2000 | Internet Usage, Latest Data (2009) |
|---|---|---|---|
| 1 | Iran | 250,000 | 32,200,000 |
| 2 | Egypt | 450,000 | 12,568,900 |
| 3 | Morocco | 100,000 | 10,300,000 |
| 4 | Saudi Arabia | 200,000 | 7,761,800 |
| 5 | Sudan | 30,000 | 4,200,000 |
| 6 | Algeria | 50,000 | 4,100,000 |
| 7 | Syria | 30,000 | 3,565,000 |
| 8 | United Arab Emirates | 735,000 | 3,558,000 |
| 9 | Tunisia | 100,000 | 2,800,000 |
| 10 | Jordan | 127,300 | 1,595,200 |
| 11 | Kuwait | 150,000 | 1,000,000 |
| 12 | Lebanon | 300,000 | 945,000 |
| 13 | Oman | 90,000 | 557,000 |
| 14 | Qatar | 30,000 | 436,000 |
| 15 | Bahrain | 40,000 | 402,900 |
| 16 | Yemen | 15,000 | 370,000 |
| 17 | Palestine (West Bank) | 35,000 | 355,500 |
| 18 | Libya | 10,000 | 323,000 |
| 19 | Iraq | 12,500 | 300,000 |
| 20 | Eritrea | 5,000 | 200,000 |
| 21 | Mauritania | 5,000 | 60,000 |
Fig. 2. Graph showing the internet usage in 2010 in Arabic/Persian speaking countries
Datta et al. used an image recognition test to distinguish humans from machines [6]. They showed that image-based CAPTCHAs are better than text-based CAPTCHAs. Their work is based on the hard AI problem of automated image matching and recognition [6].
Gupta et al. proposed a method of embedding numbers in text CAPTCHAs [5]. They state that the tagged numbers in the text confuse the OCR and make it unable
Fig. 3. A sample of Arabic text
Arabic letters have different shapes depending on their position in the word, i.e., initial, middle, final or isolated. In contrast to Latin letters, Arabic letters have different sizes [12]. Moreover, several characters have different numbers of dots as well as different dot positions, i.e., above, below or in the middle of the character (see figure 4). In addition, diacritics, e.g., shadda, maddah, hamza, etc., are used in Arabic text [10].
Text in which the characters or sub-words are overlapped poses a serious problem in the segmentation stage.
Many Arabic characters have similar shapes; the differences between these pairs of letters are only the number and position of dots. These similar characters and dots make it hard for OCR to recognize characters correctly [15].
Keeping in view these weaknesses of Arabic OCR, the following CAPTCHA scheme is proposed.
The background and foreground colors have been selected such that they are hard for OCR to distinguish. Such a color combination looks good to the human eye but makes it hard for OCR to separate the foreground from the background.
The program randomly picks the number of letters, which ranges from 4 to 9. In addition to the number of letters, the font type is selected randomly out of 52 different font types, and finally the font size is selected before the CAPTCHA image is displayed to the user. So each generated CAPTCHA varies in the number of characters, the characters selected, the font type and the font size. As Arabic OCRs are font dependent, recognizing different font types will be a problem for them. The word in figure 5 has six letters, whereas the images in figures 6 and 8 have four and five letters, respectively.
The font type shown in the images in figure 7 complicates the job for the OCR, because the original text in the image has a duplicate copy in the form of a shadow. As opposed to a human user, the OCR will detect two words in the image instead of one. Such a feature confuses the OCR in recognizing the characters of the word.
Character overlapping in text is a good feature which makes the segmentation step hard for the OCR. The program generates words with overlapping features. For example, in figure 8 the letter Ra is overlapped with Teh and Ghen. In such a situation, the OCR is unable to separate the overlapped letters and hence is unable to recognize the characters in the image.
Baseline detection helps OCR to solve a CAPTCHA. Some techniques have been used in our CAPTCHA generation scheme to make baseline detection difficult. In figure 9, the letter Ra is below the baseline while another letter is above it.
The program generates a unique CAPTCHA each time, which prevents spammers from building a database of the images.
In addition to font type, size and noise, another parameter that varies from image to image is the position of the text. The program changes the coordinates of the text while displaying it, as shown in figures 5, 6, 7, 8 and 9. Such randomly placed words in CAPTCHAs make segmentation and recognition hard [14].
Algorithm 2. Image generation with white background and blue foreground text.
Input: text string
Output: generated image
1. procedure
2. A fixed-size rectangle is selected.
3. An image is generated within the rectangle.
4. DrawObject.BackgroundColor = white
5. Randomly generate four points (P1, P2, P3, P4) with random horizontal and vertical coordinates in the image.
6. Select one font type out of the 50 Arabic font types, depending on the CAPTCHA complexity level.
7. Draw the input text within the four selected points (coordinates).
8. If CAPTCHA complexity level = easy then
9. size of text image = 60-70% of the whole image; else if CAPTCHA complexity level = medium then size of text image = 50-59% of the whole image; else if CAPTCHA complexity level = hard then size of text image = 40-49% of the whole image
10. End if
11. DrawObject.ForegroundColor = blue
12. end procedure
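To illustrate how such a generator might look in code, here is a minimal Python sketch of Algorithm 2 using the Pillow imaging library; the font paths, canvas size and origin jitter are illustrative assumptions, and this is not the authors' actual VB.Net/ASP.Net implementation.

```python
# Minimal sketch of Algorithm 2: white background, blue foreground text.
# Assumes Pillow is installed and Arabic .ttf fonts exist at FONT_PATHS
# (hypothetical paths); sizes and jitter ranges are illustrative.
import random
from PIL import Image, ImageDraw, ImageFont

FONT_PATHS = ["fonts/arabic_1.ttf", "fonts/arabic_2.ttf"]  # hypothetical fonts
SCALE = {"easy": (0.60, 0.70), "medium": (0.50, 0.59), "hard": (0.40, 0.49)}

def generate_captcha(text: str, complexity: str = "medium") -> Image.Image:
    width, height = 300, 100                                  # steps 2-3: fixed rectangle
    img = Image.new("RGB", (width, height), "white")          # step 4: white background
    draw = ImageDraw.Draw(img)
    lo, hi = SCALE[complexity]                                # steps 8-10: size by level
    font_size = int(height * random.uniform(lo, hi))
    font = ImageFont.truetype(random.choice(FONT_PATHS), font_size)  # step 6
    x = random.randint(0, width // 4)                         # step 5 (simplified):
    y = random.randint(0, height // 4)                        # jitter the drawing origin
    draw.text((x, y), text, font=font, fill="blue")           # steps 7 and 11: blue text
    return img

# Usage: generate_captcha("...", "hard").save("captcha.png")
```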
After preprocessing, the OCR places dots on each character for recognition. There is a probability that the OCR places a dot in an incorrect position or on an incorrect character. We call this a false positive.
A false positive means that the OCR considers the placement of dots to be correct when in reality it is wrong. Suppose that the false positive rate for any character that could have a dot is 10%; then the false positive rates for the characters of the example word are 0%, 0%, 20% and 50%. The graph in figure 11 shows the false positive rates for each character in four words.
4.3 Readability of Arabic CAPTCHA
To find out how readable our proposed CAPTCHA is, a survey was conducted. Over one hundred and fifty individuals participated in the survey, both male and female, with ages ranging from 17 to over 50 years. Each participant was presented with fifty-five different CAPTCHA images. From the analysis of the survey results, it was found that it was very easy for the users to read all the images in 50 out of the 52 font types. For some font types the readability rate was 100%. In addition to the font types, all Arabic alphabetic letters were also very easy to read. There were only a few letters which were hard to read and were replaced by readers with similar-looking letters. Table 2 shows the letters that were replaced with similar-looking letters. Despite the fact that the images were distorted, the overall readability rate was 96.6%.
(Table 2. List of characters difficult to read: each original letter and the similar-looking letter it was replaced by.)
bots to break our CAPTCHA. The algorithm is efficient, and the user does not face any problem while interacting with the system. The proposed program is developed in VB.Net, ASP.Net and JavaScript. It is comparatively more useful and robust than the Persian CAPTCHA.
References
[1] Lillibridge, M.D., Abadi, M., Bharat, K., Broder, A.Z.: Method for selectively restricting access to computer systems. US Patent 6,195,698. Applied for April 1998 and approved February 2001
[2] Ahn, L.V., Blum, M., Hopper, N.J., Langford, J.: CAPTCHA: using hard AI problems for security. In: Proceedings of the 22nd International Conference on Theory and Applications of Cryptographic Techniques 2003, Warsaw, Poland (2003)
[3] Rusu, A., Thomas, A., Govindaraju, V.: Generation and use of handwritten CAPTCHAs. International Journal on Document Analysis and Recognition 13(1), 49-64 (2010)
[4] Thomas, A.O., Rusu, A., Govindaraju, V.: Synthetic handwritten CAPTCHAs. Journal of New Frontiers on Handwriting Recognition 42(12), 3365-3373 (2009)
[5] Gupta, A., Jain, A., Raj, A., Jain, A.: Sequenced tagged CAPTCHA: generation and its analysis. In: International Advance Computing Conference, India (March 2009)
[6] Datta, R., Li, J., Wang, J.Z.: Exploiting the human-machine gap in image recognition for designing CAPTCHAs. IEEE Transactions on Information Forensics and Security 4(3) (2009)
[7] Ahn, L.V.: Telling humans and computers apart or how lazy cryptographers do AI. Communications of the ACM 47(2), 57-60 (2004)
[8] Shirali-Shahreza, M.H., Shirali-Shahreza, M.: Persian/Arabic Baffle text CAPTCHA. Journal of Universal Computer Science 12(12), 1783-1796 (2006)
[9] Internet usage in the Middle East, http://internetworldstats.com/stats5.htm (accessed on June 10, 2010)
[10] Zheng, L., Hassin, A.H., Tang, X.: A new algorithm for machine printed Arabic character segmentation. Pattern Recognition Letters 25(15), 1723-1729 (2004)
[11] Mori, G., Malik, J.: Recognizing objects in adversarial clutter: breaking a visual CAPTCHA. In: Conference on Computer Vision and Pattern Recognition 2003, pp. 134-141 (2003)
[12] Moussaa, S.B., Zahourb, A., Benabdelhafidb, A., Alimia, A.M.: New features using fractal multi-dimensions for generalized Arabic font recognition. Pattern Recognition Letters 31(5), 361-371 (2010)
[13] Al-Shatnavi, A., Omar, K.: A comparative study between methods of Arabic baseline detection. In: International Conference on Electrical Engineering and Informatics 2009, Malaysia, pp. 73-77 (2009)
[14] Hindle, A., Godfrey, M.W., Holt, R.C.: Reverse engineering CAPTCHAs. In: Proceedings of the 15th Working Conference on Reverse Engineering 2008, Washington, USA, pp. 59-68 (2008)
[15] Sattar, S.A., Haque, S., Pathan, M.K., Gee, Q.: Implementation challenges for Nastaliq character recognition. In: International Multi Topic Conference, IMTIC 2008, Pakistan, pp. 279-285 (2008)
[16] Jain, A., Raj, A., Pahwa, T., Jain, A., Gupta, A.: Overlapping variants of sequenced tagged CAPTCHA (STC): generation and their comparative analysis. In: NDT 2009, Ostrava (2009)
Abstract. This paper presents selected results of a survey conducted to provide much-needed insight into the status of information security in Saudi Arabian organizations. The purpose of this research is to assess the state of information assurance in the Kingdom and to better understand the prevalent ground realities. The survey covered technical aspects of information security, risk management and information assurance management. The results provide deep insights into the existing level of information assurance in various sectors that can be helpful in better understanding the intricate details of prevalent information security in the Kingdom. The results can also be very useful for information assurance policy makers in government as well as private-sector organizations. There are few empirical studies on information assurance governance available in the literature, especially about the Middle East and Saudi Arabia; therefore, the results are invaluable for information security researchers in improving the understanding of information assurance in this region and the Kingdom.
Keywords: Information Assurance, governance, empirical study, policy.
1 Introduction
Information is the lifeblood of the modern-day organization [1] and thus the lifeblood of the economy. It is the gelling agent that keeps an organization together as one whole, and all other resources are controlled through it [2]. To make the best use of information it has to be shared, and there is a need to keep up with the ever-increasing demand for sharing information even outside an organization, with partners, customers, suppliers and other stakeholders. With the importance of information and its sharing comes the necessity of securing it. The increase in connectivity has increased the exposure of an organization, which is both good and bad: good because greater market access becomes available, and bad because of the increased risk of detrimental loss due to failures of information security. Since connectivity is not going to be curtailed, this increased
risk has made information security a global concern for the public and private sectors and even for governments [3]. Given the prevalence of information security incidents, it is very important for organizations to work diligently towards securing their information assets.
Information security breaches do happen, and organizations need to be aware of the risks to their information and take proactive measures to secure it. It is vital to proactively control such incidents. Awareness in an organization of the severity of the outcome of an information security breach, along with the probability of it happening, can keep an organization focused on assuring that its information is secure [4]. This is also supported by Hagen [5]. To provide true information security, a comprehensive framework/theory is required that takes a holistic approach towards information security [6, 7], so as to build information security into the system from the start and not as an afterthought. In this regard various frameworks have been proposed and are in vogue, but this is a dynamic world and information assurance is a moving target [8]. Thus, continuous research is required to find new models and new paradigms of information assurance.
Reviewing the relevant literature, it may be seen that information security researchers either focus on technology aspects (most of the time) or, less commonly, adopt theories and frameworks from other fields - sociology, philosophy, psychology, management, etc. - modified with reference to information security to explain the human aspect of security [9]. Some of the pertinent works are [10-14]. To get to the desired level of information security it is important to know its current status. For this reason surveys and studies are conducted. There has been a dearth of studies on the status of information security in the developing world, and the problem is compounded in the case of the Middle East. There have not been many studies with reference to Saudi Arabia. Abu-Musa has explored information technology governance [15] and the implementation of COBIT [16], while Alnatheer [17] has provided a framework for understanding information security culture in Saudi Arabia.
The studies discussed above are focused either on IT governance or on the behavioral aspect alone, and their methodology was to send questionnaires to randomly selected respondents. Thus there was a need to combine the two aspects. Further, in order to get a better overall picture of information assurance in the Kingdom, it was deemed necessary to get responses from various segments, so it was decided to take a stratified sample of the organizations in KSA. To improve the response rate, instead of sending out questionnaires, participants from a selected representative sample of organizations were invited to attend a daylong workshop of which the survey was an important component.
The rest of the paper is organized as follows: Section 2 describes the research methodology and the sampling, while Section 3 gives details of some of the selected results along with a discussion reflecting the status of information assurance in Saudi Arabia. It is followed by Section 4, which includes conclusions as well as future research directions.
2 Research Methodology
The research method used was a survey. A total of 120 Saudi organizations representing the major stakeholders in four key sectors were preselected, and their CIOs/IT managers were invited to participate in a daylong workshop to discuss information security and assurance. More than 70 people participated in the workshop. At the end of the workshop, participants were asked to fill in the questionnaire. From organizations with more than one participant, only one representative response was solicited. Thus, a total of 53 valid responses were collected. The majority of the questions were close-ended; some open-ended questions elicited additional information based on answers to previous questions. Regarding the respondents' demographics, 23 (43.40%) were from information security (IS) departments and 22 (41.51%) were from IT departments, while 8 (15%) were from other departments. Further, 34 (64.15%) respondents were IT or IS managers, while 11 (20.75%) were non-managerial IT/IS staff.
2.1 Sample Distribution
The sampling technique used was stratified sampling, where the population was divided into four groups based on the number of employees in the organizations. They ranged from fewer than 100 to more than 1000 employees, with two intermediate groups of 101 to 500 and 501 to 1000 employees. The sample distribution is given in Fig. 1. Further, because of the uniqueness of the requirements and needs of different organizations, the population was divided into four sectors. This grouping was based on the homogeneity of information security requirements within the groups (i.e., minimizing the intra-class variation) along with the unique differentiating needs of each group compared to the other groups (i.e., maximizing inter-class variation). The composition of the groups was such that the Government sector included the ministries and other public-sector organizations that run a government, excluding defense-related organizations. The Defense sector included all military, non-military, and law enforcement organizations that are primarily tasked with safeguarding the country. The Bank sector comprised the financial institutions, while the rest of the organizations were clubbed together as the Private/Commercial sector. The sector-wise sample distribution is given in Fig. 2. Since larger organizations have a more profound impact on the information assurance status, because of their number of employees as well as the budget that they can spend on information assurance, a larger share of the sample distribution was given to larger organizations. Similarly, government organizations have more financial resources to invest, so their share in the sample was also kept high.
The distribution of respondents given in Fig. 1 and Fig. 2 clearly shows that the objective of the sampling described above was achieved. About two-thirds of the respondents were from large organizations with more than 1000 employees, and a similar ratio held for governmental organizations (both defense-related and non-defense-related) in the sample.
(Fig. 1. Respondents' distribution based on the size of the organization they represented: above 1000 employees 64%, 501-1000 employees 17%, 101-500 employees 15%, 1-100 employees 4%.)
Further, the sampling was designed in this way because larger organizations are more likely to have information security as a priority; likewise, government organizations are keen to have greater information security and have the will and the resources to invest in it.
(Fig. 2. Sector-wise distribution of respondents across the Government, Defense, Bank and Private sectors.)
Although the survey was quite comprehensive, for the sake of brevity not all the results can be presented here. Selected results are given in the relevant areas to provide a thorough picture of the overall situation. To that end, results from the areas of information security policies, information security standards and risk management are included.
| | Government | Defense | Bank | Private |
|---|---|---|---|---|
| IS Policy | 42% | 70% | 67% | 50% |
| Enforced (1) | 82% | 86% | 100% | 75% |
| Revised (2) | 89% | 83% | 100% | 100% |

Fig. 3. The percentages are not absolute but relative to the previous question, i.e., the percentages for Enforced (1) are of the organizations that have IS policies, and the percentages for Revised (2) are of the organizations that have their IS policies enforced, not of the total sample size. E.g., in the case of Government organizations, 42% of the Government organizations in the sample had IS policies; of these, 82% (1) enforced them, and of those, 89% (2) revise them.
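To make the chaining concrete: under this convention, the absolute share of Government organizations in the sample that have, enforce and regularly revise an IS policy is the product of the three relative figures, 0.42 × 0.82 × 0.89 ≈ 0.31, i.e., roughly 31%.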
It is interesting to note that all the banks, and 86% of the Defense organizations that had an IS policy, were enforcing it as well. This means that organizations that have an IS policy are highly likely to enforce it. Thus, the presence of an IS policy is a good indicator of its implementation, as well as of the fact that these organizations are serious about their security.
It might be worthy of note that, overall, among the organizations that enforce IS policies, almost 9 out of 10 (92%) regularly revise them. Further, all the banks and private organizations that have an IS policy not only implement it but also revise it, while some of the Government and Defense organizations neither enforce IS policies nor revise them. A possible explanation might be that Government and Defense organizations have developed IS policies and are enforcing them due to some directive or regulation, without being convinced of their necessity, whereas banks and private organizations only adopt a policy if they see it as a necessary part of remaining in business and are likely to be more serious about it. Thus, those among them that have an IS policy not only enforce it but also revise it.
(Fig. 4. ISO 27000 series certification status of Saudi organizations: certified, planning certification, or no intention to be certified.)
The survey results (excluding the organizations that responded "Not Sure"), given in Fig. 4, show that 1 in 5 Saudi organizations (predominantly from the large and government sectors) have ISO 27000 series (or ISO 17799) certification, an equal number do not plan to have such certification, while about 3 out of 5 are seriously considering getting certified, perhaps as soon as within the next 12 months.
(Fig. 5. Status of ISO 27000 certification of Saudi organizations within each category (Government, Defense, Bank, Private): organization is ISO certified, plans to be certified within 12 months, or has no intention to be certified.)
Further, the results given in Fig. 5 show that banks have the largest share of ISO 27000 series (ISO 17799) certified companies, while in the Defense sector a few organizations are thinking of getting certified but none of those in the sample is ISO 27000 series (ISO 17799) certified.
A probable reason might be that the primary purpose of the ISO 27000 series (ISO 17799) is to establish trust in an organization's processes with respect to information security among partners, stakeholders and customers. Since defense organizations, unlike other organizations, have predetermined fixed interactions and cannot select whom to take as partners, customers, and stakeholders, trust is not an issue per se due to this fixed nature of the relationship; therefore, the defense sector does not seem to be much concerned about ISO 27000 series (ISO 17799) certification.
3.3 Risk Management
Risk assessment and management is an important aspect of IS. The important characteristics of effective risk management are a risk assessment policy and the updating of the asset inventory and operations procedures. The results given in Fig. 6 show that about 50% of Saudi organizations have a risk assessment process in place and a little less than 80% have an updated assets inventory, while operations and procedures seem to be the most important aspect to them, as more than 90% of them keep these updated. Overall, the banks have better risk management, with all of them keeping their assets inventory updated, while defense organizations are more concerned with updated operations and procedures, and all of the defense organizations in the sample keep these up to date.
(Fig. 6. Risk management status of Saudi organizations, based on risk assessment policy and updated assets inventory and operations procedures, overall and per type of organization.)
To manage risk it is essential to start with risk assessment. In this regard, one of the important criteria is to see how strong the actual security of an organization is by attempting to circumvent it, i.e., running a penetration test. It is a very important tool for finding out the effectiveness of information security controls. The survey results in Fig. 7 show that representatives of about 1 in 5 organizations were not sure whether penetration testing is done in their organizations or not. Excluding these, the results show that
a little less than a third of the organizations did not do any penetration testing, while a third did it once a year and the rest, a little more than a third, did it more than once per year.
(Fig. 7. Frequency of penetration testing in Saudi organizations: none, once per year, or more than once per year.)
In the private sector, 25% of the organizations were not sure whether they did penetration testing, while in the banking sector not only were all the banks in the sample doing penetration testing at least once a year, but all of them also knew about it.
(Fig. 8. Penetration testing frequency per sector (Government, Defense, Bank, Private): none, once per year, or more than once per year.)
The results in Fig. 8 show that banks are the most serious of all the categories in terms of finding out the effectiveness of their information security controls, as all of them were performing penetration testing at least once a year, compared to only half of the Defense organizations. This could be attributed to the fact that Defense is probably such a critical sector that it cannot allow an outsider/third party to conduct a penetration test lest any secret information is leaked through a security hole
thus found. Also, it seems to be against its interest for any other organization (the penetration tester) to find out about security holes in its information security.
3.4 Information Security Concerns and Issues
Information security issues exist in all organizations, but the criticality of the information dictates how much effort and resources an organization will put into securing it. Not all information is equally critical. Similarly, not all vulnerabilities are treated equally; rather, the treatment is based on the threats that can exploit a vulnerability and the likelihood of those threats being realized. Based on this analysis, security measures are prioritized. The survey results show that having an antivirus is the topmost security priority, while the lack of well-trained information security personnel is the topmost security concern of Saudi organizations. Table 1 lists the top five information security priorities and Table 2 lists the top five information security concerns.
Table 1. Top five information security priorities of the Saudi organizations

| Rank | Information Security Priority |
|---|---|
| 1st | Antivirus |
| 2nd | Data Loss Prevention |
| 3rd | Firewall |
| 4th | Intrusion Detection and Prevention |
| 5th | Network Access Control |
Table 2. Top five information security concerns of the Saudi organizations

| Rank | Information Security Concern |
|---|---|
| 1st | Lack of well-trained information security personnel |
| 2nd | Lack of a sufficient security policy within the organization |
| 3rd | Lack of sufficient practices for data protection |
| 4th | Lack of good practices in password choices/updates/protection |
| 5th | Lack of adherence to information security standards |
(Table 3. Specialized information security training in Saudi organizations: 52% train both IT and non-IT staff, 21% and 11% train only one of the two groups, and 16% provide no specialized training.)
The survey results show that Saudi organizations are very serious about creating awareness and providing information security training, since more than half of the sampled organizations had specialized information security training for their employees, both IT and non-IT staff, as shown in Table 3, while about 16% had no specialized information security training either for IT staff or for non-IT staff.
4 Conclusion
This study has tried to meet the need for determining the present situation vis-à-vis information assurance in Saudi Arabian organizations. It is a pioneering study, as not much information is available in the literature about information assurance in the Kingdom. The results provide insight into the information security policies, risk management, network access and security standards adopted by organizations in the Kingdom. The results presented should be very helpful for researchers and practitioners in getting to know organizations' information security requirements. The next step in this research is to compare the findings with similar studies from other developed as well as developing countries to ascertain the comparative level of information assurance in the Kingdom. Further, the researchers intend to develop a model providing a conceptual explanation of the results to enhance the understanding of information security, culture and organizational behavior in the Saudi Arabian context. The researchers also intend to use the results for a gap analysis between the desired level of information assurance and the existing situation, so that remedial actions may be suggested to bridge this gap.
References
1. Halliday, S., Badenhorst, K., Solms, R.V.: A business approach to effective information technology risk analysis and management. Information Management & Computer Security 4, 19-31 (1996)
2. Eloff, J.H.P., Labuschagne, L., Badenhorst, K.P.: A comparative framework for risk analysis methods. Comput. Secur. 12, 597-603 (1993)
3. Corporate Governance Task Force: Information security governance: a call to action (2004), http://www.cyber.st.dhs.gov/docs/Information_Security_Governance-A_Call_to_Action.pdf
4. Whitman, M.E.: Enemy at the gate: threats to information security. Communications of the ACM 46, 91-95 (2003)
5. Hagen, J.M., Albrechtsen, E., Hovden, J.: Implementation and effectiveness of organizational information security measures. Information Management & Computer Security 16, 377-397 (2008)
6. Freeman, E.H.: Holistic information security: ISO 27001 and due care. Information Systems Security 16, 291-294 (2007)
7. Hong, K., Chi, Y., Chao, L.R., Tang, J.: An integrated system theory of information security management. Information Management & Computer Security 11, 243-248 (2003)
8. Dlamini, M., Eloff, J., Eloff, M.: Information security: The moving target. Computers & Security 28, 189-198 (2009)
9. Siponen, M.T., Oinas-Kukkonen, H.: A review of information security issues and respective research contributions. SIGMIS Database 38, 60-80 (2007)
10. Summerfield, M.: Evolution of deterrence crime theory (2006), http://mobile.associatedcontent.com/article/32600/evolution_of_deterrence_crime_theory.html
11. Straub, D.W.: Effective IS security: An empirical study. Information Systems Research 1, 255-276 (1990)
12. Stafford, M.C., Warr, M.: A reconceptualization of general and specific deterrence. Journal of Research in Crime and Delinquency 30, 123-135 (1993)
13. Siponen, M.: A conceptual foundation for organizational information security awareness. Information Management & Computer Security 8, 31-41 (2000)
14. Leonard, L.N.K., Cronan, T.P., Kreie, J.: What influences IT ethical behavior intentions: planned behavior, reasoned action, perceived importance, or individual characteristics? Information and Management 42, 143-158 (2004)
15. Abu-Musa, A.A.: Exploring information technology governance (ITG) in developing countries: An empirical study. International Journal of Digital Accounting Research 7, 71-120 (2007)
16. Abu-Musa, A.A.: Exploring the importance and implementation of COBIT processes in Saudi organizations: An empirical study. Information Management & Computer Security 17, 73-95 (2009)
17. Alnatheer, M., Nelson, K.: A proposed framework for understanding information security culture and practices in the Saudi context. In: Proceedings of the 7th Australian Information Security Management Conference, pp. 6-17. SECAU - Edith Cowan University, Perth, Australia (2009)
18. Siponen, M., Pahnila, S., Mahmood, M.: Compliance with information security policies: An empirical investigation. Computer 43, 64-71 (2010)
19. Puhakainen, P., Siponen, M.T.: Improving employees' compliance through information systems security training: An action research study. MIS Quarterly 34 (2010)
Introduction
In today's society the demand for reliable verification of user identity is increasing. Conventional knowledge-based authentication methods such as passwords and PIN codes can be easy and cheap to implement, but they possess usability limitations. For instance, it is difficult to remember long and random passwords/PIN codes and to manage multiple passwords. Moreover, knowledge-based authentication merely verifies that the claimed person knows the secret; it does not verify identity per se. On the contrary, biometric authentication, which is based on measurable physiological or behavioural signals of a human being, establishes
an explicit link to the identity. Thus, it can provide more reliable user authentication compared to password-based mechanisms, while lacking the aforementioned usability limitations associated with passwords. Various characteristics of human beings have been proposed for biometrics. Traditional examples of such characteristics include fingerprint, iris, retina, face, speaker recognition, writer recognition, hand geometry, video-based gait and so on.
Recent technological advances and the widespread use of personal electronics have enabled the exploration of new physiological and behavioural characteristics of human beings for biometric authentication. In the rest of the paper we will refer to this type of biometrics as emerging biometrics. In this paper we identify and discuss the challenges and opportunities related to emerging biometrics. We do not focus on the algorithmic specifics of each proposed method, but rather aim to outline the main limitations and advantages of emerging biometrics. The rest of the paper is organized as follows. After an overview of biometric systems and their performance evaluation metrics in section 2, we provide an introduction to the emerging biometric modalities that have been proposed so far in section 3. Section 4 outlines the challenges and opportunities related to emerging biometrics. Section 5 concludes the paper with a summary.
(Figure: generic biometric system with enrollment and verification phases: sensor, pre-processing, feature extractor, database, and matcher with threshold producing the decision.)
target being in the top n closest matches [1]. There are also several performance metrics that are used to indicate biometric performance by a single value, such as:
- Equal Error Rate (EER): the EER is the point on the DET curve where FAR = FRR;
- FRR at a specific FAR;
- GAR (Genuine Accept Rate) at a specific FAR;
- identification probability at rank 1 (or recognition rate).
In addition to these, two other error types exist which are also important (especially for real applications): FTE (Failure To Enroll) and FTC (Failure To Capture).
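As a concrete illustration of the EER, the sketch below estimates it from genuine and impostor comparison scores by sweeping a decision threshold; the Gaussian score distributions are synthetic stand-ins, not data from any cited study.

```python
# Estimate the EER: sweep thresholds and find where FAR and FRR cross.
import numpy as np

def estimate_eer(genuine: np.ndarray, impostor: np.ndarray) -> float:
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor >= t).mean() for t in thresholds])  # false accepts
    frr = np.array([(genuine < t).mean() for t in thresholds])    # false rejects
    i = int(np.argmin(np.abs(far - frr)))    # threshold where FAR ~ FRR
    return float((far[i] + frr[i]) / 2)

rng = np.random.default_rng(0)
genuine = rng.normal(0.8, 0.1, 1000)   # synthetic genuine comparison scores
impostor = rng.normal(0.5, 0.1, 1000)  # synthetic impostor comparison scores
print(f"EER ~ {estimate_eer(genuine, impostor):.2%}")
```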
Human biometrics can be classified into two groups (not necessarily disjoint sets): physiological and behavioural. The first group is based on stable physical characteristics, while the second group uses learned, alterable behavioural characteristics. Most emerging biometrics can be classified into the behavioural group. The list below includes several examples of emerging characteristics that have been proposed as biometric modalities.
foot [2, 3]
hip [4, 5]
waist [6, 7]
pocket [8]
wrist [9]
Floor Sensor (FS) based gait. Gait information is captured using sensors installed in the floor as a person walks over them [10].
Ballprint/Footprint. Verification is based on the shape and local ridge characteristics of the foot [11].
Typing style. This refers to the user's typing style on a computer or mobile phone keyboard [12, 13, 14].
Mouse usage. This refers to how a user uses and interacts with a mouse device when working with a computer [15, 16, 17].
Brain signals. The electroencephalogram (EEG) is an electrical activity signal along the scalp produced by neurons within the brain, which is captured by placing sensors around the head [18].
Heart signals. The electrocardiogram (ECG or EKG) is the electrical activity of the heart over time, recorded externally by sensors on the skin [19, 20, 21, 22, 23, 24].
Acoustic ear. Acoustic properties of the pinna (outer flap of the ear) and ear canal appear to provide some identity information [25].
Hipprint. Verification is based on the pressure distribution captured by sensors attached to a chair when a person sits on it [26].
Fingerback. Unlike fingerprint, in this biometric, features (e.g., finger knuckle) are extracted from the back surface of the finger [27].
Lip. Verification is based on lip shape and color features [28].
Tongue. Verification is based on geometric shape and texture information of the tongue [29, 30].
Lately, emerging biometrics have attracted research attention. It is interesting to point out that there are even companies that have started to provide security/authentication services based on emerging biometrics such as keystroke dynamics [31, 32], foot motion [33], etc.
Table 1 shows the performance accuracies of the emerging biometric modalities. Performance is given in terms of recognition rate, EER, and FRR at a specific FAR. In this table, the column #S represents the number of subjects in the experiment of each study. It should be noted that a direct comparison of performance (even for the same modality) may not be adequate because of differences in the collected data sets.
Table 1. Performance of emerging biometric modalities, the number of subjects (#S), and the subject population of each study

Study | Modality | Performance, % | #S | Subject population
Riera et al. [18] | EEG signal | EER = 3.4 | 51 |
- | WS-based (shoe) | FRR = 6.9 at FAR = 0.02 | 36 |
Gafurov et al. [2] | WS-based (foot) | EER = {5-18.3} | 30 |
Ailisto et al. [6] | WS-based (waist) | EER = 6.4 | 36 |
Rong et al. [7] | WS-based (waist) | EER = {5.6, 21.1} | 21 |
Sprager and Zazula [5] | WS-based (hip) | Rr = 93.1 | 6 |
Gafurov et al. [4] | WS-based (hip) | EER = 13 | 100 |
Gafurov and Snekkenes [9] | WS-based (wrist) | EER = {10-15} | 30 |
Gafurov et al. [8] | WS-based (pocket) | EER = {7.3-20} | 50 |
Everitt and McOwan [15] | mouse | GAR = 99 at FAR = 4.4 | 41 |
Schulz [16] | mouse | EER = {11.2-24.3} | 72 |
Ahmed and Traore [17] | mouse | EER = 2.46 | 22 |
Hocquet et al. [12] | keystroke | EER = 1.8 | 15 |
Clarke and Furnell [13] | keystroke (on phone) | EER = 12.8 | 32 |
Hosseinzadeh and Krishnan [14] | keystroke | EER = 4.4 | 41 | age range 18-65 (average 30.1) (30+11)
Campisi et al. [34] | keystroke | EER = 13 | 30 | age range 25-40
Uhl and Wild [11] | ballprint/footprint | FRR = 1.56 at FAR = 0 | 32 |
Yamada et al. [26] | hipprint | FRR = 9.2 at FAR = 1.9 | 20 | ?
Akkermans et al. [25] | acoustic ear | EER = {1.5-7} | 31 |
Zhang et al. [29] | tongue | GAR = 93.3 at FAR = 2.9 | 134 | age range 20-49 (89+45)
Zhang et al. [30] | tongue | EER = 4.1, P1 = 95 | 174 | age range 20-49 (115+59) (young and middle-aged)
Choras [28] | lip | P1 = 82 | 38 |
Kumar and Ravikanth [27] | fingerback surface | EER = {1.39-5.81} | 105 | age range 18-60 (mostly 20-25)

4 Challenges and Opportunities

4.1 Challenges
There are several issues that need to be addressed before emerging biometrics can really find their way into real applications. These include:
Subject population coverage:
In most of the current studies, the experimental subjects are drawn mainly from the adult population. This limits performance generalization to this group. There is not much research on emerging biometrics with small/young children or elderly people. It is evident that most of the aforementioned emerging biometrics change significantly as an individual grows from childhood to adulthood and from adulthood to old age. Some of them are not even applicable to young/old populations; for example, small children are not accustomed to using a keyboard or mouse, while muscles weaken in old age. Furthermore, no FTE or FTC is reported in the studies. Thus, emerging biometrics need to be studied with these two populations, and approximate age borders of their applicability then need to be identified. It is worth noting that the population coverage issue is not restricted to emerging biometrics; it applies to traditional biometrics (e.g., face) as well. This can be one of the big challenges for biometrics in the years ahead, owing to the aging of the earth's population (especially in the developed countries) [35].
Benchmark database:
Unlike conventional biometric modalities, for which well-established benchmark databases are available (e.g., the FVC200x databases for fingerprint, the USF data set for video-based gait [36], etc.), there are hardly any benchmark databases for emerging biometrics (to the best of our knowledge, only the dataset of [37] for keystroke biometrics). Although various studies of emerging biometrics have reported encouraging performance results, most of them are based on restricted, home-collected databases. In other words, data collections are not conducted under similar conditions/environments, which limits direct comparison of algorithms, and most of the collected data sets are not public due to privacy regulations. Therefore, in order to advance the area of emerging biometrics, the availability of publicly accessible databases is essential.
Hardware/Sensor:
Some types of emerging biometrics require special kinds of sensors for capturing biometric data. Such sensors can be expensive or inconvenient in daily usage. For instance, collecting the EEG signals of the brain requires sensors placed around the head. Unless such sensors are integrated with clothing (e.g., caps), data collection is considered inconvenient and obtrusive, which is likely to result in low user acceptance.
Security:
Many reported papers on emerging biometrics model the impostor as passive. In other words, an impostor score is generated by comparing a normal biometric sample of one subject to a normal biometric sample of another subject. This type of performance evaluation is referred to as the friendly scenario [4]. However, in real-life settings the assumption of a passive impostor is not adequate. The impostor needs to be modeled as active, and performance evaluation needs to be conducted in the hostile scenario [4]. In the hostile scenario, attackers can perform actions to increase their acceptance chances (e.g., by observing and training on typing behaviour or walking style), or they may even possess some knowledge of vulnerabilities in the authentication technique. For instance, in gait biometrics it appears that knowing the gender of users in the database can help attackers increase their acceptance chances [38]. Thus, it is important to study the robustness of emerging biometrics against active and motivated attackers, in order to answer questions such as whether it is possible to learn someone else's typing rhythm or mouse usage through observation and training, how much effort is required to achieve this, etc.
In addition to the above, another widely discussed topic for conventional biometrics in general, and emerging biometrics in particular, is privacy. Since the technology is based on human biology, it is claimed that various types of illness can be identified from a biometric sample. Some approaches have been proposed to address privacy aspects in biometrics [39, 40].
4.2 Opportunities
Once the identified challenges and open issues of emerging biometrics are properly addressed, they can bring advantages and benefits over traditional biometrics. Although emerging biometrics cannot replace conventional biometrics, this does not prevent them from serving as a complementary source of identity information. In addition, emerging biometrics can have the following advantages and benefits.
Hardware/Sensor:
In the previous subsection we pointed out that a special sensor requirement can be a limitation for a few types of emerging biometrics. However, several other types of emerging biometrics do not require additional hardware for capturing the biometric characteristic, since such hardware is already part of the system per se. For instance, in the case of keystroke and mouse dynamics, the keyboard and mouse are already standard input devices of any computer, and most mobile phones (except touch-screen ones) have a keyboard. Likewise, various motion detecting and recording sensors have been integrated into some models of mobile phones; e.g., Apple's iPhone [41] has an accelerometer sensor for detecting the orientation of the phone.
Liveness: One of the challenges of traditional biometrics is assuring that a biometric sample is coming from a live person, i.e., liveness detection. For example, some fingerprint scanners cannot distinguish whether the presented finger is fake or real. Many of the mentioned emerging types implicitly assure that signals are coming from a live person; e.g., no heart/brain signals can be generated by a dead person.
Application scenarios: One of the main motivations of emerging biometrics is to be better suited to certain application scenarios or environments, such as ubiquitous and pervasive computing. Some of them are suitable for continuous authentication; for instance, when motion recording sensors are integrated into a mobile phone, the identity of the phone's owner can be verified periodically throughout phone usage. A minimal sketch of such periodic re-verification follows.
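The sketch below is hypothetical and not taken from any cited system: it keeps a sliding window of matcher scores and flags the session when the windowed average drops below a threshold; match_score stands in for a real gait or keystroke matcher.

    import random

    def match_score(sample):
        # placeholder: a real system would compare the sample against the owner's template
        return random.uniform(0.0, 1.0)

    def continuous_verify(sample_stream, threshold=0.5, window=5):
        # yield True while the average of the last `window` scores stays above threshold
        recent = []
        for sample in sample_stream:
            recent.append(match_score(sample))
            recent = recent[-window:]
            yield sum(recent) / len(recent) >= threshold

    for ok in continuous_verify(range(10)):
        if not ok:
            print("owner verification failed; lock the device")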
Conclusion
In this paper we gave an overview of emerging biometric modalities and techniques. We identified and discussed several limitations, challenges and open issues associated with emerging biometrics, including limited population coverage (i.e., only adults), lacking benchmark databases, security with respect to active impostors, and privacy concerns. In order for emerging biometrics to advance, these challenges need to be addressed. In addition, our short survey of performance indicates that the recognition accuracies of such biometrics are not yet high enough for them to be used alone. Despite all of this, emerging biometrics can serve as a complementary source of identity information. In addition, some of these biometrics do not require extra sensors and can be very suitable for continuous or periodic re-verification.
References
1. ISO/IEC IS 19795-1: Information technology, biometric performance testing and reporting, part 1: Principles and framework (2006)
2. Gafurov, D., Snekkenes, E.: Towards understanding the uniqueness of gait biometric. In: IEEE International Conference on Automatic Face and Gesture Recognition (2008)
3. Yamakawa, T., Taniguchi, K., Asari, K., Kobashi, S., Hata, Y.: Biometric personal identification based on gait pattern using both feet pressure change. In: World Automation Congress (2008)
4. Gafurov, D., Snekkenes, E., Bours, P.: Spoof attacks on gait authentication system. IEEE Transactions on Information Forensics and Security 2(3) (2007) (Special Issue on Human Detection and Recognition)
5. Sprager, S., Zazula, D.: Gait identification using cumulants of accelerometer data. In: 2nd WSEAS International Conference on Sensors, and Signals and Visualization, Imaging and Simulation and Materials Science (2009)
6. Ailisto, H.J., Lindholm, M., Mäntyjärvi, J., Vildjiounaite, E., Mäkelä, S.-M.: Identifying people from gait pattern with accelerometers. In: Proceedings of SPIE, Biometric Technology for Human Identification II, vol. 5779, pp. 7-14 (2005)
7. Rong, L., Jianzhong, Z., Ming, L., Xiangfeng, H.: A wearable acceleration sensor system for gait recognition. In: 2nd IEEE Conference on Industrial Electronics and Applications (ICIEA) (2007)
8. Gafurov, D., Snekkenes, E., Bours, P.: Gait authentication and identification using wearable accelerometer sensor. In: 5th IEEE Workshop on Automatic Identification Advanced Technologies (AutoID), Alghero, Italy, June 7-8, pp. 220-225 (2007)
9. Gafurov, D., Snekkenes, E.: Arm swing as a weak biometric for unobtrusive user authentication. In: IEEE International Conference on Intelligent Information Hiding and Multimedia Signal Processing (2008)
10. Jenkins, J., Ellis, C.S.: Using ground reaction forces from gait analysis: Body mass as a weak biometric. In: International Conference on Pervasive Computing (2007)
11. Uhl, A., Wild, P.: Personal identification using eigenfeet, ballprint and foot geometry biometrics. In: IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS) (2007)
12. Hocquet, S., Ramel, J.-Y., Cardot, H.: Fusion of methods for keystroke dynamic authentication. In: Fourth IEEE Workshop on Automatic Identification Advanced Technologies (2005)
13. Clarke, N.L., Furnell, S.M.: Authenticating mobile phone users using keystroke analysis. International Journal of Information Security, 1-14 (2006) ISSN: 1615-5262
14. Hosseinzadeh, D., Krishnan, S.: Gaussian mixture modeling of keystroke patterns for biometric applications. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews (2008)
15. Everitt, R.A.J., McOwan, P.W.: Java-based internet biometric authentication system. IEEE Transactions on Pattern Analysis and Machine Intelligence (2003)
16. Schulz, D.A.: Mouse curve biometrics. In: Biometric Consortium Conference (2006)
17. Ahmed, A.A.E., Traore, I.: A new biometric technology based on mouse dynamics. IEEE Transactions on Dependable and Secure Computing (2007)
18. Riera, A., Soria-Frisch, A., Caparrini, M., Grau, C., Ruffini, G.: Unobtrusive biometric system based on electroencephalogram analysis. EURASIP Journal on Advances in Signal Processing (2008)
19. Biel, L., Pettersson, O., Philipson, L., Wide, P.: ECG analysis: a new approach in human identification. In: 16th IEEE Instrumentation and Measurement Technology Conference (1999)
20. Biel, L., Pettersson, O., Philipson, L., Wide, P.: ECG analysis: a new approach in human identification. IEEE Transactions on Instrumentation and Measurement (2001)
21. Irvine, J.M., Israel, S.A.: A sequential procedure for individual identity verification using ECG. EURASIP Journal on Advances in Signal Processing (2009)
22. Boumbarov, O., Velchev, Y., Sokolov, S.: ECG personal identification in subspaces using radial basis neural networks. In: IEEE International Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (2009)
23. Fatemian, S.Z., Hatzinakos, D.: A new ECG feature extractor for biometric recognition. In: 16th International Conference on Digital Signal Processing (2009)
24. Micheli-Tzanakou, E., Plataniotis, K., Boulgouris, N.: Electrocardiogram (ECG) biometric for robust identification and secure communication. Biometrics: Theory, Methods, and Applications (2009)
25. Akkermans, A.H.M., Kevenaar, T.A.M., Schobben, D.W.E.: Acoustic ear recognition for person identification. In: Fourth IEEE Workshop on Automatic Identification Advanced Technologies (2005)
26. Yamada, M., Kamiya, K., Kudo, M., Nonaka, H., Toyama, J.: Soft authentication and behavior analysis using a chair with sensors attached: hipprint authentication. Pattern Analysis & Applications (2009)
27. Kumar, A., Ravikanth, C.: Personal authentication using finger knuckle surface. IEEE Transactions on Information Forensics and Security (2009)
28. Choras, M.: The lip as a biometric. Pattern Analysis & Applications (2010)
29. Zhang, D., Liu, Z., Yan, J.q., Shi, P.f.: Tongue-print: A novel biometrics pattern. In: 2nd International Conference on Biometrics (2007)
30. Zhang, D., Liu, Z., Yan, J.q.: Dynamic tongueprint: A novel biometric identifier. Pattern Recognition (2010)
31. bioChec, http://www.biochec.com/ (Online; accessed 13.04.2010)
32. Authenware Corp., http://www.authenware.com/ (Online; accessed 13.04.2010)
33. Plantiga Technologies Inc., http://www.plantiga.com/ (Online; accessed 13.04.2010)
34. Campisi, P., Maiorana, E., Lo Bosco, M., Neri, A.: User authentication using keystroke dynamics for cellular phones. IET Signal Processing (2009)
35. Mann, W.C.: The aging population and its needs. IEEE Pervasive Computing (2004)
36. Nixon, M.S., Tan, T.N., Chellappa, R.: Human Identification Based on Gait. Springer, Heidelberg (2006)
37. Giot, R., El-Abed, M., Rosenberger, C.: GREYC keystroke: A benchmark for keystroke dynamics biometric systems. In: IEEE 3rd International Conference on Biometrics: Theory, Applications, and Systems (2009)
38. Gafurov, D.: Security analysis of impostor attempts with respect to gender in gait biometrics. In: IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS), Washington, D.C., USA, September 27-29 (2007)
39. Yang, B., Busch, C., Gafurov, D., Bours, P.: Renewable minutiae templates with tunable size and security. In: 20th International Conference on Pattern Recognition (ICPR) (2010)
40. Bringer, J., Chabanne, H., Kindarji, B.: Anonymous identification with cancelable biometrics. In: International Symposium on Image and Signal Processing and Analysis (2009)
41. Apple's iPhone with integrated accelerometer, http://www.apple.com/iphone/features/index.html (Last visit: 09.04.2008)
Introduction
\[
X_i = \begin{cases} P + X_{i-1} & \text{if } X_{i-1} \in S_1, \\ 2X_{i-1} & \text{if } X_{i-1} \in S_2, \\ Q + X_{i-1} & \text{if } X_{i-1} \in S_3, \end{cases}
\]
with the coefficients $(a_i, b_i)$ of $X_i = a_i P + b_i Q$ updated accordingly $\pmod{n}$.
Since it is known that Pollard's original walk does not in fact achieve the performance of a random walk, Teske [7][8][9] studied some better random walks, such as modified walks, linear walks with $r$-adding, and combined walks. Throughout this paper we assume that $X_{i+1} = f(X_i)$ is a random walk.
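For concreteness, the following toy sketch implements a three-branch walk of exactly this shape. To stay short it works in the multiplicative group Z_p^* (solving g^d = h mod p) rather than on an elliptic curve; all parameters (p, n, g, h) are illustrative, and the invariant x = g^a * h^b mirrors X_i = a_i P + b_i Q.

    p, n = 1019, 509          # toy prime p; n = prime order of the subgroup generated by g
    g = 4                     # generator of the order-509 subgroup of Z_p^*
    h = pow(g, 123, p)        # "public key", analogue of Q = dP with d = 123

    def step(x, a, b):
        # one walk step; keeps the invariant x = g^a * h^b (mod p)
        if x % 3 == 0:        # class S1: analogue of X <- Q + X
            return (x * h) % p, a, (b + 1) % n
        if x % 3 == 1:        # class S2: analogue of X <- 2X
            return (x * x) % p, (2 * a) % n, (2 * b) % n
        return (x * g) % p, (a + 1) % n, b   # class S3: analogue of X <- P + X

    x, a, b = h, 0, 1         # start at X0 = Q, i.e. (a, b) = (0, 1)
    for _ in range(5):
        x, a, b = step(x, a, b)
        assert x == pow(g, a, p) * pow(h, b, p) % p   # invariant holds at every step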
Van Oorschot and Wiener [5] showed how Pollard's rho method can be parallelized with linear speed-up. In their method, a set $D$ of distinguished points of $E(\mathbb{F}_q)$ is selected. Each client calculates a sequence $\{X_i\}$ by a specified random walk until it finds an $X_i \in D$. This $X_i$ and its associated $(a_i, b_i)$ are submitted to the central server, and the client starts again from a new starting point. The central server stores all the submitted points until some point $X_i$ is received twice, at which time the private key $d$ is calculated. Van Oorschot and Wiener [5] proved that, assuming each processor has the same speed, the overall running time $T$ of the algorithm satisfies
\[
E[T] = \left( \frac{\sqrt{\pi n/2}}{m} + \frac{1}{\theta} \right) t, \tag{1}
\]
where $\theta$ is the proportion of distinguished points and $t$ is the time required on the client side to calculate the next point $X_i$ from $X_{i-1}$; the number of iterations required to obtain a distinguished point is geometrically distributed with mean $1/\theta$. That is, $N_{\theta,m}(\tau)$ has the Poisson distribution with parameter $\lambda\tau$, where $\lambda = \theta m/t$.
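The following self-contained sketch mirrors this client/server scheme in the same toy group as above (Z_p^* standing in for E(F_q)); it needs Python 3.8+ for the modular inverse via pow. A point is distinguished when its dist_k low-order bits are zero, so the proportion is 1/2^dist_k; the step bound, seed range, and parameters are illustrative choices, not the paper's setup.

    import random

    p, n, g = 1019, 509, 4
    h = pow(g, 123, p)                    # discrete log d = 123 to be recovered

    def step(x, a, b):
        if x % 3 == 0:
            return (x * h) % p, a, (b + 1) % n
        if x % 3 == 1:
            return (x * x) % p, (2 * a) % n, (2 * b) % n
        return (x * g) % p, (a + 1) % n, b

    dist_k = 3                            # theta = 1/2^3

    def client_run(seed):
        # walk from a random start point until a distinguished point is reached
        rng = random.Random(seed)
        a, b = rng.randrange(n), rng.randrange(n)
        x = pow(g, a, p) * pow(h, b, p) % p
        for _ in range(50 * (1 << dist_k)):
            if x % (1 << dist_k) == 0:    # distinguished: dist_k low bits are zero
                return x, a, b
            x, a, b = step(x, a, b)
        return None                       # rare: the walk cycled; discard this run

    server = {}                           # central list: distinguished point -> (a, b)
    for seed in range(100000):
        hit = client_run(seed)
        if hit is None:
            continue
        x, a, b = hit
        if x in server and server[x] != (a, b):
            a2, b2 = server[x]
            if (b - b2) % n:              # solvable collision: g^a h^b = g^a2 h^b2
                print("recovered d =", (a2 - a) * pow(b - b2, -1, n) % n)
                break
        server[x] = (a, b)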
Let $\tau_0, \tau_1, \ldots$ be given by
\[
\tau_0 = 0, \qquad \tau_j = \inf\{\tau : N_{\theta,m}(\tau) = j\}.
\]
Then $\tau_j$ is the time of the $j$-th arrival. The interarrival times are the random variables $I_1, I_2, \ldots$ given by $I_j = \tau_j - \tau_{j-1}$. It is well known that $I_1, I_2, \ldots$ are independent, each having the exponential distribution with parameter $\lambda$. Let $S(N)$ be the run-time of the search algorithm for $N$ elements in the central server; then communication overload is defined as follows: we say that a communication overload occurs if $I_j < S(j-1)$ for some positive integer $j$.
If the binary search algorithm is used in the central server, note that $S(N)$ grows only slowly in $N$, since the time complexity of the binary search algorithm is $O(\log_2 N)$.
Now we can derive a reasonable method to determine an optimal proportion from the results of a simulation attack on an elliptic curve with a smaller order than the original one. Let
\[
N(\bar{\theta}, \bar{m}) = \max\left\{ N : E[S(N)] \le E[\bar{I}] = \frac{\bar{t}}{\bar{\theta}\,\bar{m}} \right\},
\]
where $\bar{t}$ is the time taken to calculate one iteration of the random walk function for the ECDLP of $\bar{l}$-bit long order. Note that we can obtain $N(\bar{\theta}, \bar{m})$ by a simulation of only the search algorithm used on the server side.
Now it is possible to determine the optimal $\theta$ for the originally given ECDLP of $l$-bit long order with $m$ processors.
Theorem 1. Assume that the binary search algorithm is used for finding a collision in the central server. Then the optimal proportion $\theta^{*}$ of the distinguished point set which does not cause a communication overload is given by
\[
\theta^{*} = \max\left\{ \theta : \theta\sqrt{\pi n/2} \le \bigl(N(\bar{\theta}, \bar{m})\bigr)^{\bar{\theta}\bar{m}t/(\theta m \bar{t})} \right\}.
\]
Proof. If the binary search algorithm is used for finding a collision in the server, then by the definition of $N(\bar{\theta}, \bar{m})$ we can derive the following formulas:
\[
E[S(N(\bar{\theta},\bar{m}))] = \frac{\bar{t}}{\bar{\theta}\bar{m}}, \qquad
E[S(N(\bar{\theta},\bar{m}))]\cdot\frac{\bar{\theta}\bar{m}}{\bar{t}} = 1,
\]
and hence, since $S$ is logarithmic,
\[
E\Bigl[S\Bigl(\bigl(N(\bar{\theta},\bar{m})\bigr)^{\bar{\theta}\bar{m}t/(\theta m\bar{t})}\Bigr)\Bigr]\cdot\frac{\theta m}{t} = 1,
\qquad
E\Bigl[S\Bigl(\bigl(N(\bar{\theta},\bar{m})\bigr)^{\bar{\theta}\bar{m}t/(\theta m\bar{t})}\Bigr)\Bigr] = \frac{t}{\theta m}. \tag{2}
\]
We have to keep the inequality
\[
E\bigl[S\bigl(\theta\sqrt{\pi n/2}\bigr)\bigr] \le E[I] = \frac{t}{\theta m}. \tag{3}
\]
Since $S$ is increasing, (3) holds exactly when $\theta\sqrt{\pi n/2} \le \bigl(N(\bar{\theta},\bar{m})\bigr)^{\bar{\theta}\bar{m}t/(\theta m\bar{t})}$. Therefore it is reasonable that we choose the optimal $\theta$ as follows:
\[
\theta^{*} = \max\left\{ \theta : \theta\sqrt{\pi n/2} \le \bigl(N(\bar{\theta},\bar{m})\bigr)^{\bar{\theta}\bar{m}t/(\theta m\bar{t})} \right\}. \qquad\square
\]
For any fixed positive integer $k$, let $D_k$ be a distinguished subset of $E(\mathbb{F}_q)$ such that $X \in D_k$ if the $k$ most significant bits in the representation of $X$ as a binary string are zero. If the order $n$ of the point $P \in E(\mathbb{F}_q)$ is $l$-bit long, the proportion of $D_k$ is $2^{l-k}/2^{l} = 1/2^{k}$. If we select the set of all distinguished points as $D_k$, then by Theorem 1 we get the following fact.
Corollary 1. If we consider $D_k$ as the form of the set of all distinguished points, then the $k^{*}$ corresponding to $\theta^{*}$ is given by
\[
k^{*} = \min\left\{ k \in \mathbb{Z}^{+} : k \ge \frac{1}{2}\bigl(l - 1 + \log_2 \pi\bigr) - \frac{\bar{m}t}{m\bar{t}}\, 2^{\,k-\bar{k}} \log_2 N(\bar{\theta}, \bar{m}) \right\},
\]
where $\bar{k}$ is the positive integer corresponding to $\bar{\theta}$ and $D_{\bar{k}}$.
Proof. It is straightforward that $k^{*}$ is determined by the above formula, since $\theta = 1/2^{k}$ and $\bar{\theta} = 1/2^{\bar{k}}$ in Theorem 1. $\square$
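As a numeric illustration of Corollary 1, the sketch below searches for the smallest k satisfying the reconstructed inequality. The inputs mirror the experimental section (a 50-bit reference instance with k-bar = 10, N = 10,000, t-bar = 0.018 ms, and a 60-bit target with t = 0.020 ms and m = m-bar = 25); since the inequality is reconstructed from a garbled original, the output is indicative only.

    import math

    def optimal_k(l, t, m, k_bar, t_bar, m_bar, n_bar):
        # smallest k with k >= (1/2)(l - 1 + log2(pi)) - (m_bar*t/(m*t_bar)) * 2^(k - k_bar) * log2(n_bar)
        const = 0.5 * (l - 1 + math.log2(math.pi))
        for k in range(1, 64):
            rhs = const - (m_bar * t / (m * t_bar)) * 2 ** (k - k_bar) * math.log2(n_bar)
            if k >= rhs:
                return k
        return None

    print(optimal_k(l=60, t=0.020, m=25, k_bar=10, t_bar=0.018, m_bar=25, n_bar=10000))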
On the other hand, the efficiency of the search algorithm can be improved if we divide the storage space of the central server into partitions. Let $S_r(N)$ be the run-time of the search algorithm for $N$ elements when the storage space is divided into $r$ partitions of equal size. Then we know that $S_r(rN) \approx S_1(N) = S(N)$. This relation produces the following fact.
Corollary 2. If the storage space of the central server is divided into $r$ partitions of equal size, then the $k^{*}$ corresponding to $\theta^{*}$ is given by
\[
k^{*} = \min\left\{ k \in \mathbb{Z}^{+} : k \ge \frac{1}{2}\bigl(l - 1 + \log_2 \pi\bigr) - \frac{\bar{m}t}{m\bar{t}}\, 2^{\,k-\bar{k}} \bigl(\log_2 N(\bar{\theta},\bar{m}) + \log_2 r\bigr) \right\}.
\]
Proof. If we divide the storage space into $r$ partitions of equal size, $N(\bar{\theta},\bar{m})$ is increased to $rN(\bar{\theta},\bar{m})$. Thus, by the same argument as in the proof of Theorem 1, we obtain
\[
\theta^{*} = \max\left\{ \theta : \theta\sqrt{\pi n/2} \le \bigl(rN(\bar{\theta},\bar{m})\bigr)^{\bar{\theta}\bar{m}t/(\theta m\bar{t})} \right\},
\]
which, with $\theta = 1/2^{k}$ and $\bar{\theta} = 1/2^{\bar{k}}$, is equivalent to the formula above. $\square$
Experimental Results
To validate our algorithm, we conducted experiments on a computer cluster with 50 processors (3.2 GHz Xeon CPUs and 2.0 GB RAM) connected by Cisco-Linksys SRW2024 24-port switches. The messaging API was MPI. We fixed an ECDLP of 50-bit ($l = 50$) long order and $m = 25$ in order to find $\bar{k}$ in the first step of Algorithm 1. We obtained $\bar{k} = 10$ from the average of 100 independent running-time results, as shown in Figure 1, where $\bar{t} = 0.018$ msec. In the second step of Algorithm 1, we got $N(\bar{\theta}, \bar{m}) = 10{,}000$ by testing the search algorithm used on the server side. Then, by the third step of the algorithm, we obtain the result that $k^{*} = 12$ ($\theta^{*} = 1/2^{12}$) is the optimal value for an ECDLP
Fig. 1. Performance of parallel collision search attacks on an ECDLP of 50-bit long order according to the proportion of distinguished points
Fig. 2. Performance of parallel collision search attacks on an ECDLP of 60-bit long order according to the proportion of distinguished points
of 60-bit long order, where $t = 0.020$ msec. Figures 1 and 2, respectively, show the average performance of 30 independent trials of the parallel Pollard rho attack for ECDLPs of 50-bit and 60-bit long order with respect to the proportion of distinguished points. It can be seen that $\theta = 1/2^{10}$ and $\theta = 1/2^{12}$ are indeed the best-performing proportions in the two cases.
Conclusion
We have dealt with a practical issue in the implementation of the parallel collision search attack on the ECDLP. To perform this attack, each processor repeatedly finds a distinguished point and adds it to a single common list. Thus, the proportion of distinguished points must be determined cautiously by considering both the practical running time and the communication overload. In this paper we proposed a practical method to determine an optimal proportion of distinguished points by taking account of both communication overload and computational efficiency under a given implementation environment. Our result differs from the previous theoretical one.
References
1. Certicom ECC Challenge, www.certicom.com
2. Gallant, R., Lambert, R., Vanstone, S.: Improving the parallelized Pollard lambda search on binary anomalous curves. Mathematics of Computation 69, 1699-1705 (1999)
3. Koblitz, N.: Elliptic curve cryptosystems. Mathematics of Computation 48, 203-209 (1987)
4. Miller, V.: Uses of elliptic curves in cryptography. In: Williams, H.C. (ed.) CRYPTO 1985. LNCS, vol. 218, pp. 417-426. Springer, Heidelberg (1986)
5. van Oorschot, P., Wiener, M.: Parallel collision search with cryptanalytic applications. Journal of Cryptology 12, 1-28 (1999)
6. Pollard, J.: Monte Carlo methods for index computation (mod p). Mathematics of Computation 32, 918-924 (1978)
7. Teske, E.: Better random walks for Pollard's rho method. Research Report CORR 98-52, Department of Combinatorics and Optimization, University of Waterloo, Canada (1998)
8. Teske, E.: Speeding up Pollard's rho method for computing discrete logarithms. In: Buhler, J.P. (ed.) ANTS 1998. LNCS, vol. 1423, pp. 541-554. Springer, Heidelberg (1998)
9. Teske, E.: On random walks for Pollard's rho method. Mathematics of Computation 70, 809-825 (2000)
10. Wiener, M., Zuccherato, R.: Faster attacks on elliptic curve cryptosystems. In: Tavares, S., Meijer, H. (eds.) SAC 1998. LNCS, vol. 1556, pp. 190-200. Springer, Heidelberg (1999)
1 Center of Excellence in Information Assurance (CoEIA), King Saud University, Saudi Arabia
2 Faculty of Computer Sciences, Institute of Business Administration, Karachi, Pakistan
3 Namal College, Mianwali, Pakistan
4 Department of Management, School of Business, University of Sharjah
5 Department of Information Systems, College of Computer and Information Science, King Saud University, Saudi Arabia
syedirfan@ksu.edu.sa
Abstract. The amount of data communication is increasing each day, and with it come the issues of assuring its security. This paper explores information security management issues with respect to confidentiality and integrity, and the impact of Information Security Management Standards, Policies and Practices (ISMSPP) on information security. The research was conducted on the telecommunication industry of Pakistan, which was ranked 9th globally in 2009 in terms of subscriptions. The research methodology was case-study based, in which perceptions were gathered and thematic analysis of the interviews was performed. The research focuses on breaches of data integrity and confidentiality by internal users in the industry, and on the perception of improvement, if any, in data security due to the implementation of security management policies and controls. The results show that information security measures are perceived to have a positive impact on reducing data confidentiality and integrity breaches, but still fall short of what is required. It concludes that security policies might improve the situation provided, firstly, that top management takes information security seriously, and secondly, that the non-technical, human aspects of the issue are taken into consideration.
Keywords: Information security, confidentiality, integrity, telecommunications, policies and practices, human.
1 Introduction
The telecommunications industry has grown tremendously in the last couple of decades. With this rapid expansion, the issues of securing communications have grown as well. Three major characteristics of information have remained the basis of all information security: confidentiality, integrity and availability. Information security
2 Background
2.1 Information Security
Information security is concerned with preserving the important characteristics of information, among which the three basic ones are confidentiality, integrity and availability.
2.1.1 Data Confidentiality, Integrity and Availability
Data confidentiality means making data available only to those who are approved to have it, while data integrity is defined as ensuring that the data is complete and accurate [2]. It is also defined as ensuring that data cannot be corrupted or modified and that transactions cannot be altered [3]. Data availability may be defined as data and information systems always being ready when needed and performing as required [4]. If the important human aspect is left out, technical controls alone may not provide effective and complete security [5].
2.2 Information Security and Telecommunications Industry
Telecommunication enables easy access to and dissemination of information. However, no telecommunication system or technology has yet been able to guarantee that
the information available or provided is complete and has not been tampered with
either intentionally or accidentally [6].
2.3 Information Security Problems and Issues in Telecommunications
Telecommunication systems have evolved into very complex structures. These systems interconnect with various other such systems to provide wider telecommunications coverage to users. This augmented complexity has given rise to new categories of data integrity risks [7]. Also, while expanding telecommunications, companies usually focus on introducing new technologies and systems. The haste in implementing these upgraded and improved technologies and systems often leads to neglect of information security concerns. The companies are usually so concerned about time-to-market and capturing new opportunities that they disregard the data integrity issues which can potentially arise [6].
Table 1. Telecommunications companies of Pakistan and their market shares

Brand(s) | Operating Since | Fixed line or cellular | Market Share (as of May 2010)
Mobilink | 1994 | Cellular | 32.64%
Zong (formerly Paktel) | 1991 | Cellular | 6.69%
Telenor | 2006 | Cellular | 24.14%
Warid | 2006 | Cellular | 16.98%
Ufone | 2001 | Cellular | 19.55%
Instaphone | 1991 | Cellular | ~0.001%
PTCL | 1947 | Fixed line | 95.82% (as of March 2009)
NTC | 1995 | Fixed line | 2.96% (as of March 2009)
to customers, employees or others. Processing of personal data should be limited to what is needed for operational purposes, efficient customer care, relevant commercial activities and proper administration of human resources.
It may be seen that the majority of the ISMSPP in vogue at the different companies, as given in Table 2, are primarily based on ISO 17799 or ISO 27001. Although none of the companies were certified (as of the first quarter of 2009, when the research was conducted), they know the importance of these standards and use them as guidelines.
Table 2. Information security standards and policies followed by some of the leading telecommunications companies in Pakistan, based on the data collected through interviews

Company | Standard Followed | Other Guidelines | Remarks
PTCL | ISO 17799 | In-house developed by Security Unit | Has a complete separate IS* department
Ufone | ISO 17799 | In-house developed by security group | A wholly owned subsidiary of PTCL
Warid | ISO 27001 | IS* employees trained on ISO 27001 | Reported to have a full-fledged IS* department, which doesn't exist now
Arfeen | PTA guidelines | In-house developed policies | No dedicated IS* setup
Telenor | ISO 27001 | ISO 27006 | Has an ISO 27001 certified 100% owned subsidiary
NTC | ISO 17799 | - | IS* in initial stages; has a long way to go

IS* awareness activities reported by the companies include in-house training and regular training.
* - Information Security.
The authors do not know of any related study on the telecommunications industry of Pakistan; it is thus important to identify the confidentiality and integrity issues of this industry in Pakistan and to assess the role of standards, policies and practices in addressing them.
3 Research Question
The problems this research has tried to address are, firstly, information security issues in two areas, 1) confidentiality and 2) integrity, and secondly, the role of ISMSPP in solving these issues. More specifically, the researchers were interested in finding out what the data confidentiality and integrity issues in the telecommunications industry of Pakistan are, and how effective the implementation of self-imposed ISMSPP has been in ensuring security. This was done to empirically verify the role of ISMSPP, as suggested by Siponen [16].
4 Research Methodology
Since the population was very small - only eight companies - a case study approach was used to elicit value perceptions and to perform thematic analysis based on interviews. The major issues found in the literature were used to develop the interview questions. These were further refined by discussing them with telecom experts. All eight companies in the target group were approached, and two persons from each organization were interviewed: one for the perception of value (a total of 8) and the other for thematic analysis (a total of 8). The interviews consisted of both open-ended and close-ended questions and were about 30 minutes long. It was decided not to record the interviews because the respondents felt that they would not be able to give candid and straight answers, or to discuss issues in depth with particular examples, if the interviews were recorded. Certain comments and discussions were termed "off-line", to be used only for better understanding by the researchers and not to be quoted. Since it is a very small industry with only a handful of senior executives and technical staff, in order to maintain the confidentiality of the respondents not only their names but also their designations are kept undisclosed. The interviews were conducted face-to-face, except for the companies not based in Karachi, the largest city of Pakistan, where the research was primarily conducted; for companies outside Karachi, telephone interviews were conducted. The research was conducted in the first quarter of 2009.
any information security breach; yet these were considered very effective. Perhaps it is the risk of being caught even after the incident that is seen as a potent deterrent preventing employees from getting involved in any unacceptable behavior. Based on deterrence theory, this is an indication that the policies may assist in behavior modification if there is certainty and severity of punishment associated with them. Also, biometric access control is considered effective in controlling threats, but at the same time a few respondents (2 out of 8) did not agree with this. This may be because biometric technology is relatively new in Pakistan and people are still not comfortable with it. The technology adoption model might be used to research this issue further.
The implementation of ISMSPP was perceived to have a positive effect on information security, but the effect was perceived as more pronounced in improving integrity than confidentiality of information. As given in Figure 1, it was perceived that confidentiality breaches have not decreased considerably after the implementation of ISMSPP, but they have nevertheless lessened for some companies.
Table 3. Perception of data integrity breaches before & after the implementation of information security policies, practices & standards
[Rows: Before Implementation, After Implementation. Columns: Always, Very Often, Sometimes, Rarely, Never, Total. Cell values not preserved.]
Table 6. Recommendation
Recommendations
1. In order to establish effective information security it is imperative that the top management provides all
out support in the form of budget, finances, resources so as to send a clear message that they mean to
establish information security in the organization.
2. Technical measures represent the seriousness of top management in establishing information security.
3. ISMS, policies, practices are very important in getting the whole organization onboard regarding the
criticality of information security.
4. The most sophisticated, state-of-the-art technical measures and the most stringent and comprehensive policies written in black and white cannot ensure the effectiveness of information security unless the humans are taken on board and treated humanely.
5.3 Limitations
The authors could not find any relevant previous research with which to compare their results, but future research can benefit from these results, as a comparative analysis can then be made. The WLL and LDI segments of the telecommunications industry were not considered since they were nascent at the time, but future research might include them as well.
References
1. United Nations: Glossary of Recordkeeping Terms
2. ISO: ISO 17799
3. Weise, J.: Public Key Infrastructure Overview (2001), http://www.sun.com/blueprints/0801/publickey.pd
4. The ShockwaveWriter's Reply To Donn Parker. Computer Fraud & Security 2001, 19-20 (2001)
5. Myler, E.E., Broadbent, G.: ISO 17799: Standard for Security. The Information Management Journal (2006)
6. Richman, S.H., Pant, H.: Reliability Concerns for Next-Generation Networks. Bell Labs Technical Journal 12, 103-108 (2008)
7. Prnjat, O., Sacks, L.E.: Integrity methodology for interoperable environments. IEEE Communications Magazine 37, 126-132 (1999)
8. Kramer, J.: The CISA prep guide: mastering the certified information systems auditor exam. John Wiley and Sons, New Jersey (2003)
9. Höne, K., Eloff, J.H.P.: Information security policy - what do international information security standards say? Computers & Security 21, 402-409 (2002)
10. Visser, W., Matten, D., Pohl, M., Tolhurst, N.: The A to Z of corporate social responsibility. John Wiley and Sons, New Jersey (2008)
11. Siponen, M.: Information Security Standards Focus on the Existence of Process, Not Its Content. Communications of the ACM 49, 97-100 (2006)
12. Bodin, L.D., Gordon, L.A., Loeb, M.P.: Information security and risk management. Communications of the ACM 51, 64-68 (2008)
13. Albrechtsen, E.: A qualitative study of users' view on information security. Computers & Security 26, 276-289 (2006)
14. von Solms, B.: Information Security - A Multidimensional Discipline. Computers & Security 20, 504-508 (2001)
15. von Solms, B., von Solms, R.: The 10 deadly sins of information security management. Computers & Security 23, 371-376 (2004)
16. Siponen, M., Willison, R.: Information security management standards: Problems and solutions. Information & Management 46, 267-270 (2009)
17. Kenneth, R.: Telecom industry: mobile firms getting 2 million subscribers every month (2008), http://www.dailytimes.com.pk/default.asp?page=2008\03\07\story_7-3-2008_pg5_9
18. Londesborough, R., Feroze, J.: Pakistan telecommunications report. Business Monitor International, London (2008)
19. Pakistan - Key Statistics, Telecom Market and Regulatory Overviews. Totel Pty Ltd. (2010)
20. Global Wireless Data Market - 2009 Update. Chetan Sharma Consulting (2010)
21. Mobile Cellular Services. Pakistan Telecommunications Authority (2010)
22. Fixed Line Services. Pakistan Telecommunications Authority (2010)
23. Telecommunication Rules 2000 (2000)
24. Codes of Conduct, Group Governance Document (2010), http://www.telenor.com.pk/cr/pdf/newCOC.pdf
Introduction
It is very difficult to prevent all the attacks that exploit weaknesses of a system. In addition, applying patches to compensate for vulnerabilities is not sufficient to prevent attackers from attacking computers. Therefore, secure OSes have attracted attention as a solution to these problems. A secure OS provides mandatory access control (MAC) and least privilege so that minimal damage occurs even if root privilege is obtained by an attacker. A secure OS may be based on labels, on path names, or on other mechanisms. There are several methods for implementing a secure OS, and the functions of these methods differ. Therefore, it is not easy to select a secure OS that is appropriate for a user's environment. In addition, the consequences of introducing a secure OS are not clear, and changes in specifications and performance between versions are difficult to determine.
Since Linux 2.6, the functionality of a secure OS has been implemented through a hooking mechanism that calls a function group named Linux Security Modules (LSM) [1].
We paid attention to LSM and implemented a performance evaluation mechanism for LSM-based secure OSes, named the LSM Performance Monitor (LSMPMON) [2]. The LSMPMON records the processing time and the calling count at each LSM hook point. Therefore, we can evaluate the processing time of each hook and locate the bottlenecks in a secure OS. The monitor enables us to easily compare the performance achieved by different secure OSes.
The LSMPMON has been developed for evaluating LSM-based secure OSes. In this paper, a new version of the LSMPMON developed for Linux 2.6.30 is described, and the results of evaluating Security-Enhanced Linux (SELinux) [3][4], TOMOYO Linux [5][6], and LIDS [7], which are representative secure OSes for Linux, are presented. The unit of access control for resources is label-based MAC in SELinux, path-name-based MAC in TOMOYO Linux, and i-node-based MAC in LIDS. The LSMPMON can be used to evaluate the influence of different methods of resource identification on performance. We evaluate the performance using benchmark software and report an analysis of the overhead incurred by each secure OS and each LSM hook. We perform the same evaluation for different versions of the kernel, namely, the current kernel (2.6.30) and an old kernel (2.6.19), in order to verify the changes in performance. As a result, we clarify the influence of a secure OS on performance, the characteristics attributable to the different methods of resource identification, and the changes in performance between kernel versions.
The contributions of this paper are as follows:
(1) In this paper, a new version of the LSMPMON is described. All evaluation methods for LSM-based secure OSes in this paper rely on the LSMPMON. The LSMPMON can record the processing time and the calling count of each LSM hook. The results are useful for analyzing the performance of secure OSes.
(2) The differences between secure OSes are reported, based on the evaluation results obtained with the LSMPMON. The reports clarify the relation between the access control method and the performance of a secure OS.
(3) The differences between kernel versions of Linux are described. These results reveal that the performance of a secure OS strongly depends on the kernel version.
2 Security Focused OS

2.1 Evaluation of Secure OS
A secure OS is an OS that has functions to achieve MAC and least privilege. In a secure OS, a security policy is enforced, according to which operations are limited to permitted operations that consume permitted resources; furthermore, access control is applied even to root privilege. Therefore, a secure OS can limit an attacker's operations even if the attacker obtains root privileges via unauthorized access.
In addition, a secure OS can restrict every user and process to the minimum privilege. We evaluated secure OSes developed using LSM and implemented on Linux; in particular, we used SELinux, TOMOYO Linux, and LIDS. These secure OSes (refer to Figure 1) are used to explain the differences between their features and resource identification schemes. The resource identification methods used in these secure OSes are based on a label, a path name, and an i-node number, respectively. In the example, the Web server is the subject performing the access: its path name is /usr/sbin/httpd, its i-node number is 123, and it carries the label httpd_t. The resource accessed as an object has the path name /var/www/index.html, the i-node number 456, and the label web_contents_t.
SELinux is included in the standard Linux kernel as a secure OS. SELinux offers increased safety since it is based on TE (Type Enforcement), MAC, and RBAC (Role-Based Access Control). In SELinux, the label-based method is used for resource identification; control is achieved by assigning a label called a domain to a process and a label called a type to a file. In the example shown in Figure 1, the process to which the domain httpd_t has been assigned reads the file that was assigned the type web_contents_t.
TOMOYO Linux is also included in the standard Linux kernel as a secure OS. In TOMOYO Linux, the path-name-based method is used for resource identification. In the example shown in Figure 1, /usr/sbin/httpd reads /var/www/index.html. In addition, the identification of a process is based on its path name and the execution history of the process.
LIDS uses path names for configuring access control. On the other hand, it uses i-node numbers internally to manage resources.
2.2 LSM
LSM is a framework that defines hooks, inserted into the Linux kernel, which call the registered functions of a security check mechanism. A user can thereby extend the security check functions of the kernel. Since Linux 2.6, LSM has been incorporated into the kernel, and the functions of a secure OS are often implemented using LSM. When LSM is enabled, safety is checked before an object inside the kernel is accessed, by means of the LSM callback functions registered by the user. The structure of LSM is as follows: when an application invokes a system call, DAC checks are performed first; next, the corresponding LSM hook is called to perform an additional security check.
3 LSMPMON

3.1 Function
Enable LSMPMON
% echo 1 > /sys/kernel/security/lsmpmon/control
Disable LSMPMON
% echo 0 > /sys/kernel/security/lsmpmon/control
...
...
In the example output shown in Figure 3, min is the shortest processing time, max is the longest processing time, ave is the average processing time without a context switch, count is the calling count without a context switch, and cs count is the calling count with a context switch.
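As a small post-processing aid, the sketch below aggregates per-hook statistics from LSMPMON output; the whitespace-separated "name min max ave count cs_count" layout is our assumption based on the fields just described, not a documented format.

    def parse_lsmpmon(text):
        # return {hook name: stats dict}, estimating total time as ave * count
        stats = {}
        for line in text.strip().splitlines():
            fields = line.split()
            if len(fields) != 6:
                continue                  # skip headers or malformed lines
            name, mn, mx, ave, count, cs = fields
            stats[name] = {"min": float(mn), "max": float(mx), "ave": float(ave),
                           "count": int(count), "cs_count": int(cs),
                           "total": float(ave) * int(count)}
        return stats

    sample = "inode_permission 0.01 1.20 0.035 3588000 120\npath_mknod 0.50 30.10 13.609 897000 45"
    for name, s in sorted(parse_lsmpmon(sample).items(), key=lambda kv: -kv[1]["total"]):
        print(name, "total ~%.0f us over %d calls" % (s["total"], s["count"]))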
4
4.1
We evaluated the secure OSes on the basis of the following three criteria, in order to determine the performance of each secure OS and the effects of using different Linux kernels:
(A) The effect of introducing the secure OS on performance;
(B) The characteristics resulting from differences in the unit of access control for resources in each secure OS;
(C) The changes in performance and specifications across different versions.
4.2 Evaluation Methods
the results. Therefore, we compare the overhead in detail and show the changes in performance and specifications across versions. The evaluation environment was as follows: CPU, Pentium 4 (3.0 GHz); memory, 1 GB; OS, Linux 2.6.30.4 (new kernel) and Linux 2.6.19.7 (old kernel). In addition, all the measurements were obtained by running LMbench five times, and the results show the average processing time. The identification method of each secure OS, the objects on which access control is performed, the calling counts of the LSM hooks, and the secure OS versions are shown in Table 1 and Table 2.
4.3
Table 3 shows the results of Evaluation 1. In SELinux, the rates of increase in the processing time of stat, read/write, and 0K file creation are the highest among the three secure OSes. In addition, for the other items of Table 3, the rate of increase in processing time is comparatively high. Thus, file processing is thought to involve a large overhead.
The processing time for the stat operation in TOMOYO Linux is the shortest among the three secure OSes. On the other hand, file creation and deletion are much slower in this kernel than in the old kernel, and a large overhead is incurred in open/close. Thus, the overhead may become particularly large when, for example, an email server repeatedly creates files.
The processing times of file creation and deletion in LIDS are shorter than in the two other secure OSes, and the rate of increase in processing time is small for the other items of Table 3. Therefore, LIDS is thought to be the most suitable of the three for file processing.
4.4
Table 3. Processing time of file operations and its increase rate in Linux kernel 2.6.30, measured by LMbench (unit: μs)

Operation | Normal | SELinux | TOMOYO | LIDS
stat | 1.79 | 2.67 (48%) | 1.83 (2%) | 2.20 (22%)
open/close | 2.74 | 3.94 (44%) | 4.78 (75%) | 3.28 (20%)
read/write | 0.37 | 0.47 (29%) | 0.37 (0%) | 0.37 (0%)
0K file create | 15.16 | 58.18 (283%) | 53.58 (253%) | 16.84 (11%)
0K file delete | 8.20 | 9.26 (12%) | 33.10 (303%) | 9.91 (21%)
10K file create | 47.84 | 89.38 (87%) | 83.32 (74%) | 48.42 (1%)
10K file delete | 20.06 | 20.18 (0.6%) | 42.68 (112%) | 21.16 (5%)
of the label and to initialize each i-node that is newly created. In TOMOYO Linux, the processing time increases because of the path-name-based functions, such as path_mknod and path_unlink. It is thought that, in order to check access permissions, it is necessary to obtain the path name and compare character strings.
In LIDS, there is no need to obtain the path name or to determine a label in the LSM hooks. Thus, the total processing time consumed by the LSM hooks is short. This is thought to be the reason why the overhead of the whole secure OS is small.
From the above evaluations, the following results were obtained:
(1) In SELinux, the overhead of creating files is large, because labeling is necessary. Further, the other items have a relatively large overhead.
(2) In TOMOYO Linux, where resources are identified by path name, the overhead is particularly large when files are created and deleted, since the path name must be obtained. In addition, the open/close operation is slow because it is necessary to reference the path name (compare character strings).
(3) LIDS has a small overhead in file processing because it does not need to perform labeling or to obtain path names.
4.5
Table 5. Processing time of file operations and its increase rate in Linux kernel 2.6.19, measured by LMbench (unit: μs)

Operation | Normal | SELinux | TOMOYO | LIDS
stat | 2.64 | 4.34 (64%) | 2.63 (0%) | 2.79 (6%)
open/close | 3.98 | 6.45 (62%) | 9.33 (134%) | 3.96 (0%)
read/write | 0.58 | 1.49 (157%) | 0.58 (-1%) | 0.58 (-1%)
0K file create | 11.76 | 42.80 (264%) | 16.36 (39%) | 14.00 (19%)
0K file delete | 6.25 | 11.22 (80%) | 9.61 (54%) | 6.67 (7%)
10K file create | 34.34 | 69.26 (102%) | 37.78 (10%) | 35.48 (3%)
10K file delete | 16.14 | 19.28 (19%) | 19.02 (18%) | 16.90 (5%)
In SELinux, the rates of increase in the processing time for stat, read/write, and 0K file create are still the highest among the three secure OSes.
In TOMOYO Linux, the performance of file creation and deletion in the current kernel has greatly deteriorated compared to the old kernel. On the other hand, the rate of increase in the processing time for open/close has been reduced.
Compared to the old kernel, the processing time of LIDS has increased for many items. However, there is no item whose processing time increases drastically. In addition, the processing times for file creation and deletion are considerably shorter than the corresponding times for the other two secure OSes.
Detailed evaluation of the overhead in each system call:
On the basis of the results of Evaluation 3, we compared the old and new kernels in detail, especially for the items in which their performance differs greatly. The details of Evaluation 4 are provided below.
(Evaluation 4-1) File deletion in SELinux (close system call)
(Evaluation 4-2) File creation in TOMOYO Linux (creat system call)
Using the system call information of the LSMPMON, we evaluated the processing time consumed by the LSM hooks (μs) for each system call. As in the previous evaluations, we performed the evaluation by running LMbench five times.
Table 6 lists the results of Evaluation 4-1. A comparison of the new kernel with the old kernel shows that the processing time of the LSM hooks has increased for many items. However, the processing time of the sock_rcv_skb hook has been significantly reduced. In the old kernel, this LSM hook had an especially large overhead; in the new kernel its processing time has decreased, resulting in a reduction of the overall processing time.
Table 7 lists the results of Evaluation 4-2. The main overhead in the old kernel is inode_create, and the main overheads in the new kernel are path_mknod and dentry_open. Both inode_create and path_mknod are LSM hooks used for acquiring the path name. In the old kernel, TOMOYO Linux used its own implementation inside inode_create to acquire the absolute path from a relative path. In the new kernel, the absolute path name is obtained from the root directory by path_mknod; therefore, the overhead of this LSM hook function increases. In addition, dentry_open is an LSM hook that is used to reduce the number of times a policy search has to be carried out; however, this function in itself incurs a large overhead. These changes in the performance of the LSM hooks result in a large overhead for the creat system call. We
Table 6. Close system call in different versions of the SELinux kernel (unit: μs)

2.6.30: [only the per-hook average times are preserved: 0.133, 0.201, 0.100, 0.120, 0.278, 0.109; average sum 0.942]

2.6.19
hookname | sum | count | ave
file_free | 166246.612 | 1591046 | 0.104
inode_free | 2380.437 | 14669 | 0.162
inode_delete | 62.058 | 1074 | 0.058
sock_rcv_skb | 11027.765 | 14583 | 0.756
task_free | 3.907 | 31 | 0.126
Table 7. Creat system call in different versions of the TOMOYO Linux kernel (unit: μs)

2.6.30
hookname | sum | count | ave
inode_permission | 127585.395 | 3588000 | 0.035
file_alloc | 31367.404 | 897000 | 0.035
inode_create | 31240.724 | 897000 | 0.035
inode_alloc | 31425.973 | 897000 | 0.035
inode_init_security | 33046.422 | 897000 | 0.035
d_instantiate | 32385.181 | 898000 | 0.036
path_mknod | 12207508.304 | 897000 | 13.609
dentry_open | 22372903.197 | 897000 | 24.942
cred_free | 0.138 | 2 | 0.069
average sum | | | 38.830

2.6.19
hookname | sum | count | ave
inode_permission | 366962.831 | 8124000 | 0.045
file_alloc | 77799.981 | 2031000 | 0.038
inode_create | 5971299.965 | 2031000 | 2.940
inode_alloc | 72487.431 | 2031000 | 0.036
inode_init_security | 74019.938 | 2031000 | 0.036
d_instantiate | 72418.537 | 2031000 | 0.035
task_free | 2.593 | 8 | 0.324
task_kill | 0.402 | 2 | 0.201
arrive at the following conclusions on the basis of the above evaluation results.
(1) In SELinux, the overhead of file deletion decreased because of a decrease in the rate of increase in the processing time of specific LSM hooks. In addition, the rate of increase in processing time decreased for all items except file creation, i.e., the performance improved.
(2) In TOMOYO Linux, the rate of increase in the processing time for file creation and deletion is very large. This is because the function for obtaining the path name, which is the unit of access control, changed from TOMOYO's own implementation to an implementation using LSM hooks. In addition, the rate of increase in the processing time for open/close decreased.
(3) In LIDS, the rate of increase in processing time has generally increased; however, for file creation and deletion it is still the smallest among the three secure OSes.
Conclusion
References
1. Wright, C., Cowan, C., Smalley, S., Morris, J., Kroah-Hartman, G.: Linux Security Modules: General Security Support for the Linux Kernel. In: Proceedings of the 11th Annual USENIX Security Symposium, pp. 17-31 (2002)
2. Matsuda, N., Satou, K., Tabata, T., Munetou, S.: Design and Implementation of Performance Evaluation Function of Secure OS Based on LSM. The Institute of Electronics, Information and Communication Engineers, Trans. J92-D(7), 963-974 (2009)
3. NSA: Security-Enhanced Linux, http://www.nsa.gov/selinux/
4. Loscocco, P., Smalley, S.: Integrating Flexible Support for Security Policies into the Linux Operating System. In: Proceedings of the FREENIX Track: 2001 USENIX Annual Technical Conference, pp. 29-42 (2001)
5. TOMOYO Linux, http://tomoyo.sourceforge.jp/
6. Harada, T., Handa, T., Itakura, Y.: Design and Implementation of TOMOYO Linux. IPSJ Symposium Series 2009(13), 101-110 (2009)
7. LIDS, http://www.lids.org/
8. LMbench, http://www.bitmover.com/lmbench/
9. LSM Performance Monitor, http://www.swlab.cs.okayama-u.ac.jp/lab/yamauchi/lsmpmon/
Abstract. Illegal or unintentional file transmission of important data is a sensitive and central security issue in embedded and mobile devices. Within restricted resources, such as small memory size and low battery capacity, a simple and efficient method is needed to reduce the effort of preventing this illegal activity. Therefore, we discuss a protection technique that takes these considerations into account. In our method, sample bytes are extracted from an important file and then used to prohibit illegal file transfer and modification. To prevent attackers from easily predicting the positions of the sample bytes, they are selected at random locations distributed evenly across the whole extent of the file. To avoid a large increase in the number of sample bytes, the candidate sampling area size is chosen carefully after an analysis of the lengths and number of files. Also, considering the computational overhead of calculating the number and positions of the sample bytes to be selected, we propose three types of sampling methods. We show the evaluation results of these methods and recommend a proper sampling approach for embedded devices with low computational power. This technique has the advantages that data leakage can be prevented effectively and the device can be managed securely with low overhead.
Keywords: sample bytes selection, illegal leakage prevention, embedded device security.
1 Introduction
As embedded devices become more popular and widely used, they have gained more powerful functionality than before. These advanced embedded devices, usually called mobile devices, are no longer simple auxiliary devices; as their capabilities increase, they provide strong computing power for various services such as e-mail, business document processing, banking, and entertainment. A user of a mobile device therefore often keeps important information (e.g., contact lists, certificates, and so on) in its internal storage. If the user does not pay attention to these data, they may be leaked by malicious programs such as viruses and worms. Usually the leakage occurs through a file copy operation from internal to external storage. To solve this problem, much research has been done in various areas such as DRM and watermarking. These techniques, however, need not only considerable computational capability but also additional applications to read or write the modified documents containing such high-valued information. Therefore, a new approach is required that handles unintentional leakage without additional applications and at minimum cost.
This paper introduces an effective approach to prohibiting illegal outflow and shows how to select the SB (sample bytes) needed to do this correctly. Our main idea is to detect unauthorized file copies without the pre-modification performed in the DRM or watermarking cases. The selected and extracted SB is used to distinguish the file from other files in the middle of the file copy process. We suggest three sampling approaches according to how the SB are selected and how many are selected, and we evaluate those methods to find the best one.
In Section 2 we review other research on protecting or hiding important information. In Section 3 we describe the strategies of the proposed methods, the considerations behind SB selection, and the mechanisms. In Section 4 we explain how these methods were evaluated and show the results. Finally, Section 5 contains some concluding remarks.
2 Background
Nowadays, the rapid proliferation of mobile devices and mobile networks has prompted the need to offer security for them [7,8]. Personal digital assistants (PDAs), mobile phones (smartphones), laptops, and tablet personal computers (PCs) are classified as mobile devices. Compared to non-mobile devices, they have several special properties, including small memory capacity and battery power [2,3]. These limitations must be considered when developing security functions for mobile devices.
There have been studies concerned specifically with security enhancement approaches for mobile devices using information hiding techniques [8,9,10]. These hiding approaches involve pre-modification work that performs cryptographic operations or inserts hidden markers. The best known is DRM [14]; others are steganography, watermarking, and fingerprinting [8,11,12]. However, these methods require modifications of the original message in order to hide or insert a secret message in it, and they incur more computational overhead to do this work. Consequently, an additional application is needed to convert the modified message back to the original before using the file. Therefore, without hardware support such as a special-purpose co-processor chip, these overheads are a big burden for a mobile device because intensive mathematical calculation is needed.
There are also other approaches focused on making and using a special byte stream called a signature or a hash value [2,4,5]. A hash value such as a CRC or MD5 is a byte stream calculated after scanning the whole message or file. If even a bit or a few bytes of the original data change, the value changes and must be regenerated, so it has an advantage in detecting updates or modifications involving small amounts of variation. But it is not suitable for detecting copy or transfer operations to an external device; it is better at catching internal file operations such as renaming. A signature, sometimes called a detection signature or virus pattern, is a subset byte stream extracted from some part of the whole message or file. In this case, after analyzing network packets or a virus program, the signature is selected and extracted from some region so that it uniquely distinguishes the file or packet from others, in the form of the original byte stream. Without the overhead of cryptographic conversion or hidden mark creation, we can easily detect an illegal copy by comparing the signature with the transferred byte stream. Thanks to this advantage, it is more suitable than the other approaches. However, since signature generation is only possible for a well-known virus or pattern, it is difficult to use when no information about the file is known in advance. Therefore, a new protection method is required to overcome the issues described above.
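Detection with such a signature is then a plain byte comparison against the outgoing stream. A minimal sketch in Python, assuming the signature is kept as (offset, byte) pairs in the form the SB method of Section 3 produces:

```python
def matches_sb(stream: bytes, sb_pairs) -> bool:
    """Return True when every sampled byte matches the candidate
    stream, i.e. the transferred data appears to be the protected
    file. `sb_pairs` is an iterable of (offset, byte value) pairs."""
    return all(offset < len(stream) and stream[offset] == value
               for offset, value in sb_pairs)
```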
into several SB sampling regions, called windows, and select the SB within each window. In this selection method, which we call full sampling (FS), there is a high possibility of obtaining too many SBs because they are extracted in every window. To prevent this, we select the SB not in all windows but only in some of them. These windows are decided by binomial sampling (BS) and dynamic sampling (DS). As BS is based on binomial distribution theory [13], it has a run-time mathematical computation overhead. In DS, SBs are sampled at the peak positions of the binomial distribution curves, generated by the mathematical function, using a static sampling table (SST) that holds pre-computed values of this function.
3.3 Mechanism for SB Selection
As described before, we use three kinds of SB sampling. Although the three methods are similar as a whole, they differ in selecting the windows from which SBs are sampled or extracted. In short, FS takes SBs from every window, while the others take them from only a few windows. The overall SB selection process consists of setting the window size, choosing the windows to sample, and random sampling within each chosen window. The window size is determined according to the size distribution of the files. After the size is determined, SBs are taken at random positions in the selected windows. In this process, if a duplicated SB is taken, it has to be re-extracted at another random position until there is no collision. For this purpose, all extracted SBs are stored in an SB table (SBT). To describe the window choosing process more concretely, assume that a certain file has 10 windows and we are trying to extract SBs from it. FS takes all 10 windows. In the case of BS, since the selected windows are determined by calculating the mathematical function of the binomial distribution, anywhere from one to 10 windows may be taken; because many optional parameters affect BS, it is very hard to predict the number of windows exactly. Finally, DS selects just one window, the one holding the maximum number of SBs among the windows selected by BS. For example, if 4 out of 10 windows are selected with 2, 5, 4, and 3 SBs respectively, DS takes the second window and extracts five SBs. Although DS is similar to BS, it differs in that SBs are extracted from only one of the selected windows and the SST is used for window selection instead of mathematical computation.
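A minimal Python sketch of the three strategies follows. The excerpt does not give the paper's exact binomial window-selection rule or per-window SB counts, so a Bernoulli draw with probability p and illustrative counts stand in for them here.

```python
import random

def sample_window(data, start, end, count, sbt):
    """Draw `count` (offset, byte) pairs at random positions in
    data[start:end], re-drawing on collision with the SB table."""
    picks = []
    while len(picks) < count:
        pos = random.randrange(start, end)
        if pos in sbt:              # duplicated SB: re-extract
            continue
        sbt.add(pos)
        picks.append((pos, data[pos]))
    return picks

def select_sbs(data, window_size, method="FS", p=0.4, per_window=1):
    """Return the SB table entries for one file under FS, BS, or DS.
    `p` and the DS candidate counts are illustrative stand-ins."""
    windows = [(s, min(s + window_size, len(data)))
               for s in range(0, len(data), window_size)]
    sbt, out = set(), []
    if method == "FS":                         # every window
        chosen = [(w, per_window) for w in windows]
    elif method == "BS":                       # Bernoulli stand-in for
        chosen = [(w, per_window)              # binomial window selection
                  for w in windows if random.random() < p]
    else:                                      # DS: single best BS window
        bs = [(w, 1 + random.randrange(5))
              for w in windows if random.random() < p]
        chosen = [max(bs, key=lambda t: t[1])] if bs else []
    for (start, end), count in chosen:
        out += sample_window(data, start, end, count, sbt)
    return out
```

For example, select_sbs(open(path, 'rb').read(), 40 * 1024, "DS") samples one window of a hypothetical file using 40 KB windows, the lower end of the range recommended in Section 4.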
4 Evaluations
In this section we show the results of the evaluation of our methods. We evaluated our solution by quantifying the overhead of the sampling time for each method described above. After an analysis based on a security policy, 43 files containing important information were selected from among many files on mobile devices. During the evaluation, we measured the computational overhead while changing the window size from 4 KB to 4 MB. All measurements were made on a 2.67 GHz Intel PC with 3 GB RAM, running Windows XP.
For the evaluation, the size distribution of the files must be gathered in advance, and then the overhead tests are run for each method. Fig. 1 shows the two results: the file size distribution and the execution times. According to the distribution, four window sizes are chosen from 4 KB to 4 MB, each value 10 times larger than the previous one, starting from 4 KB. As shown in Fig. 1, FS and BS take much more time than DS to get SBs at small window sizes, but they take similar time to DS at large window sizes. The smaller the window we choose, the higher the accuracy of identifying an important file during transfer to an external device, but the larger the number of SBs. FS and BS are efficient for small windows, and DS works well for large windows. As the cost increases sharply at 40 KB, we recommend choosing the window size between 40 KB and 400 KB.
5 Conclusions
In this paper, SB sampling was proposed to prevent unintentional data leakage. To get SBs from important files, three methods were introduced according to their selection strategies, and we showed how they extract SBs as the window size changes, together with the evaluation results. From this work, we found that obtaining proper SBs depends strongly on the window size, and the size must be fixed after analyzing the file length distribution. We also found that the recommended window size lies between the average file length and one tenth of it. After the window size has been decided within this range, the SB selection mechanism is chosen among FS, BS, and DS according to the computational power of the device. Through these mechanisms and simple byte comparison using the SB, transferred blocks containing sensitive information can be blocked efficiently. Thanks to the simplicity of the method and the minimized computational overhead of SB selection, it is a more suitable prevention method for embedded devices than the alternatives. We therefore plan further research on a minimum SB sampling method and on dynamically changing the window size at run time. We will also study how to ensure high accuracy with very few SBs.
Acknowledgment
This work was supported by the IT R&D program of MKE/KEIT [10035708, The
development of CPS (Cyber-Physical Systems) core technologies for high confidential autonomic control software].
References
1. Smith, T.F., Waterman, M.S.: Identification of Common Molecular Subsequences. J. Mol. Biol. 147, 195–197 (1981)
2. Shi, Z., Ji, Z., Hu, M.: A Novel Distributed Intrusion Detection Model Based on Mobile Agent. In: ACM InfoSecu '04, pp. 155–159 (2006)
3. Yong-guang, Z., Wenke, L., Yi-an, H.: Intrusion Detection Technique for Mobile Wireless Networks. ACM MONET, pp. 545–556 (2003)
4. Deepak, V.: An Efficient Signature Representation and Matching Method for Mobile Devices. In: Proceedings of the 2nd Annual International Workshop on Wireless Internet, vol. 220 (2006)
5. Geetha, R., Delbert, H.: A P2P Intrusion Detection System Based on Mobile Agents. In: ACM ACME 2004, pp. 185–195 (2004)
6. National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov
7. Yogesh Prem, S., Hannes, T.: Protecting Mobile Devices from TCP Flooding Attacks. In: ACM MobiArch 2006, pp. 63–68 (2006)
8. Benjamin, H.: Mobile Device Security. In: ACM InfoSecCD Conference 2004, pp. 99–101 (2004)
9. Ingemar, J., Ton, K., Georg, P., Mathias, S.: Information Transmission and Steganography. In: Barni, M., Cox, I., Kalker, T., Kim, H.-J. (eds.) IWDW 2005. LNCS, vol. 3710, pp. 15–29. Springer, Heidelberg (2005)
10. David, C., Sebastian, H., Pasquale, M.: Quantitative Analysis of the Leakage of Confidential Data. Electronic Notes in Theoretical Computer Science 59(3) (2003)
11. Christian, C.: An Information-Theoretic Model for Steganography. In: Aucsmith, D. (ed.) IH 1998. LNCS, vol. 1525, pp. 306–318. Springer, Heidelberg (1998)
12. Dan, B., James, S.: Collusion-Secure Fingerprinting for Digital Data. IEEE Transactions on Information Theory 44(5) (September 1998)
13. Binomial Distribution, http://en.wikipedia.org/wiki/Binomial_distribution
14. Digital Rights Management, http://en.wikipedia.org/wiki/Digital_rights_management
Abstract. The OSGi platform provides a Java-based open standard programming interface that enables communication and control among devices at home. Service-oriented, component-based software systems built using OSGi are extensible and adaptable, but they entail new types of security concerns. Security concerns in OSGi platforms can be divided into two basic categories: vulnerabilities in Java cross-platform (or multi-platform) technology and vulnerabilities in the OSGi framework. This paper identifies a new OSGi platform-specific security vulnerability, called a service injection attack, and proposes two mechanisms of protection against this newly identified security risk in the OSGi framework.
Keywords: Service injection, OSGi, Security.
1 Introduction
In today's dynamic service deployment environment, the distribution of large, fully integrated software packages is diminishing. Applications or components provided by third parties are dynamically and remotely loaded to be deployed into a system [1]. The OSGi [2] framework, which resolves code dependencies at run time in addition to installing and updating components without requiring a reboot, plays an important role in such a runtime-extensible component distribution and management environment.
Service-oriented, component-based software systems built using OSGi are extensible and adaptable, but they entail new types of security concerns. Several research efforts have been devoted to security in service-oriented programming environments [3-5]. Security concerns in OSGi platforms can be divided into two basic categories: vulnerabilities in Java cross-platform (or multi-platform) technology and vulnerabilities in the OSGi framework. Examples of security vulnerabilities in Java's cross-platform execution environments are a denial-of-service attack that attempts to make a computer resource (CPU and memory) unavailable to its intended users and a data modification attack that involves reordering of static variables as well as padding between them. To defend against such attacks, a mechanism that isolates and manages Java resources per process is used [6-9].
A recent work [10] identified 25 security vulnerabilities by analyzing existing open source OSGi platforms and suggested safeguards against 17 OSGi-specific security
vulnerabilities. In [11], the use of digital signatures (i.e., digitally signing the bundles and verifying the signatures) was proposed to ensure the integrity of vulnerable components.
This paper identifies a new OSGi platform-specific security vulnerability, called a service injection attack, that was not addressed in [10]. A service injection attack can arise in the course of component updates. The paper proposes two mechanisms of protection against this newly identified security risk in the OSGi framework.
The rest of this paper is organized as follows. Section 2 describes the principal structure of OSGi platforms and their security vulnerabilities. Section 3 identifies a new security vulnerability in OSGi that has not been discussed so far and proposes mechanisms for preventing this attack. In Section 4, the proposed mechanisms are implemented using Knopflerfish, an open source OSGi service platform [12], and compared with other open source OSGi implementations. Finally, conclusions are given in Section 5.
2.2
In [10], various security vulnerabilities that can occur in existing open source OSGi platforms were identified, and countermeasures were suggested. Table 1 shows attacks that can be performed through malicious OSGi bundles.
Table 1. Vulnerabilities and countermeasures in OSGi
Layer 1:
Attack: Disrupting normal bundle installation by redundantly importing packages. Countermeasure: prohibit redundant imports.
Attack: Making disk space full by installing a large-sized bundle. Countermeasure: limit the size of a bundle to be downloaded.
Attack: Making the bundle management utilities inoperable by placing an infinite loop (a repetition statement that never terminates) in the start method of the activator class when the bundle is started.
Attack: Disrupting normal service registration by registering too many services.
in the service registry resulting from updates of bundle A by retrieving ServiceReference objects again from the service registry. Here, the service that is eventually bound to service requesters is M-S, registered by bundle M, not A-S.
As shown in Fig. 1, an attack that replaces a normal service with a malicious one can occur during bundle updates. This attack is called a service injection attack. This paper proposes two mechanisms to defend against it.
While the bundle is being updated, its state is UPDATING. In the UPDATING state, services are provided normally, so there is no change from the perspective of service requesters. When the JAR file of the new bundle version is downloaded and installed, the information about the previous version of the bundle, along with the priority information of the services registered by the previous version, is added to the bundle context. When the state of the newly installed bundle changes to ACTIVE, services are registered based on the stored service priority information. Once the state of the updated bundle becomes ACTIVE, the state of the previous version of the bundle is changed to STOP. That is, the previous version of the bundle continues to provide its services until the update to the new version is entirely completed. In this way, the proposed mechanism can prevent malicious services from being injected during bundle updates. The modified bundle lifecycle is depicted in Fig. 3.
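To make the attack and the defense concrete, the following is a toy Python model, not the Knopflerfish Java implementation; the registry, bundle names, and priority values are all illustrative. The first half reproduces the injection window of the conventional stop-update-restart sequence; the second half mimics the enhanced update, in which the previous version keeps its registration and stored priority until the new version becomes ACTIVE.

```python
class Registry:
    """Toy service registry: one service interface, many providers."""
    def __init__(self):
        self.providers = []                    # (priority, bundle, impl)

    def register(self, priority, bundle, impl):
        self.providers.append((priority, bundle, impl))

    def unregister(self, bundle):
        self.providers = [p for p in self.providers if p[1] != bundle]

    def lookup(self):
        # Requesters bind to the highest-priority provider.
        return max(self.providers)[2] if self.providers else None

# Conventional update: stop -> update -> restart leaves a window in
# which bundle A's service is gone and a malicious bundle can win.
reg = Registry()
reg.register(1, "A-v1", "A-S")                 # normal service
reg.unregister("A-v1")                         # A-v1 stopped for update
reg.register(1, "M", "M-S")                    # bundle M injects M-S
assert reg.lookup() == "M-S"                   # requesters now get M-S

# Enhanced update: A-v1 stays registered (UPDATING state), and the
# priority stored in the bundle context lets A-v2's service outrank
# the late malicious registration.
reg2 = Registry()
reg2.register(2, "A-v1", "A-S")                # priority preserved
reg2.register(1, "M", "M-S")                   # injection attempt
reg2.register(2, "A-v2", "A-S2")               # new version ACTIVE
reg2.unregister("A-v1")                        # old version stopped last
assert reg2.lookup() == "A-S2"                 # normal service survives
```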
4 Experiments
The proposed protection mechanisms against the service injection attack were implemented by modifying Knopflerfish, an open source OSGi implementation. The implementation of the protection mechanism for signed bundles utilizes SF-Jarsigner, which supports the publication of signed bundles: signing them and loading them onto a public repository [11].
SF-Jarsigner checks the validity of bundle signatures in four steps.
If the signature block file, the digest value in the manifest file, and the digest value in the signature file that are examined during the validation of bundle signatures changed whenever the bundle was updated, it would not be worth checking their validity. This paper assumes that the authentication information of the signer who signs the bundle is issued by a certification authority (CA) and remains unique and identical after bundle updates. Based on this assumption, the digital signature is used to authenticate the signer; i.e., it is possible to verify that a service is registered by a bundle signed by the original bundle publisher.
To implement the proposed mechanisms, the BundleImpl class that implements the Bundle interface in Knopflerfish is modified to add the UPDATING state to the bundle lifecycle. In the conventional bundle update process, the current version of the bundle is stopped, the bundle is updated, and the updated version is restarted. In the enhanced update mechanism proposed in this paper, the updated bundle is restarted first, and then the previous version of the bundle is stopped. To add service priorities to the service registry, in which ArrayList and HashMap are used to assemble the various OSGi services, a new priority property is added to the ServiceRegistrationImpl class. The Collections.sort function is used to sort the assembled services.
For comparison with the implemented mechanisms, the existing open source OSGi implementations Equinox [13], Felix [14], and Knopflerfish are evaluated in terms of their resilience against service injection attacks during bundle updates. For the experiments, the following four bundles are implemented.
Normal Bundle: A bundle that provides only the service interface.
NormalImpl Bundle: This bundle has service implementations and registers normal services in the service registry. It returns a "good" message when its services are called.
MaliciousImpl Bundle: This bundle has service implementations and registers malicious services. It returns a "malicious" message when its services are invoked.
Target Bundle: This bundle calls services for use. It creates threads in the start method of the Activator class to call services periodically.
The order of bundle installation is Normal, NormalImpl, MaliciousImpl, and Target. The execution order is NormalImpl, MaliciousImpl, and Target. When all bundles are in the ACTIVE state, a request to update the NormalImpl bundle is made. The messages output by the services that the Target bundle calls after the bundle update are examined to determine whether the existing open source OSGi implementations can defend against the service injection attack. Table 2 compares the proposed mechanisms for preventing the service injection attack with the existing open source OSGi implementations.
Table 2. Evaluation against existing open source OSGi implementations
Platform type          Result
Equinox 3.6
Felix 2.05
Knopflerfish 3.0
Proposed mechanism
Subject: Service Injection
Entity: Malicious Bundle
Location: Target
Temporal point: Updating bundle
5 Conclusion
This paper identifies a security threat in the OSGi framework associated with malicious service injection during bundle updates and proposes two mechanisms to defend against this threat. The first protection mechanism distinguishes normal services from malicious services using the bundle signature information added at service registration. The other mechanism enhances the conventional bundle update process by adding the UPDATING state to the bundle lifecycle. The proposed mechanisms are evaluated in comparison with existing open source OSGi platforms. The experimental results show that the proposed mechanisms are more resilient against service injection attacks.
Acknowledgments
This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2010-C1090-1031-0004).
References
1. Royon, Y., Frénot, S.: Multiservice home gateways: business model, execution environment, management infrastructure. IEEE Communications Magazine 45(10), 122–128 (2007)
2. OSGi Alliance: OSGi service platform, core specification release 4.2, release 03 (2010)
3. Binder, W.: Secure and Reliable Java-Based Middleware - Challenges and Solutions. In: 1st International Conference on Availability, Reliability and Security (ARES), pp. 662–669. IEEE Computer Society, Washington (2006)
4. Parrend, P., Frénot, S.: Classification of component vulnerabilities in Java service oriented programming platforms. In: Chaudron, M.R.V., Ren, X.-M., Reussner, R. (eds.) CBSE 2008. LNCS, vol. 5282, pp. 80–96. Springer, Heidelberg (2008)
5. Lowis, L., Accorsi, R.: On a classification approach for SOA vulnerabilities. In: Proc. IEEE Workshop on Security Aspects of Process and Services Engineering (SAPSE). IEEE Computer Press, Los Alamitos (2009)
6. Czajkowski, G., Daynès, L.: Multitasking without compromise: a virtual machine evolution. In: Proceedings of the Object Oriented Programming, Systems, Languages, and Applications Conference, Tampa Bay, USA, pp. 125–138. ACM, New York (2001)
7. Geoffray, N., Thomas, G., Folliot, B., Clément, C.: Towards a new Isolation Abstraction for OSGi. In: Engel, M., Spinczyk, O. (eds.) The 1st Workshop on Isolation and Integration in Embedded Systems (IIES 2008), pp. 41–45. ACM, New York (2008)
8. Gama, K., Donsez, D.: Towards Dynamic Component Isolation in a Service Oriented Platform. In: Lewis, G.A., Poernomo, I., Hofmeister, C. (eds.) CBSE 2009. LNCS, vol. 5582, pp. 104–120. Springer, Heidelberg (2009)
9. Geoffray, N., Thomas, G., Muller, G., Parrend, P., Frénot, S., Folliot, B.: I-JVM: a Java Virtual Machine for Component Isolation in OSGi. Research Report RR-6801, INRIA (2009)
10. Parrend, P., Frénot, S.: Security benchmarks of OSGi platforms: toward hardened OSGi. Software: Practice and Experience 39(5), 471–499 (2009)
11. Parrend, P., Frénot, S.: Supporting the secure deployment of OSGi Bundles. In: First IEEE WoWMoM Workshop on Adaptive and Dependable Mission and Business Critical Mobile Systems, Helsinki, Finland (2007)
12. Knopflerfish OSGi - Open Source OSGi service platform, http://knopflerfish.org/
13. Equinox, http://www.eclipse.org/equinox
14. Apache Felix, http://felix.apache.org/site/index.html
15. Howes, T.: The String Representation of LDAP Search Filters. IETF RFC 2254, Network Working Group (1997)
16. Sun Microsystems Inc.: JAR file specification. Sun Java Specifications (2003), http://java.sun.com/j2se/1.5.0/docs/guide/jar/jar.html
Introduction
The number of computers connected to networks has increased with the widespread use of the Internet. In addition, the number of reports of software vulnerabilities has been increasing every year. This increase in the number of incidents of software vulnerability can be attributed to the widespread use of automated attack tools and the increasing number of attacks against systems connected to the Internet [1]. Therefore, various defense mechanisms against such attacks have been studied extensively, and these studies have gained a lot of attention.
Firewalls; intrusion detection systems (IDS) [2]; buffer overflow protection; access control mechanisms such as Mandatory Access Control (MAC) and Role-Based Access Control (RBAC) [3]; and secure OSes are examples of such defense mechanisms.
The secure OS [4] has been the focus of several studies. In particular, Security-Enhanced Linux (SELinux) has become of major interest. Even if a privileged account is compromised, a secure OS minimizes the range of influence. However, the CPU resource, which is an important resource for executing a program, is not an object of this access control. As a result, such OSes cannot control the CPU
usage ratio. For example, a secure OS cannot prevent attackers from carrying out DoS attacks that affect the CPU resources. In general, OSes can only limit the maximum CPU time for each process, not the proportion of CPU time allocated to processes.
In an earlier study, we proposed a new type of execution resource that controls the maximum CPU usage so that the abuse of CPU resources can be prevented [5,6]. In order to prevent the abuse of CPU resources, we proposed an execution resource that can limit the upper bound of CPU usage. The previously proposed mechanism can control only one process at a time. Because most services involve multiple processes, the mechanism should control all the processes involved in each service. In this paper, we propose an improved mechanism for bounding the execution performance of a process group, in order to limit unnecessary processor use. The proposed mechanism is based on the previously proposed one and introduces an execution tree that deploys the upper bound of the execution resource over the nodes of the tree.
Execution Resource
In this section, we explain the concept of the execution resource on the basis of the presentation in previous papers [5,6].
2.1 Overview
There are two types of execution resources. One is execution with performance
and the other is execution with priority.
Fig. 1. Time slots and a time block
Fig. 2. An execution tree (Service A)
Fig. 3. Two process group executions
represents the degree of CPU usage for a process group. A leaf is called a leaf execution; every leaf execution is linked to a process.
The total CPU time assigned to the leaf executions equals the CPU time assigned to their parent directory execution. The degree of CPU usage of a leaf execution indicates either a priority or a ratio (%) of the CPU time assigned to the parent directory execution; for a ratio, the assignment of the parent directory execution is defined as 100%. The depth of an execution tree can be greater than one. As a result, it is possible to create a process group within another process group.
Fig. 3 shows a case where more than one execution is linked to a process group. When a second execution (B1a) is linked to a process group, leaf executions (D2b, E2b, F2b) have to be created and linked to each process (A, B, C) in the process group. As a result, each process within the process group is linked to two leaf executions.
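To make the ratio semantics concrete: a leaf execution's effective share of the machine is the product of the ratios along its path, since each directory execution's assignment counts as 100% for its children. A minimal sketch, with tree values that are illustrative rather than taken from the figures:

```python
def effective_share(tree, path_share=1.0):
    """Multiply ratios down an execution tree; leaves map to processes.
    Each ratio is a fraction of the parent's assignment (parent = 100%)."""
    shares = {}
    for name, (ratio, children) in tree.items():
        share = path_share * ratio
        if children:
            shares.update(effective_share(children, share))
        else:
            shares[name] = share
    return shares

# Illustrative tree: Service A gets 50% of the CPU; its children split
# that assignment (40% + 30% + 20% + 10% = 100% of the 50%).
tree = {"A1a": (0.50, {"A2a": (0.40, {}), "B2a": (0.30, {}),
                       "C2a": (0.20, {}), "D2a": (0.10, {})})}
print(effective_share(tree))
# {'A2a': 0.2, 'B2a': 0.15, 'C2a': 0.1, 'D2a': 0.05}
```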
2.4
3
3.1
The operations on execution resources are as follows.
creat_execution(mips): Create the execution specified by mips and return the execution identifier execid. When mips is between 1 and 100, it signifies the performance-regulation execution degree (as a percentage, with the performance of the processor itself taken to be 100 percent); when it is 0 or negative, it signifies the priority execution degree (the absolute value is the process priority).
delete_execution(execid): Delete the execution execid.
attach_execution(execid, pid): Associate the execution execid with the process pid.
detach_execution(execid, pid): Remove the association between execution execid and process pid.
wait_execution(pid, chan): Forbid the assignment of processor time to process pid and its associated executions; this puts the process in the WAIT state.
wakeup_execution(pid, chan): Make it possible to assign CPU time to process pid and its associated executions; this puts the process in the READY state.
dispatch(pid): Run process pid.
control_execution(execid, mips): Change the execution degree of execid to mips; mips is interpreted as in creat_execution.
two or more programs that demand infinite CPU time run simultaneously. In this case, the performance of the service deteriorates significantly.
To prevent the abuse of CPU resources, we propose an execution resource that enforces an upper bound on the CPU usage ratio. With this execution resource, CPU time is allocated according to priority until the usage reaches a specified ratio within a time slice. When it reaches the specified ratio, the state of the currently running process is changed to WAIT until the current time slice expires. Even if a process linked to an execution whose CPU usage is limited by an upper bound suffers a malicious attack, the execution system can prevent the program from using excessive CPU time. Moreover, execution resources can be grouped by user or by service, so the CPU usage ratio of a user or a service can be specified. As a result, the impact of a DoS attack can be contained within the process group even if a new child execution is created, because the execution belongs to the same group.
As described in previous papers [5,6], we can guarantee that important processes are carried out effectively by giving them execution resources with good performance.
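A toy simulation of this policy may help. Within one time block, the highest-priority group that has not yet reached its specified ratio receives the next slot; a group that hits its cap is effectively in the WAIT state until the block expires. Slot counts, priorities, and caps below are illustrative, not from the paper.

```python
def run_block(groups, slots_per_block):
    """Simulate one time block. Each group maps to (priority, cap_ratio).
    Returns the number of slots actually consumed per group."""
    used = {g: 0 for g in groups}
    caps = {g: int(cap * slots_per_block) for g, (_, cap) in groups.items()}
    for _ in range(slots_per_block):
        # Pick the highest-priority group that has not hit its cap
        # (a capped group waits until the block ends).
        ready = [g for g in groups if used[g] < caps[g]]
        if not ready:
            break                      # all groups capped: CPU idles
        g = max(ready, key=lambda g: groups[g][0])
        used[g] += 1
    return used

# Service B (the attacker) is capped at 50% despite top priority,
# so Service A still receives its share of the block.
groups = {"SA": (1, 1.00), "SB": (9, 0.50)}
print(run_block(groups, slots_per_block=10))   # {'SA': 5, 'SB': 5}
```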
3.2
In the previous mechanism, the execution resource with an upper bound was a leaf execution. Thus, the previous mechanism could control only a single process. We introduce the directory execution as an execution resource with an
Fig. 4. Process flow of the process scheduler: on a timer interrupt the scheduler searches for the highest-priority execution; (4) if a directory execution is found, (5) it searches that directory's leaf executions and, on success, (6) runs the selected process
upper bound. To this end, the process scheduler was changed to control an execution resource with an upper bound on a directory execution.
The process flow of the new process scheduler is depicted in Fig. 4 and Fig. 5. Fig. 4 shows the process flow of the process scheduler, which searches for the execution resource with the highest priority. If a directory execution is selected in step (4) of the process (Fig. 4), the process scheduler searches the leaf executions of that directory execution. Fig. 5 shows the process flow when the directory execution is an execution resource with an upper bound. If a leaf execution resource is assigned a CPU time slot, the process illustrated in Fig. 5 completes successfully.
Evaluation
We investigated whether the proposed method can enforce an upper bound on the CPU resources of services. We performed a basic evaluation and an evaluation of a case involving an attack.
4.1 Basic Evaluation
An execution tree was constructed before the evaluation. This execution tree included three process groups (services A, B, and C). Each process group involved three processes. Table 2 shows the performance and priority of the execution resource of each process group in the execution tree. The execution resource
(Steps in Fig. 5: (4) change the execution to the SUSPEND state; (5) if the counter equals the time-slice number, (6) change it to the READY state, reset the counter, remove it from the top of the priority queue, and insert it at the end of the queue.)
Fig. 5. Process flow when the directory execution with an upper bound is found
We evaluated the processing time of a normal service A (SA) and an attacking service B (SB). SB tries to obtain as much CPU time as possible. In this evaluation, SB was attached to a directory execution with an upper bound. Fig. 7 shows the behavior of the processing time as the number of processes in SB changes. The processing time of each service is plotted on the y-axis, and the number of processes in SB on the x-axis. The processing time of SA is constant because the upper bound of SB is restricted by the directory execution with an upper bound.
case   Service A (exec 1)   Service B (exec 2)   Service C (exec 3)
1      6                    6, MAX 100%          6, MAX 100%
2      6                    6, MAX 100%          6, MAX 75%
3      6                    6, MAX 100%          6, MAX 50%
4      6                    6, MAX 100%          6, MAX 25%
5      6                    6, MAX 50%           6, MAX 50%
6      6                    6, MAX 50%           6, MAX 25%
(Figure: CPU usage ratio, from 0% to 100%, of executions exec1-1 through exec3-3 in cases 1 to 6.)
Fig. 8 shows the processing time of each service when the upper bound of the execution resource attached to SB is changed. The upper bound was increased from 25% to 100%. As the proposed mechanism tightened the upper bound of SB, the processing time of SA decreased. These results show that the proposed mechanism can restrict CPU abuse caused by a malicious service.
Related Work
Fig. 7. Processing time of SA and SB as the number of processes in SB changes
Fig. 8. Processing time when the upper bound of the execution resource attached to
SB is changed
Execution resources with upper bounds are classified under resource accounting techniques [10]. The execution resource can control the maximum extent of CPU usage of programs to prevent abuse of CPU resources. A rate-limiting policy can be enforced for a CPU resource by using an access control mechanism on the execution resource. In addition, the proposed access control model can be applied to both general OSes and secure OSes.
Conclusion
References
1. CERT/CC Statistics (1988-2005), http://www.cert.org/stats/
2. Sekar, R., Bendre, M., Bollineni, P., Dhurjati, D.: A Fast Automaton-Based Method for Detecting Anomalous Program Behaviors. In: Proc. of IEEE Symposium on Security and Privacy, pp. 144–155 (2001)
3. Sandhu, R.S., Coyne, E.J., Feinstein, H.L., Youman, C.E.: Role-Based Access Control Models. IEEE Computer 29(2), 38–47 (1996)
4. Security-Enhanced Linux, http://www.nsa.gov/selinux/
5. Tabata, T., Hakomori, S., Yokoyama, K., Taniguchi, H.: Controlling CPU Usage for Processes with Execution Resource for Mitigating CPU DoS Attack. In: 2007 International Conference on Multimedia and Ubiquitous Engineering (MUE 2007), pp. 141–146 (2007)
6. Tabata, T., Hakomori, S., Yokoyama, K., Taniguchi, H.: A CPU Usage Control Mechanism for Processes with Execution Resource for Mitigating CPU DoS Attack. International Journal of Smart Home 1(2), 109–128 (2007)
7. Garg, A., Reddy, A.: Mitigation of DoS attacks through QoS regulation. In: IEEE International Workshop on Quality of Service (IWQoS), pp. 45–53 (2002)
8. Spatscheck, O., Peterson, L.L.: Defending Against Denial of Service Attacks in Scout. In: 3rd Symp. on Operating Systems Design and Implementation, pp. 59–72 (1999)
9. Banga, G., Druschel, P., Mogul, J.C.: Resource containers: A new facility for resource management in server systems. In: The Third Symposium on Operating Systems Design and Implementation (OSDI 1999), pp. 45–58 (1999)
10. Mirkovic, J., Reiher, P.: A taxonomy of DDoS attack and DDoS defense mechanisms. ACM SIGCOMM Comput. Commun. Rev. 34(2), 39–53 (2004)
1 Introduction
Nowadays, the Internet, which connects the whole world, is an ocean of information as well as an important communication medium that makes human life convenient. However, the Internet protocol was implemented simply, so by its nature it has many vulnerabilities with respect to defending against attacks. These days there are many attacks, such as attacking an anonymous server or an arbitrary node to degrade network resources or system performance by exploiting vulnerabilities of the Internet protocol. In particular, DoS and DDoS attacks are a major threat and the most difficult problem to solve among the many attacks. The purpose of a DoS/DDoS attack is to shut down an arbitrary node or a specific server, such as a DNS (Domain Name System) server, to interrupt normal services. In this case, attackers send enormous numbers of packets to a victim to exhaust the network resources or shut down the target system.
As mentioned above, the targets of DoS/DDoS attacks are network bandwidth and system resources. For system attacks, there are three kinds: TCP SYN attacks, UDP flood attacks, and ICMP flood attacks [1]. A TCP SYN attack creates half-open TCP connections, making the victim wait for a certain time after sending SYN/ACK to
the attacker. In a UDP flood attack, an attacker sends UDP packets to random ports of a victim; the victim decides that no specific application is listening and then sends an ICMP destination unreachable message to the attacker. In an ICMP flood attack, an attacker uses ICMP echo packets, normally used to check whether a host is alive, and the victim sends ICMP_ECHO_REPLY packets to the attacker. In all these cases, the victim sends messages such as SYN/ACK, ICMP destination unreachable, or ICMP_ECHO_REPLY to the attacker, who uses a spoofed IP address, so the victim cannot find the real origin of the attack.
The purpose of a network attack is to congest the communication channel of the ISP network and interrupt normal services by generating packet loss. In this case, the attacker sends a large stream of data into the ISP network. The ISP cannot distinguish attack packets from normal packets, so it drops both kinds with the same probability. If the amount of attack packets increases, the packet loss of normal packets also increases, and the ISP loses its service availability.
A DDoS attack is more difficult to prevent than a DoS attack because there are several distributed attackers. Moreover, it is very difficult to find the real origin of the attackers because DoS/DDoS attackers use spoofed IP addresses.
DoS/DDoS attacks are classified into five categories according to attack type: network device level attacks, OS level attacks, application level attacks, data flooding attacks, and protocol-feature-based attacks. There are various defense mechanisms according to attack type. In this paper, however, we focus on IP traceback after an attack to find the real origin of the attacker.
This paper explains in detail a new algorithm to find the real origin of an attacker in a DoS/DDoS attack environment. The rest of this paper is organized as follows. Chapter 2 describes related work and background. Chapter 3 describes the probabilistic route selection traceback algorithm we propose. Chapter 4 shows the validation of the PRST algorithm, and Chapter 5 presents the conclusions.
2 Related Work
IP traceback belongs to intrusion response among attack defense mechanisms [1]. Once an attack is detected, the next step is to find the attack origin and block the attack traffic. At present, manual control is used to block attack traffic. IP traceback is a method of tracing the attack origin, and many traceback algorithms have been proposed to date. IP traceback algorithms are classified into six categories [1].
Bellovin [2] proposed an ICMP traceback technique. In this scheme, when a router receives a packet for a destination, the router generates, with low probability, an ICMP traceback message called an iTrace packet. The iTrace packet contains the address of the router and is sent to the destination. The destination can reconstruct the route taken by the packets in the flow. In ICMP traceback, the router must have the capability to process iTrace messages, and the performance varies with the sampling probability.
Burch and Cheswick [3] proposed a link testing traceback technique. In this scheme, a route map of every network is made from the victim, and burst traffic is generated on each link using the UDP chargen service. If the loaded link is located on the attack path, the packet loss probability increases. This perturbs the attack traffic, so we can find the
real source. This scheme requires considerable knowledge of the network topology and the ability to generate huge traffic on any network link. Moreover, generating bursty traffic can itself degrade network performance.
Stone [7] proposed an overlay network architecture, called CenterTrack, to overcome the limitations of link testing traceback. As CenterTrack is an overlay-network-based traceback algorithm, a tracking router module is additionally installed in the network. When an attack occurs, CenterTrack forwards traffic information received from the network edge routers to the tracking router using a tunneling mechanism. The tracking router can reconstruct an attack path using the packet information collected by CenterTrack. In this scheme, the tracking router cannot manage a whole network, so it suits small networks and has the disadvantage that it cannot be applied to heterogeneous network environments.
Savage et al. [4] proposed a probabilistic packet marking technique, called PPM. In this scheme, a router samples the packets traversing it with an arbitrary probability p, marks its IP address into the incoming packet, and sends the packet toward its destination. When an attack occurs, this scheme reconstructs the real attack path using the router IP address information recorded in the packets. This scheme has the problem that many packets are required to reconstruct an attack path.
Snoeren et al. [5] proposed a hash-based IP traceback technique. Hash-based IP traceback divides the whole network into several groups, and each group has a SCAR agent that can manage the subnetwork and trace an attack path. In this case, each subgroup additionally has to have SPIE, SCAR, and DGA systems, and a SCAR needs memory to store and manage hash values of packets periodically.
Wang et al. [6] proposed an active-network-based intrusion response framework called Sleepy Watermark Tracing (SWT), which performs traceback by inserting a watermark into the response packet corresponding to an attack. The SWT architecture consists of two components, the SWT guardian gateway and the SWT guarded host. Each SWT guarded host has a unique SWT guardian gateway and maintains a pointer to it. Each SWT guardian gateway may guard one or more SWT guarded hosts. SWT operates as follows. When an attack occurs, the intrusion detection system in the guarded host detects it. The sleepy intrusion response module of the SWT subsystem in the guarded host is then activated, and a watermark-enabled application at the host marks the response packet corresponding to the incoming packet; SWT then sends the response packet with the watermark. Once traceback starts, SWT finds the watermark-inserted packets in cooperation with the active tracing module in the guardian gateway. This scheme can trace back quickly because it uses the response packets corresponding to attack packets to find the real origin of the attacker. However, it is hard to apply to the real Internet environment because it requires watermark-enabled applications. Moreover, it cannot trace the real origin of an attacker who uses an encrypted communication link.
3 Problem Solution
In this paper, we propose a probabilistic route selection traceback (PRST) algorithm to trace the real origin of an attacker. This scheme has three requirements:
1. The link status on attack paths changes when attacks occur.
2. Each intermediate router records the number of incoming and outgoing packets traversing each network interface card and calculates the Poisson packet forwarding probability continuously while the router is running.
3. The probabilistic packet forwarding table has to be stored in the router.
the edge router checks the cumulative Poisson probabilities in the row of the probabilistic packet forwarding table corresponding to the interface on which the agent packet arrived. The edge router then chooses the column with the highest Poisson probability in that row as the outgoing interface and forwards the agent packet to the next-hop router through that interface, following the PRST algorithm rather than the destination IP address. The intermediate router that receives the agent packet performs the same procedure. The agent packet finally reaches the attacker's edge, so we can find the origins of attackers by referring to this network traffic variation and the probabilistic packet forwarding table, as mentioned above. If more than one interface shares the same highest Poisson probability, the router copies the agent packet and forwards it through those interfaces simultaneously. The operation of the attacker's edge router is described in detail in subsection 3.3.
Equation 2 represents the cumulative Poisson packet forwarding probability from the i-th incoming interface to the o-th outgoing interface. λ_io represents the mean number of packets forwarded, based on the Poisson distribution, from the i-th incoming interface to the o-th outgoing interface. k_io represents the number of packets forwarded from the i-th incoming interface to the o-th outgoing interface. P_io represents the cumulative Poisson packet forwarding probability from the i-th interface to the o-th interface. For example, the cumulative Poisson packet forwarding probability from the 1st incoming interface to the 2nd outgoing interface is P_12.

    P_{io}(x = k_{io}) = \frac{e^{-\lambda_{io}} \lambda_{io}^{k_{io}}}{k_{io}!}    (2)

In equation 2, x is the random variable indicating the number of packets forwarded from the i-th incoming interface to the o-th outgoing interface.
Every router has its probabilistic packet forwarding table, and the format of the table is as shown in Figure 1.
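A minimal sketch of this per-router step, assuming the PPFT is laid out as incoming interface -> outgoing interface -> (mean, observed packet count); the cumulative probability follows equation 2, and the agent packet is forwarded on the interface(s) with the highest value:

```python
import math

def poisson_pmf(lam, k):
    """P(x = k) for a Poisson distribution with mean lam (equation 2)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def cumulative_poisson(lam, k):
    """P(x <= k): the cumulative probability stored in the PPFT."""
    return sum(poisson_pmf(lam, i) for i in range(k + 1))

def next_hop(ppft, in_if):
    """Pick the outgoing interface(s) with the highest cumulative
    probability in the row of the incoming interface. The PPFT layout
    ppft[in_if][out_if] = (lam, k) is an assumption of this sketch."""
    row = {o: cumulative_poisson(lam, k)
           for o, (lam, k) in ppft[in_if].items()}
    best = max(row.values())
    # On a tie, the router duplicates the agent packet (Section 3).
    return [o for o, p in row.items() if p == best]

# Illustrative table for one router: the agent packet arrived on
# interface 3; interface 1 forwarded far more packets than its mean.
ppft = {3: {1: (20, 35), 2: (20, 18)}}
print(next_hop(ppft, 3))   # -> [1], the interface on the attack path
```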
Let the total number of incoming packets to a certain router be K under normal network conditions. The K packets are forwarded through several interfaces j by the router. The numbers of packets forwarded through each interface other than the incoming interface are k_i1, k_i2, k_i3, ..., respectively, and their total is K. Suppose that after an attack begins, d additional packets are forwarded through interface 3. The resulting increase P_inc in the cumulative Poisson probability of interface 3 is given by equation 3:

    P_{inc} = \sum_{k=0}^{k_{i3}+d} \frac{e^{-\lambda_{i3}} \lambda_{i3}^{k}}{k!} - \sum_{k=0}^{k_{i3}} \frac{e^{-\lambda_{i3}} \lambda_{i3}^{k}}{k!}    (3)

The cumulative Poisson probability of interface 3 increases more than the cumulative Poisson probabilities of the other interfaces. This probability is stored in the probabilistic packet forwarding table. If the victim sends the agent packet to its edge router, the agent packet comes in on interface 3, and the router forwards it to the next-hop router by referring to the probabilistic packet forwarding table.
Generally, the Poisson distribution has the characteristic that the probability P(x = k) increases with k from 0 up to k ≈ λ and falls off beyond λ. In the case of an attack, the packet count of the attacked interface grows from about λ_io to λ_io + d_io. If d_io exceeds the normal fluctuation σ_io, then

    \sum_{k_{io}=0}^{\lambda_{io}+d_{io}} \frac{e^{-\lambda_{io}} \lambda_{io}^{k_{io}}}{k_{io}!} > \sum_{k_{io}=0}^{\lambda_{io}+\sigma_{io}} \frac{e^{-\lambda_{io}} \lambda_{io}^{k_{io}}}{k_{io}!}    (4)
To detect attacks, the intrusion detection algorithm uses the threshold of equation 5 [8]:

    T_{io} = \lambda_{io} + n\,\sigma_{io}    (5)

In equation 5, σ_io is the standard deviation of the traffic volume. If the measured traffic volume k_io is larger than T_io, the intrusion detection system can regard a DoS-based attack as occurring. In this case, if n = 3 and k_io > T_io, the cumulative Poisson probability is almost 1.
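The threshold test of equation 5 reduces to a single comparison; the traffic statistics below are illustrative:

```python
def is_dos_attack(k_io, lam_io, sigma_io, n=3):
    """Equation 5: flag interface (i, o) when k_io exceeds T_io."""
    return k_io > lam_io + n * sigma_io

print(is_dos_attack(k_io=35, lam_io=20, sigma_io=4))  # True: 35 > 32
```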
In this manner, the Poisson packet forwarding probability of the interface on the attack path has the highest value in the probabilistic packet forwarding table. Therefore, the agent packet can be forwarded to the attacker's real origin by referring to the probabilistic packet forwarding table.
5 Conclusion
In this paper we proposed the probabilistic route selection traceback (PRST) algorithm, which uses an agent packet, a reply agent packet, the probabilistic packet forwarding table, and the Poisson distribution. Using the PRST algorithm, we can find the attacker's real origin with only a few packets: some agent packets and a reply agent packet.
The PRST algorithm runs on distributed routers, referring to the probabilistic packet forwarding table (PPFT), which is stored and managed by the routers. Moreover, we validated the PRST algorithm by a mathematical approach based on the Poisson distribution, which has the characteristic that if k > λ, the cumulative Poisson probability is always higher than in normal cases, where k is near λ. Therefore, the cumulative Poisson packet forwarding probability of the attacked interface is always higher than that of the other interfaces after an attack occurs.
Using this characteristic, the agent packet can be forwarded to the attacker's edge router, which is the attacker's real origin.
The PRST algorithm makes three major contributions. It is efficient in terms of system modification, finding the origin of attackers, and the number of packets needed to find that origin. First, the algorithm can be implemented in software rather than hardware, so it can be deployed on intermediate routers without hardware changes. Second, a DDoS attack produces many attack paths, so the attack path itself is less important than finding the origin of the attackers; we therefore focus on finding the attackers' origin. Third, the PRST algorithm does not need many packets to find the origin of attackers, since it uses an agent packet and a reply agent packet.
Acknowledgement
This research was supported by the MKE (Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2010-(C1090-1031-0005)).
References
[1] Douligeris, C., Serpanos, D.N.: Network Security. IEEE Press, Los Alamitos (2007)
[2] Bellovin, S.: The ICMP traceback message. Network Working Group, Internet draft (March 2000)
[3] Burch, H., Cheswick, B.: Tracing anonymous packets to their approximate source. In: Proceedings of the USENIX LISA Conference, pp. 319–327 (2000)
[4] Savage, S., Wetherall, D., Karlin, A., Anderson, T.: Network Support for IP Traceback. IEEE/ACM Transactions on Networking, 226–237 (2001)
[5] Snoeren, A.C., Partridge, C., Sanchez, L.A., Jones, C.E., Tchakountio, F., Kent, S.T., Strayer, W.T.: Hash-Based IP Traceback. In: Proceedings of the ACM SIGCOMM 2001 Conference on Applications, Technologies, Architectures and Protocols for Computer Communication, pp. 3–14. ACM Press, New York (2001)
[6] Wang, X., Reeves, D.S., Wu, S.F., Yuill, J.: Sleepy watermark tracing: An active network-based intrusion response framework. In: Proceedings of the Sixteenth International Conference on Information Security (IFIP/SEC 2001), Paris (June 2001)
[7] Stone, R.: CenterTrack: An IP overlay network for tracking DoS floods. In: Proceedings of the Ninth USENIX Security Symposium, pp. 199–212 (2000)
[8] Lee, J., Yoon, M., Lee, H.: Monitoring and Investigation of DoS Attack. KNOM Review 6(2), 33–40 (2004)
Abstract. Digital Watermarking plays an important role for copyright protection of multimedia data. This paper proposes a new watermarking system in
frequency domain for copyright protection of digital audio. In our proposed watermarking system, the original audio is segmented into non-overlapping
frames. Watermarks are then embedded into the selected prominent peaks in the
magnitude spectrum of each frame. Watermarks are extracted by performing the
inverse operation of watermark embedding process. Simulation results indicate
that the proposed watermarking system is highly robust against various kinds of
attacks such as noise addition, cropping, re-sampling, re-quantization, MP3
compression, and low-pass filtering. Our proposed watermarking system outperforms Cox's method in terms of imperceptibility, while keeping comparable
robustness with the Cox's method. Our proposed system achieves SNR (signalto-noise ratio) values ranging from 20 dB to 28 dB, in contrast to Cox's method
which achieves SNR values ranging from only 14 dB to 23 dB.
Keywords: Copyright Protection, Digital Watermarking, Sound Contents, Fast Fourier Transform.
1 Introduction
Digital watermarking has drawn extensive attention for the copyright protection of multimedia data. Digital audio watermarking is the process of embedding watermarks into an audio signal to show authenticity and ownership. Audio watermarking should meet the following requirements: (a) imperceptibility: the digital watermark should not affect the quality of the original audio signal after it is embedded; (b) robustness: the embedded watermark should not be removable by unauthorized distributors using common signal processing operations and attacks; (c) capacity: the number of bits that can be embedded into the audio signal within a unit of time; (d) security: the watermark should be detectable only by authorized persons. These requirements are often contradictory. Since robustness and imperceptibility are the most important requirements for digital audio watermarking, they should be satisfied first.
* Corresponding author.
In this paper, we propose a new frequency-domain watermarking system for audio copyright protection. The watermarks are embedded into selected prominent peaks of the magnitude spectrum of each non-overlapping frame. Experimental results indicate that the proposed system provides robustness similar to Cox's method [8] against several kinds of attacks, such as noise addition, cropping, re-sampling, re-quantization, MP3 compression, and low-pass filtering, while outperforming it in terms of imperceptibility: the proposed system achieves SNR values from 20 dB to 28 dB, whereas Cox's method achieves only 14 dB to 23 dB.
The rest of this paper is organized as follows. Section 2 briefly reviews related research, including Cox's method. Section 3 introduces the proposed watermarking system, including the watermark embedding and detection processes. Section 4 compares the performance of the proposed system with Cox's method in terms of imperceptibility as well as robustness. Section 5 concludes the paper.
2 Related Research
A significant number of techniques that create robust and imperceptible audio watermarks have been reported in recent years. Lie et al. [1] proposed a method of embedding watermarks into audio signals in the time domain that exploits differential average-of-absolute-amplitude relationships within each group of audio samples to represent a single bit of information. It also utilizes a low-frequency amplitude modification technique to scale amplitudes in selected sections of samples so that the time domain waveform envelope can be well preserved. In [3], the authors propose a blind audio watermarking system that embeds watermarks into the audio signal in the time domain; the strength of the signal modifications is limited by the necessity of keeping the output signal suitable for watermark detection. The watermark signal is generated using a key, and watermark insertion depends on the amplitude and frequency of the audio signal in a way that minimizes the audibility of the watermark signal. Xie et al. [4] introduce a watermarking scheme based on the nonuniform discrete Fourier transform (NDFT), in which the frequency points for embedding the watermark are selected by a secret key. Zeng et al. [5] describe a blind watermarking system that embeds watermarks into discrete cosine transform (DCT) coefficients by utilizing a quantization index modulation technique. In [6], the authors propose a watermarking system that embeds synchronization signals in the time domain to resist several types of attack. Pooyan et al. [7] introduce an audio watermarking system that embeds watermarks in the wavelet domain: the watermark data is encrypted, combined with a synchronization code, and embedded into the low frequency coefficients of the sound in the wavelet domain. The magnitude of the quantization step and its embedding strength are adaptively determined according to the characteristics of the human auditory system.
In Cox's method [8] watermarks are embedded into the highest n DCT coefficients
of a whole sound excluding the DC component according to the following equation:
vi' = vi (1 + α xi)   (1)
106
where vi is a magnitude coefficient into which a watermark is embedded, xi is the watermark value to be inserted into vi, α is a scaling factor, and vi' is the adjusted magnitude coefficient. The watermark sequence is extracted by performing the inverse operation of (1), represented by the following equation:

xi* = (vi*/vi - 1) / α   (2)
Cox's method provides good results in terms of robustness. However, this method cannot achieve good imperceptibility in terms of signal-to-noise ratio (SNR) because it embeds watermarks into the highest DCT components of the sound, which sometimes affects the quality of the sound. To overcome this problem, we propose a new watermarking system which embeds watermarks into selected prominent peaks of the magnitude spectrum of each non-overlapping frame. This provides better results than Cox's method in terms of SNR for watermarked audio signals, while keeping comparable robustness with Cox's method against several kinds of attacks.
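To make the multiplicative rule concrete, the following sketch implements equations (1) and (2) in Python. It is an illustration rather than the authors' implementation: the selection of the largest-magnitude DCT coefficients (skipping the DC term) follows the description above, while the function names and the use of SciPy's DCT are our own assumptions.

import numpy as np
from scipy.fft import dct, idct

def cox_embed(signal, watermark, alpha=0.1):
    """Cox-style embedding: vi' = vi (1 + alpha * xi) applied to the
    largest-magnitude DCT coefficients, excluding the DC component."""
    coeffs = dct(signal, norm='ortho')
    order = np.argsort(np.abs(coeffs[1:]))[::-1] + 1   # skip index 0 (DC)
    idx = order[:len(watermark)]
    coeffs[idx] *= 1.0 + alpha * np.asarray(watermark)
    return idct(coeffs, norm='ortho'), idx

def cox_extract(original, watermarked, idx, alpha=0.1):
    """Inverse operation of (1): xi* = (vi*/vi - 1) / alpha."""
    v = dct(original, norm='ortho')[idx]
    v_star = dct(watermarked, norm='ortho')[idx]
    return (v_star / v - 1.0) / alpha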
(Figures: block diagrams of the watermark embedding and detection processes, showing the original audio signal, the watermark, the pre-embedding process, the attacked watermarked audio signal, and the extracted watermark.)
By considering a frame size of 512 samples, we have 403 frames for each audio sample. From each frame we detect 10 peaks to embed the watermark. Thus, the length of the watermark sequence is 10 × 403 = 4,030.
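The frame segmentation and peak selection just described can be sketched in Python as follows. This is an illustrative reconstruction (the paper gives no code): scipy.signal.find_peaks stands in for the authors' peak detection algorithm, and ranking peaks by prominence is our assumption.

import numpy as np
from scipy.signal import find_peaks

FRAME = 512            # samples per non-overlapping frame
PEAKS_PER_FRAME = 10   # watermark values carried by each frame

def select_peaks(audio):
    """Return (frame, bin) locations of the 10 most prominent magnitude
    spectrum peaks in every non-overlapping 512-sample frame."""
    n_frames = len(audio) // FRAME          # e.g. 403 frames -> 4,030 slots
    locations = []
    for f in range(n_frames):
        frame = audio[f * FRAME:(f + 1) * FRAME]
        mag = np.abs(np.fft.rfft(frame))
        peaks, props = find_peaks(mag, prominence=0.0)
        top = peaks[np.argsort(props['prominences'])[::-1][:PEAKS_PER_FRAME]]
        locations.extend((f, int(b)) for b in top)
    return locations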
In order to evaluate the performance of the proposed watermarking system, the correlation coefficient between the original watermark X and the extracted watermark X* is calculated by the following similarity formula:

SIM(X, X*) = (X · X*) / √(X* · X*)   (3)
Fig. 3. Imperceptibility of the watermarked audio using the proposed method: (a) original human voice sound, (b) watermarked human voice sound, (c) difference between the original and watermarked human voice sounds
In order to evaluate the quality of the watermarked signal, the following signal-to-noise ratio (SNR) equation is used:

SNR = 10 log10 [ Σ_{n=1}^{N} S(n)² / Σ_{n=1}^{N} (S(n) - S*(n))² ]   (4)
where S(n) and S*(n) are the original audio signal and the watermarked audio signal, respectively. After embedding the watermark, the SNRs of all selected audio signals using the proposed method are above 20 dB, which ensures the imperceptibility of our proposed system.
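Equations (3) and (4) translate directly into code; a minimal sketch:

import numpy as np

def sim(x, x_star):
    """Similarity between original and extracted watermarks, eq. (3)."""
    return np.dot(x, x_star) / np.sqrt(np.dot(x_star, x_star))

def snr_db(s, s_star):
    """Signal-to-noise ratio of the watermarked signal in dB, eq. (4)."""
    noise = s - s_star
    return 10.0 * np.log10(np.sum(s ** 2) / np.sum(noise ** 2))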
Figure 4 shows the peak detection of the first frame of the human voice sound. In our proposed system, watermarks are embedded into the selected prominent peaks of the magnitude spectrum of each frame, which provides high robustness against different kinds of attacks as well as good SNR values for watermarked audio signals.
In Cox's method, on the other hand, watermarks are embedded into the highest DCT coefficients of the whole sound excluding the DC component. Table 1 shows the SNR comparison between the proposed watermarking system and Cox's method for different values of α. Our proposed system achieves SNR values ranging from 20 dB to 28 dB for different watermarked sounds. This is in contrast to Cox's method, which achieves SNR values ranging from only 14 dB to 23 dB. In other words, our proposed watermarking system provides 6 dB higher SNR values than Cox's method for different watermarked sounds. Thus, our proposed watermarking system outperforms Cox's method in terms of imperceptibility.
Fig. 4. Peak detection of the first frame of the human voice sound (magnitude spectrum vs. frequency)
Similarity (SIM) results of the proposed system and Cox's method without attack:

Types of signal    Proposed    Cox
Let it Be          64.6345     64.9933
Symphony No 5      64.4371     64.9933
Hey Jude           64.5267     64.9932
Human Voice        64.2483     64.9933
Attacks and their descriptions used in this study to test the watermarked sound:

Noise addition: additive white Gaussian noise (AWGN) is added to the watermarked audio signal.
Cropping: 10% of the samples are removed from the beginning of the watermarked signal and then replaced by the corresponding samples of the original signal.
Re-sampling: the watermarked signal, originally sampled at 44.1 kHz, is re-sampled at 22.050 kHz, and then restored by sampling again at 44.1 kHz.
Re-quantization: the 16-bit watermarked audio signal is quantized down to 8 bits/sample and re-quantized back to 16 bits/sample.
MP3 compression: MPEG-1 Layer 3 compression at 128 kbps is applied to the watermarked signal.
Low-pass filtering: the low-pass filter used in this study is a second-order Butterworth filter with cut-off frequency 10 kHz.
Figures 5 and 6 show the response of the watermark detector to 1000 randomly generated watermarks, where the correct watermark is at the 500th position, against Gaussian noise attack for α = 0.3 using the proposed system and Cox's method.
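This detector-response experiment can be reproduced schematically as below: the SIM value of the correct watermark should stand out sharply among the 1000 candidates. The sketch is illustrative; drawing the random candidate watermarks from a standard normal distribution is our assumption.

import numpy as np

def detector_response(x_star, true_watermark, n=1000, correct_pos=500, seed=0):
    """SIM of the extracted watermark x_star against n candidate watermarks,
    with the true watermark placed at `correct_pos` (cf. Figs. 5 and 6)."""
    rng = np.random.default_rng(seed)
    denom = np.sqrt(np.dot(x_star, x_star))
    responses = []
    for i in range(1, n + 1):
        x = true_watermark if i == correct_pos else rng.standard_normal(len(x_star))
        responses.append(np.dot(x, x_star) / denom)   # eq. (3)
    return responses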
Fig. 5. Watermark detector response against Gaussian noise attack using the proposed method: (a) Let it Be, (b) Symphony No. 5, (c) Hey Jude, (d) Human Voice
Table 4 shows the similarity results of the proposed scheme and Cox's method in terms of robustness against several kinds of attacks applied to four different types of watermarked audio signal (Let it Be, Symphony No 5, Hey Jude, and human voice, respectively) for α = 0.3.
Overall, our proposed watermarking system outperforms Cox's method in terms of SNR values while keeping comparable robustness with Cox's method against several attacks such as noise addition, cropping, re-sampling, re-quantization, and MP3 compression.
112
Fig. 6. Watermark detector response against Gaussian noise attack using Cox's method: (a) Let it Be, (b) Symphony No. 5, (c) Hey Jude, (d) Human Voice
Table 4. Similarity results of the proposed system and Cox's method against different attacks (α = 0.3)

Types of Attack     Types of Signal    Cox       Proposed
Noise addition      Let it Be          62.937    62.538
                    Symphony No 5      62.942    61.424
                    Hey Jude           61.413    61.725
                    Human Voice        63.923    62.735
Cropping            Let it Be          60.462    59.372
                    Symphony No 5      59.266    61.478
                    Hey Jude           59.269    60.348
                    Human Voice        61.372    59.672
Re-sampling         Let it Be          64.992    63.846
                    Symphony No 5      64.993    63.936
                    Hey Jude           64.993    62.865
                    Human Voice        62.893    61.784
Re-quantization     Let it Be          64.993    63.983
                    Symphony No 5      62.876    60.795
                    Hey Jude           62.876    60.795
                    Human Voice        64.637    62.538
MP3 compression     Let it Be          63.927    62.357
                    Symphony No 5      62.268    60.674
                    Hey Jude           63.945    62.213
                    Human Voice        62.535    61.385
Low-pass filtering  Let it Be          61.826    61.436
                    Symphony No 5      59.883    61.215
                    Hey Jude           60.527    61.357
                    Human Voice        59.927    61.727
5 Conclusion
In this paper, we have presented a new watermarking algorithm in the frequency domain for audio copyright protection. Experimental results indicate that our proposed watermarking system shows better results than Cox's method in terms of imperceptibility while keeping comparable robustness against several kinds of attacks such as noise addition, cropping, re-sampling, re-quantization, and MP3 compression. Our proposed method achieves SNR values ranging from 20 dB to 28 dB for different watermarked sounds. This is in contrast to Cox's method, which achieves SNR values ranging from only 14 dB to 23 dB. These results demonstrate that our proposed watermarking system can be a suitable candidate for audio copyright protection.
Acknowledgement
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. R01-2008-000-20493-0 and No. 2010-0010863).
References
1. Lie, W.N., Chang, L.C.: Robust and High-Quality Time-Domain Audio Watermarking Based on Low-Frequency Amplitude Modification. IEEE Transactions on Multimedia 8(1), 46–59 (2006)
2. Xiang, S., Huang, Z.: Histogram-based audio watermarking against time-scale modification and cropping attacks. IEEE Transactions on Multimedia 9(7), 1357–1372 (2007)
3. Bassia, P., Pitas, I., Nikolaidis, N.: Robust Audio Watermarking in the Time Domain. IEEE Transactions on Multimedia 3(2), 232–241 (2001)
4. Xie, L., Zhang, J., He, H.: Robust Audio Watermarking Scheme Based on Nonuniform Discrete Fourier Transform. In: IEEE International Conference on Engineering of Intelligent Systems, pp. 1–5 (2006)
5. Zeng, G., Qiu, Z.: Audio Watermarking in DCT: Embedding Strategy and Algorithm. In: 9th International Conference on Signal Processing (ICSP 2008), pp. 2193–2196 (2008)
6. Huang, J., Wang, Y., Shi, Y.Q.: A Blind Audio Watermarking Algorithm with Self-Synchronization. In: IEEE International Symposium on Circuits and Systems (ISCAS 2002), vol. 3, pp. 627–630 (2002)
7. Pooyan, M., Delforouzi, A.: Adaptive and Robust Audio Watermarking in Wavelet Domain. In: International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2007), vol. 2, pp. 287–290 (2007)
8. Cox, I., Kilian, J., Leighton, F., Shamoon, T.: Secure Spread Spectrum Watermarking for Multimedia. IEEE Transactions on Image Processing 6(12), 1673–1687 (1997)
Abstract. Digital watermarking has been widely used for protecting digital contents from unauthorized duplication. This paper proposes a new watermarking scheme based on spectral modeling synthesis (SMS) for copyright protection of digital contents. SMS defines a sound as a combination of deterministic events plus a stochastic component, which makes it possible for a synthesized sound to attain all of the perceptual characteristics of the original sound. In our proposed scheme, watermarks are embedded into the most prominent peak of the magnitude spectrum of each non-overlapping frame in the peak trajectories. Simulation results indicate that the proposed watermarking scheme is highly robust against various kinds of attacks such as noise addition, cropping, re-sampling, re-quantization, and MP3 compression, and achieves similarity values ranging from 17 to 22. In addition, our proposed scheme achieves signal-to-noise ratio (SNR) values ranging from 29 dB to 30 dB.
Keywords: Digital watermarking, copyright protection, spectral modeling synthesis, digital contents.
1 Introduction
Recent years have seen rapid growth in the availability of digital media. A major problem faced by content providers and owners is the protection of their material. Digital audio watermarking, which is the process of embedding watermarks into an audio signal to demonstrate authenticity and ownership, has drawn extensive attention for copyright protection of audio data [1]. Audio watermarking schemes should meet the following requirements: (a) Imperceptibility: the digital watermark should not affect the quality of the original audio signal; (b) Robustness: unauthorized distributors should not be able to remove or eliminate the embedded watermark data using common signal processing operations; (c) Capacity: the number of bits that can be embedded into the audio signal within a unit of time should be sufficient; (d) Security: the watermark should only be detectable by an authorized person. These requirements are often
contradictory. Since robustness and imperceptibility are the most important requirements for digital audio watermarking, these should be satisfied first.
In this paper, we propose a new watermarking scheme based on spectral modeling
synthesis (SMS) [8-9] for audio copyright protection. SMS extracts synthesis parameters out of real sounds using analysis procedures to reproduce and modify the original
sounds. This approach models sounds as stable sinusoids (partials) plus noise (residual components) to analyze sounds and generate new sounds. The analytic procedure
detects partials by utilizing the time-varying spectral characteristics of a sound, and
represents them as time-varying sinusoids [10]. These partials are then subtracted
from the original sound, and the remaining residual is represented as a time-varying filtered white noise component. The synthesis procedure is a combination of additive synthesis for the sinusoidal part and subtractive synthesis for the noise part [9]. In our proposed watermarking scheme, the original audio is segmented into non-overlapping frames and the fast Fourier transform (FFT) is applied to each frame. Prominent spectral peaks of each frame are identified and removed to calculate the residual
spectrum. The residual component is computed by transforming the spectrum back to
the time domain using inverse FFT, and then adding the non-overlapping frames in
time. In addition, a peak tracking unit links peaks across frames to form trajectories.
Watermarks are then embedded into the most prominent peak of the magnitude spectrum of each non-overlapping frame in the peak trajectories. Each sinusoidal component is computed by sinusoidal synthesis. Finally, the watermarked signal is computed
by adding the sinusoidal component to the residual component in the time domain.
Watermarks are detected by the inverse operation of the watermark embedding process. Experimental results indicate that the proposed watermarking scheme shows
strong robustness against several kinds of attacks including noise addition, cropping,
re-sampling, re-quantization, and MP3 compression. In addition, our proposed
scheme achieves signal-to-noise ratio (SNR) values ranging from 29 dB to 30 dB.
The rest of this paper is organized as follows. Section 2 provides a brief description
of previous works related to audio watermarking. Section 3 introduces our proposed
watermarking scheme, including the watermark embedding process and watermark
detection process. Section 4 discusses the performance of our proposed scheme in
terms of imperceptibility as well as robustness. Finally, section 5 concludes this paper.
2 Previous Works
A significant number of watermarking techniques have been reported in recent years
in order to create robust and imperceptible audio watermarks. Some methods embed
the watermark in the time domain of the audio signal [1-2]. Other watermarking techniques use transform methods, such as Discrete Fourier Transform (DFT) [3], Discrete Cosine Transform (DCT) [4-5], or Discrete Wavelet Transform (DWT) [6] to
embed the watermark.
In Cox's method [7] watermarks are embedded into the highest n DCT coefficients
of a whole sound excluding the DC component according to the following equation:
vi' = vi (1 + α xi)   (1)
where vi is a magnitude coefficient into which a watermark is embedded, xi is the watermark value to be inserted into vi, α is a scaling factor, and vi' is the adjusted magnitude coefficient. The watermark sequence is extracted by performing the inverse operation of (1), represented by the following equation:

xi* = (vi*/vi - 1) / α   (2)
Fig. 1. Block diagram of the SMS analysis process: the windowed sound is transformed by the FFT into magnitude and phase spectra; peak detection, pitch detection, and peak continuation yield the deterministic frequencies, magnitudes, and phases, while spectral fitting of the residual signal yields the stochastic coefficients and magnitudes
Figure 2 shows a block diagram of the SMS synthesis process. The deterministic component (sinusoidal component) is calculated from the frequency and magnitude trajectories. The synthesized stochastic signal is a noise signal with a time-varying spectral shape obtained in the analysis (i.e., subtractive synthesis). It can be implemented by a convolution in the time domain, or by building a complex spectrum for every spectral envelope of the residual and applying an inverse FFT in the frequency domain.
Fig. 2. Block diagram of the SMS synthesis process: under user controls, the deterministic frequencies and magnitudes pass through musical transformation and additive synthesis to form the deterministic component, while the stochastic magnitudes and coefficients drive noise generation and subtractive synthesis to form the stochastic component; the two components are summed into the synthesized sound
(Figure: block diagram of the proposed watermark embedding process; the original sound is segmented into non-overlapping frames, FFT, peak detection, and peak continuation are applied, watermarks are embedded as vi' = vi (1 + α xi), and the watermarked sound is obtained via IFFT, additive synthesis, and non-overlap-add (NOLA).)
Step 1: The original audio is segmented into non-overlapping frames, and the FFT is applied to each frame of the selected audio signal Let it Be. Figure 5 shows the magnitude and phase spectrum of the selected frame for the original audio signal.
Step 2: Prominent spectral peaks are detected from the frequency spectrum of each
frame using a peak detection algorithm. Figure 6 shows the peak detection of the selected frame for the original audio signal Let it Be.
Step 3: The peak tracking algorithm connects every peak in the ith frame to the (i+1)th
frame to form trajectories. Figure 7 shows the peak tracking of the original audio signal
Let it Be.
Step 4: The residual component is computed by removing all the prominent peaks from
the spectrum, transforming the spectrum back to the time domain by using inverse FFT
(IFFT), and then nonoverlap-adding (NOLA) the frames in the time domain. Figure 8
shows the residual component of the audio signal Let it Be.
Step 5: Watermarks are embedded into the most prominent peak of each frame as shown in Figure 6 to obtain watermarked peaks V' = {v1', v2', v3', ..., vn'} using the following equation:

vi' = vi (1 + α xi)   (6)
Step 6: Each sinusoidal (deterministic) component is computed by sinusoidal synthesis:

D(t) = Σ_{r=1}^{R} A_r(t) cos[θ_r(t)]   (7)

where A_r(t) and θ_r(t) are the amplitude and phase of the rth sinusoid, respectively. Figure 9 shows the time domain representation of the sinusoidal component for the audio signal Let it Be.
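Equation (7) is ordinary additive synthesis. A minimal sketch, assuming frame-rate amplitude and frequency trajectories that are linearly interpolated to the sample rate (the interpolation scheme is our assumption; the paper does not specify it):

import numpy as np

def additive_synthesis(amps, freqs, fs=44100, hop=512):
    """Synthesize D(t) = sum_r A_r(t) cos(theta_r(t)) from per-frame
    amplitude and frequency trajectories of shape [n_frames, n_partials].
    The phase theta_r(t) is accumulated from the instantaneous frequency."""
    n_frames, n_partials = amps.shape
    n = n_frames * hop
    t_frames = np.arange(n_frames) * hop
    t = np.arange(n)
    out = np.zeros(n)
    for r in range(n_partials):
        a = np.interp(t, t_frames, amps[:, r])    # sample-rate amplitude
        f = np.interp(t, t_frames, freqs[:, r])   # sample-rate frequency (Hz)
        theta = 2.0 * np.pi * np.cumsum(f) / fs   # integrated phase
        out += a * np.cos(theta)
    return out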
Fig. 5. Magnitude and phase spectrum of the selected frame of the original audio signal Let it
Be
Fig. 6. Peak detection of the selected frame of the original audio signal Let it Be
Step 7: Finally, the watermarked signal is computed by adding the sinusoidal component and the residual component in the time domain. Figure 8 shows the time domain
representation of the watermarked audio signal Let it Be.
Fig. 7. Peak tracking of the original audio signal Let it Be (frequency trajectories across frame numbers)
Fig. 8. Residual component of the audio signal Let it Be
Fig. 9. Time domain representation of the sinusoidal component for the audio signal Let it Be
(Figure: block diagram of the watermark detection process; the attacked watermarked sound is segmented into non-overlapping frames, FFT, peak detection, and peak continuation are applied, and watermarks are extracted as xi* = (vi*/vi - 1)/α.)
Watermarks are extracted by performing the inverse operation of the watermark embedding process:

xi* = (vi*/vi - 1) / α   (8)
Fig. 10. Imperceptibility of watermarked audio using the proposed scheme: (a) original audio signal Hey Jude, (b) watermarked audio signal Hey Jude, (c) difference between the original and watermarked audio signals
The similarity between the original watermark X and the extracted watermark X*, and the quality of the watermarked signal, are evaluated as before by

SIM(X, X*) = (X · X*) / √(X* · X*)   (9)

SNR = 10 log10 [ Σ_{n=1}^{N} S(n)² / Σ_{n=1}^{N} (S(n) - S*(n))² ]   (10)
where S(n) and S*(n) are the original audio signal and the watermarked audio signal, respectively. In this study, the selected scaling factor (α) value is 0.1 [7].
Table 1 shows the SNR results of the proposed scheme for the four selected watermarked sounds. Our proposed scheme achieves SNR values ranging from 29 dB to 30 dB for the different watermarked sounds.
Table 1. SNR results of the proposed scheme for different watermarked sounds

Types of signal    SNR (dB)
Let it Be          29.4109
Symphony No 5      30.6853
Hey Jude           29.7108
Human Voice        30.1859
Table 2 shows the similarity results of our proposed scheme when no attack is applied
to four different types of watermarked audio signals.
Table 2. Watermark detection results of the proposed scheme without attack

Types of signal    SIM
Let it Be          22.2559
Symphony No 5      21.9061
Hey Jude           22.1179
Human Voice        20.0611
In order to test the robustness of our proposed scheme, different types of attacks, summarized in Table 3, were performed on the watermarked audio signals.

Table 3. Attacks used in this study to test the watermarked sound

Noise addition: additive white Gaussian noise (AWGN) is added to the watermarked audio signal.
Cropping: 10% of the samples are removed from the beginning of the watermarked signal and then replaced by the corresponding samples of the original signal.
Re-sampling: the watermarked signal, originally sampled at 44.1 kHz, is re-sampled at 22.050 kHz, and then restored by sampling again at 44.1 kHz.
Re-quantization: the 16-bit watermarked audio signal is quantized down to 8 bits/sample and re-quantized back to 16 bits/sample.
MP3 compression: MPEG-1 Layer 3 compression at 128 kbps is applied to the watermarked signal.
Figure 11 shows the response of the watermark detector to 1000 randomly generated watermarks using the proposed scheme against cropping attack, where the correct watermark is at the 500th position, for different watermarked sounds.
Fig. 11. Watermark detector response against cropping attack using the proposed method:
(a) Let it Be, (b) Symphony No. 5, (c) Hey Jude, (d) Human Voice
Table 4 shows the similarity results of the proposed scheme in terms of robustness against several kinds of attacks applied to four different types of watermarked audio signal (Let it Be, Symphony No 5, Hey Jude, and human voice, respectively) for α = 0.1.
Table 4. Similarity results of the proposed scheme against different attacks (α = 0.1)

Types of Attack    Types of Signal    SIM
Noise addition     Let it Be          18.9603
                   Symphony No 5      18.0536
                   Hey Jude           13.3849
                   Human Voice        17.9699
Cropping           Let it Be          21.0315
                   Symphony No 5      20.1004
                   Hey Jude           21.1396
                   Human Voice        19.1142
Re-sampling        Let it Be          22.2551
                   Symphony No 5      21.5139
                   Hey Jude           19.3270
                   Human Voice        20.0213
Re-quantization    Let it Be          22.2378
                   Symphony No 5      21.2504
                   Hey Jude           21.7355
                   Human Voice        20.0142
MP3 compression    Let it Be          22.1518
                   Symphony No 5      21.4009
                   Hey Jude           22.0136
                   Human Voice        19.9082
6 Conclusion
In this paper, we have presented a new watermarking scheme based on SMS for audio copyright protection. Watermarks are embedded into the most prominent peaks of the magnitude spectrum of each non-overlapping frame in the peak trajectories. Experimental results indicate that our proposed watermarking scheme shows strong robustness against several kinds of attacks such as noise addition, cropping, re-sampling, re-quantization, and MP3 compression, and achieves similarity values ranging from 17 to 22. Moreover, our proposed scheme achieves SNR values ranging from 29 dB to 30 dB for different watermarked sounds. These results demonstrate that our proposed watermarking scheme can be a suitable candidate for audio copyright protection.
Acknowledgement
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. R01-2008-000-20493-0 and No. 2010-0010863).
References
1. Lie, W.N., Chang, L.C.: Robust and High-Quality Time-Domain Audio Watermarking Based on Low-Frequency Amplitude Modification. IEEE Transactions on Multimedia 8(1), 46–59 (2006)
2. Bassia, P., Pitas, I., Nikolaidis, N.: Robust Audio Watermarking in the Time Domain. IEEE Transactions on Multimedia 3(2), 232–241 (2001)
3. Xie, L., Zhang, J., He, H.: Robust Audio Watermarking Scheme Based on Nonuniform Discrete Fourier Transform. In: IEEE International Conference on Engineering of Intelligent Systems, pp. 1–5 (2006)
4. Zeng, G., Qiu, Z.: Audio Watermarking in DCT: Embedding Strategy and Algorithm. In: 9th International Conference on Signal Processing (ICSP 2008), pp. 2193–2196 (2008)
5. Huang, J., Wang, Y., Shi, Y.Q.: A Blind Audio Watermarking Algorithm with Self-Synchronization. In: IEEE International Symposium on Circuits and Systems (ISCAS 2002), vol. 3, pp. 627–630 (2002)
6. Pooyan, M., Delforouzi, A.: Adaptive and Robust Audio Watermarking in Wavelet Domain. In: International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2007), vol. 2, pp. 287–290 (2007)
7. Cox, I., Kilian, J., Leighton, F., Shamoon, T.: Secure Spread Spectrum Watermarking for Multimedia. IEEE Transactions on Image Processing 6(12), 1673–1687 (1997)
8. Serra, X., Smith, J.: Spectral modeling synthesis: A sound analysis/synthesis system based on a deterministic plus stochastic decomposition. Computer Music Journal 14(4), 12–24 (1990)
9. Serra, X.: Musical sound modeling with sinusoids plus noise. In: Roads, C., Pope, S., Picialli, A., De Poli, G. (eds.) Musical Signal Processing, pp. 91–122. Swets & Zeitlinger (1997)
10. Depalle, P., Garcia, G., Rodet, X.: Tracking of Partials for Additive Sound Synthesis Using Hidden Markov Models. In: IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 225–228 (1993)
1 Introduction
Reversible data hiding, also referred to as reversible watermarking, is a newly developed branch and an interesting topic in watermarking research [1, 2]. Suppose that the cover media for secret transmission of data are digital images. Reversibility means that, at the encoder, the user-defined data are embedded into the original image by some means, and, at the decoder, both the data and the original image can be recovered; that is, the extracted data and the recovered image are identical to their counterparts at the encoder. Therefore, how to design an effective algorithm for reversible data hiding is an interesting task in both research and application.
There are several requirements for designing a good reversible data hiding algorithm [3, 4, 5]. Three major requirements are: (a) the output image quality, called the imperceptibility; (b) the number of bits that can be hidden in the cover image, called the capacity; and (c) the overhead or side information that is necessary for performing data extraction at the decoder. These requirements are correlated with, or in conflict with, one another. For instance, embedding more capacity into the cover image leads to more deterioration of the output image quality, and hence a degraded result in imperceptibility. From the viewpoint of practical applications, less overhead is highly desirable. Therefore, developing a reversible data hiding algorithm that can hide more capacity, produce output images with acceptable imperceptibility by utilizing the characteristics of the original image, and generate as little overhead as possible is the major contribution of this paper.
This paper is organized as follows. In Sec. 2, we present fundamental descriptions
of conventional histogram-based reversible data hiding algorithm. Then, in Sec. 3, we
take the quadtree decomposition into consideration and look for the integration with
the algorithm presented in Sec. 2. Simulation results are demonstrated in Sec. 4,
which suggest the applicability of the algorithm and the integration proposed. Finally,
we conclude this paper in Sec. 5.
Step 3. Modify luminance values in the selected range. In the region between the max and zero points recorded in Step 2, the luminance values are altered in advance: all luminance values in the selected range are increased by 1.
Step 4. Embed the data. For the embedding of the binary watermark, if the watermark bit is 1, the luminance value is increased by 1; if the watermark bit is 0, it is decreased by 1.
In extracting both the hidden data and the original image, the following steps should be applied accordingly.
Step 1. Locate the selected range with the side information. Luminance values between the max and zero points are compared.
Step 2. Extract the hidden data relating to the original. Every pixel in the output image is scanned and examined sequentially to extract the data bits, corresponding to Step 3 of the embedding procedure.
Step 3. Obtain the original image. By moving the histogram back into its original form, the original content is recovered. Only the max point is required.
We can see that performing data hiding is simply a matter of shifting certain parts of the histogram of the image, and the luminance values of the max and zero points play an important role in making reversible data hiding possible.
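For illustration, the conventional scheme can be sketched as follows. The sketch follows the common formulation of Ni et al. [5] (shift the histogram bins strictly between the max point a and the zero point b up by one, then let pixels at the max point carry one bit each); it illustrates the idea rather than reproducing the exact steps above, and it assumes a < b and one bit per max-point pixel.

import numpy as np

def embed(img, bits):
    """Histogram-shifting reversible embedding in the spirit of [5].
    Assumes len(bits) equals the number of max-point pixels and that a
    zero point exists above the max point."""
    hist = np.bincount(img.ravel(), minlength=256)
    a = int(np.argmax(hist))                     # max point
    zeros = np.flatnonzero(hist == 0)
    b = int(zeros[zeros > a][0])                 # first zero point above a
    out = img.astype(np.int16)
    out[(out > a) & (out < b)] += 1              # shift the range (a, b) up
    carriers = np.flatnonzero(img.ravel() == a)  # pixels that carry the bits
    flat = out.ravel()
    for pos, bit in zip(carriers, bits):         # bit 1 -> a+1, bit 0 -> a
        flat[pos] += int(bit)
    return flat.reshape(img.shape).astype(np.uint8), (a, b)

def extract(marked, side):
    """Inverse of embed(): recover the hidden bits and the original image."""
    a, b = side
    flat = marked.astype(np.int16).ravel()
    bits = [int(v == a + 1) for v in flat if a <= v <= a + 1]
    flat[(flat > a) & (flat <= b)] -= 1          # undo the histogram shift
    return bits, flat.reshape(marked.shape).astype(np.uint8)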
2.2 Advantages and Drawbacks of the Conventional Scheme
The histogram-based reversible data hiding algorithm has the advantages of ease of implementation and the small amount of side information produced, as can be observed from the descriptions in Sec. 2.1. On the contrary, the limited amount of embedding capacity is the major drawback of this algorithm.
From the observations above, we are able to take the characteristics of the original image into consideration and try to increase the capacity at the expense of a somewhat degraded output image quality, while keeping the side information comparable to that of the conventional scheme. Therefore, we employ the concept of quadtree decomposition in the implementation of the histogram-based reversible data hiding algorithm.
Keep decomposing, if max(block) - min(block) > θ · T;
Stop, otherwise.   (1)

We set T = 255 and θ ∈ [0, 1]. The value of θ can be adjusted by the user. Thus, for smaller threshold values, more blocks will be produced, and the size of the block map B will grow accordingly.
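A quadtree split driven by the θ·T threshold can be sketched recursively as below. The homogeneity test on the block's luminance range is our reading of criterion (1), and the minimum block size of 16 × 16 matches the extreme case discussed later in this section.

import numpy as np

def quadtree(img, theta, T=255, min_size=16):
    """Return the (row, col, size) leaf blocks of the quadtree decomposition.
    A block is split while its luminance range exceeds theta*T and it is
    still larger than the minimum block size."""
    blocks = []

    def split(r, c, size):
        block = img[r:r + size, c:c + size]
        if size > min_size and int(block.max()) - int(block.min()) > theta * T:
            half = size // 2
            for dr in (0, half):
                for dc in (0, half):
                    split(r + dr, c + dc, half)
        else:
            blocks.append((r, c, size))

    split(0, 0, img.shape[0])   # assumes a square, power-of-two-sized image
    return blocks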
Fig. 1. Test materials in this paper. (a) The gray-level image airplane with size 512 × 512. (b) The quadtree structure for airplane, following the characteristics of the image.
Next, each square block in Fig. 1(b) can be regarded as a small image. From the descriptions in Sec. 2, we can see that after embedding with histogram-based reversible data hiding, the two luminance values for the max and zero points, namely, ai for the max point and bi for the zero point of the i-th block, each representable by 8 bits, serve as the side information. Without loss of generality, we assume that ai < bi for all i. In the extreme case when the original image is highly active, which implies that more blocks are necessary for decomposing the original image, and every block is a 16 × 16 block, there are (512 × 512)/(16 × 16) = 1024 blocks in total. We set the values of bi to the luminance value at the zero point of the whole image, b, to reduce the overhead. By doing so, at most 1024 × 8 + 8 = 8200 bits of side information for the block map B is produced.
Finally, at the third round, after embedding the user-defined information I at the second round, histogram-based embedding is performed on X', and the number of occurrences of the max point in X', a, should be at least 8200. If the max point in X' is incapable of embedding 8200 bits, we search for a luminance value c < a such that the occurrences of a and c together are greater than 8200. After the embedding of the block map B is performed, the two-byte side information, containing a and c, is transmitted to the decoder. That is, S = (a, c). Therefore, with our algorithm, very little overhead is needed for decoding. We can see that this amount of side information is comparable to those shown in the literature.
The goal of the extraction process is to recover both the original image X and the user-defined information I at the decoder, with the two-byte side information a and c. These are the reverse operations of the embedding process, and they can be outlined as follows.
Round 1. Perform the histogram-based reversible data extraction on X' with the side information S. Reconstruct the block map B and obtain the luminance value of the max point of each block, ai.
Round 2. Reconstruct the image X', which denotes the original image X with the user-defined information I embedded.
Round 3. Generate the user-defined information I and the original image X with the conventional histogram-based reversible data extraction.
4 Simulation Results
We perform the following simulations to evaluate our algorithm. All three requirements are evaluated in comparison with the conventional algorithm, including:
- the output image quality, represented by PSNR, after hiding the user-defined data;
- the capacity, represented in bits, of the user-defined data;
- the size of the overhead.
The first two requirements are easy to compare. For the third one, we find that in Fig. 1(b), the block sizes after performing quadtree decomposition are different. Therefore, the different block sizes compose the block map B for making data embedding and extraction possible. In order to make fair comparisons, we divide the original image into blocks with the fixed sizes of 512 × 512, 256 × 256, 128 × 128, 64 × 64, 32 × 32, and 16 × 16, respectively. In contrast with quadtree decomposition, these configurations have regular patterns, and only one block size needs to be included in the block map. Hence, except for the block size, only the luminance of the max point in each block should be included in the block map.
On the contrary, with quadtree decomposition, we may expect more overhead in the block map B because the side information of each block is composed of the block size and the luminance of the max point.
Table 1. Image quality and embedding capacity under a variety of block sizes (airplane)

Block size    Image quality (dB)    Capacity (bits)
512 × 512     57.48                 30448
256 × 256     51.51                 50844
128 × 128     50.37                 75147
64 × 64       49.96                 89979
32 × 32       50.00                 97293
16 × 16       49.78                 101475
Quadtree      49.97                 95712

Table 2. Simulation results with different test images (all of size 512 × 512)

                                                  airplane   F-16     Lena     pepper   tank     truck
Output image quality (dB)                         49.97      51.34    51.13    51.03    51.80    51.81
Capacity (bits)                                   95712      25682    14303    12934    15254    15381
Increase in capacity over the existing one (%)    214        188      384      335      61       67
MSE between original and recovered images         0.00       0.00     0.00     0.00     0.00     0.00
BER between embedded and extracted info (%)       0.00       0.00     0.00     0.00     0.00     0.00
In order to make fair comparisons, even though the block map with quadtree decomposition is larger than in the other cases, Round 3 of the embedding process embeds the block map, even though some degradation in output image quality can be expected. To embed the block map of Sec. 3.1 in the airplane image in Fig. 1, we choose a = 181 and c = 172 to serve as the side information, and the block map B can then be embedded.
Table 1 depicts the comparison between image quality and embedding capacity under a variety of block sizes. For the block size of 512 × 512, we see that the capacity is 30448 bits with a PSNR of 57.48 dB. These serve as a reference corresponding to the existing scheme in [5]. We can see that when the block size gets smaller, more capacity can be embedded at the expense of degraded quality. Because the image qualities under all the block sizes are larger than 48.13 dB, we can claim that these qualities are acceptable. We can see that with quadtree decomposition, the performance assessed by quality and capacity lies between the block sizes of 32 × 32 and 128 × 128. If we set the PSNR the same, we can see that with quadtree decomposition, our algorithm can hide more capacity than in the 64 × 64 case.
Table 2 presents the simulation results with different test images. The image sizes are all 512 × 512 for fair comparison. Regarding the image qualities, they are all more than 48.13 dB. The corresponding capacities are also provided, and we can see that the capacities are highly correlated with the characteristics of the original images, ranging from 12934 to 95712 bits. Next, the increases for the quadtree decomposition over the existing method are provided. For instance, for the airplane image, the capacities for the quadtree and the conventional method [5] are 95712 and 30448 bits, respectively. Therefore, we can easily calculate the increase in percentage by (95712/30448 - 1) × 100% = 214%.
At the decoder, after decoding with the two-byte side information, the block map can be produced, and then both the original image and the user-defined information can be extracted. To verify the reversibility of our algorithm, regarding the image itself, we can see that all the mean square errors (MSEs) are 0.00, meaning that the recovered images are identical to their original counterparts. On the other hand, for the user-defined information, we can see that the bit error rates (BERs) between the embedded and extracted information are all 0.00%, meaning that they are identical. Therefore, from the data shown in the bottom two rows in Table 2, we prove that our data hiding algorithm reaches the goal of reversibility.
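The reversibility check itself reduces to two short functions; a minimal sketch:

import numpy as np

def mse(original, recovered):
    """Mean square error between the original and recovered images."""
    d = original.astype(np.float64) - recovered.astype(np.float64)
    return float(np.mean(d ** 2))

def ber(embedded_bits, extracted_bits):
    """Bit error rate between embedded and extracted information, in %."""
    e, x = np.asarray(embedded_bits), np.asarray(extracted_bits)
    return 100.0 * float(np.mean(e != x))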
5 Conclusions
In this paper, we proposed a three-round reversible data hiding algorithm that exploits the characteristics of the original image. Quadtree decomposition is taken into account in order to increase the embedding capacity with acceptable output image quality. In our simulation results, we obtained more embedding capacity, acceptable output image quality, and a comparable amount of side information. Most importantly, we verified that the proposed algorithm reaches the goal of reversibility.
We have conducted simulations based on this easily implemented algorithm, which modifies the histogram of the original image. The other branch of reversible data hiding, based on modifying the differences between consecutive pixels [6], can also be taken into account for further improvement of our algorithm in the future.
Acknowledgments. The authors would like to thank National Science Council
(Taiwan, R.O.C) for supporting this paper under Grant No. NSC98-2221-E-390-017.
References
1. Pan, J.S., Huang, H.-C., Jain, L.C., Fang, W.C. (eds.): Intelligent Multimedia Data Hiding. Springer, Heidelberg (2007)
2. Huang, H.-C., Fang, W.C.: Metadata-Based Image Watermarking for Copyright Protection. Simulation Modelling Practice and Theory 18, 436–445 (2010)
3. Luo, L., Chen, Z., Chen, M., Zeng, X., Xiong, Z.: Reversible Image Watermarking Using Interpolation Technique. IEEE Trans. Inf. Forensics and Security 5, 187–193 (2010)
4. Sachnev, V., Kim, H.J., Nam, J., Suresh, S., Shi, Y.-Q.: Reversible Watermarking Algorithm Using Sorting and Prediction. IEEE Trans. Circuits Syst., Video Technol. 19, 989–999 (2009)
5. Ni, Z., Shi, Y.-Q., Ansari, N., Su, W.: Reversible Data Hiding. IEEE Trans. Circuits Syst., Video Technol. 16, 354–362 (2006)
6. Alattar, A.M.: Reversible Watermark Using the Difference Expansion of a Generalized Integer Transform. IEEE Trans. Image Process. 13, 1147–1156 (2004)
1 Introduction
System security is one of the most important issues in managing critical computing systems. In particular, malicious code can leak confidential information of individuals and enterprises and infect hosts. Malware abuses the incompleteness of system software and exploits the mistakes of system managers [1][2][3]. Thus, analyzing various malicious codes is essential to guarantee system security. However, the complexity of obfuscation techniques has increased continuously, and newly modified malicious codes are generated exponentially. Therefore, more efficient and accurate analysis methods are required for secure system environments.
Recently, malware analysis methods that utilize virtualization technology have been introduced, such as Win32 API hooking, emulators, and full virtualization. Win32 API hooking shows excellent performance in system call tracing, but cannot trace instructions and memory accesses. An emulator, which implements hardware functions in software, can trace instructions and memory accesses, but it is very slow and vulnerable to anti-debugging [4][5][6][7][8][9]. Full virtualization based on binary translation is faster than an emulator, but it is also vulnerable to anti-debugging. In contrast, full virtualization based on hardware-assisted VT shows better performance and escapes anti-debugging methods easily. Therefore, we propose an enhanced dynamic malware analysis system based on hardware-assisted VT.
2 Background
Dynamic code analysis methods are utilized for various purposes such as observing the behavior of processes, developing applications, and so on. We design and implement a dynamic code analysis system for analyzing malware. Therefore, in this section, we describe representative dynamic code analysis methods.
Because the emulation environment is isolated from the real system, the emulator is utilized for various purposes. The emulator provides a virtual computing system through software implementation alone, without hardware support. Linux and other operating systems can also be executed on the emulator without code modification. Representative emulators are QEMU and Bochs, and representative dynamic code analyzers based on the emulator are BitBlaze [10], Renovo [11], VMScope [12], and TTAnalyze [13]. However, the use of emulators is limited by their considerably slow execution speed.
Programs executing on a general operating system can access only general registers and the user memory region; other resources are used through system calls. The user-level dynamic code analyzer, on the other hand, manages the user context using shadow registers, shadow memory, and system call redirection, so it can monitor all user contexts without entering the kernel context. Representative user-level code analyzers are Valgrind and Pin [14][15]. Even though a user-level code analyzer is faster than an emulator, it needs binary translation and can be easily exposed to anti-debugging.
Dinaburg et al. proposed a machine-level dynamic code analyzer, Ether [16]. Ether is a Xen-based dynamic code analyzer using hardware-assisted functions. Utilizing the breakpoint mechanism of VAMPiRE, Ether supports memory tracing, system call tracing, and limited process tracing. However, Ether cannot analyze detailed context such as DLL loading and API call information.
3 MAS
In this section, we introduce MAS: a transparent malware analysis system based on hardware-assisted virtualization technology. MAS provides two analysis phases, a single step phase and a system call phase, each with special features. In the single step phase, MAS supplies detailed analysis results of whole-process behavior; this phase is especially useful for analyzing a specific process. By observing the behavior of a process only in its user context, MAS provides an efficient analysis environment. In the system call phase, MAS shows high performance close to that of the non-analysis case.
136
We set a target process which runs on a Windows guest OS. When the target process is switched in by the scheduler, MAS begins analysis of the target process. By invoking VMEXIT, MAS is able to analyze the whole guest context depending on its purpose. Each VMEXIT handler carries out its duty, such as handling a page fault, a general protection fault, or a debug exception. A core analyzer component interprets instructions and manages a process list which contains the target process names. The results of the analyses are recorded whenever needed.
3.2 Single Step Phase
In the single step phase, the target process is observed at the granularity of an instruction. For a complete single step analysis, MAS is designed to trigger VMEXIT when the guest OS accesses a control register, or when a debug exception is raised. The beginning of the single step is to find the target process among all processes executing on the guest OS. By the initialization setting, MAS invokes VMEXIT when the guest OS accesses the CR3 register for context switching. In this case, the VMEXIT handler searches for the current process name in the process list to confirm whether the current process is the target process. If it is, MAS performs the single step analysis. Therefore, by repeatedly checking context switches, we can select the process of interest. Fig. 2 shows the operation of the single step phase.
Once the target process is detected during the context switching procedure, MAS sets the trap flag of the EFLAGS register on the x86 CPU. If the trap flag is set, the guest OS raises a debug exception immediately after the next instruction execution. According to the initialization setting of MAS, VMEXIT occurs when the debug exception is raised. Then, the core analyzer interprets the current context, such as the process id, instruction, registers, stack, and memory. After logging the necessary information, the trap flag is set once again. Therefore, single stepping continues until a non-target process is switched in.
Furthermore, we can distinguish the current privilege level between the user level and the kernel level. Generally, it is required to monitor the user-level context during process execution for malware analysis. Malware frequently accesses the disk and connects to the network, and in many cases these kinds of jobs require kernel services. Malware which downloads new code from the internet may also require a network initialization procedure; during the network connection, establishing the connection must be completed within a certain period of time. If the connection time is delayed by the process monitoring procedure, it is difficult to analyze the behavior of the malware accurately. Therefore, our ring level detection scheme is a powerful way to monitor various malicious codes.
The ring level detection scheme is designed utilizing the general protection fault. By setting the guest machine to load an invalid descriptor during context switching, we cause the guest machine to invoke VMEXIT through a general protection fault. When user mode changes to kernel mode, and vice versa, VMEXIT is invoked; we then set the single step phase on or off accordingly. Thus, we can monitor only the user context.
Using single stepping, MAS supports several functions to meet the requirements of dynamic malware analysis. Basically, instruction analysis is supported. When an instruction accesses memory for reading or writing, the corresponding information is recorded in the logging file. Also, if an instruction calls an API, the arguments and return value of the API call are recorded in the logging file. Fig. 3 shows the algorithmic flow of the single step phase.
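The single step flow can be modeled abstractly as follows. This is a schematic Python model of the handler logic only; the real implementation lives inside the hypervisor (KVM with Intel VT-x, as stated in Sec. 5), and names such as VmExit and the event fields are illustrative assumptions, not an actual API.

TARGETS = {"malware.exe"}       # process list maintained by the core analyzer

class VmExit:
    """Illustrative VMEXIT event: exit reason plus a guest-context snapshot."""
    def __init__(self, reason, process, ring=3, context=None):
        self.reason, self.process, self.ring = reason, process, ring
        self.context = context or {}

def handle_vmexit(exit_event, state, log):
    """Schematic handler: CR3 writes select the target process; debug
    exceptions deliver one analyzed instruction; general protection faults
    signal user/kernel transitions (ring level detection)."""
    if exit_event.reason == "CR3_WRITE":              # context switch
        state["tracing"] = exit_event.process in TARGETS
        state["trap_flag"] = state["tracing"]         # arm single stepping
    elif exit_event.reason == "DEBUG_EXCEPTION" and state.get("tracing"):
        if state.get("user_mode", True):              # user context only
            log.append(exit_event.context)            # instruction, registers
        state["trap_flag"] = True                     # re-arm for next insn
    elif exit_event.reason == "GP_FAULT":             # ring level detection
        state["user_mode"] = (exit_event.ring == 3)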
Malware utilizes packing methods in order to force analyzers to take a long time to analyze the executing code. Using packing methods, malware can pursue its objective until the packed code is unpacked. Therefore, if packed malware is unpacked quickly, the propagation of the malware is minimized and the damage can be decreased.
Using MAS, we analyze malicious packed code by exploiting a basic feature of packing methods. Fundamentally, packing code copies code to a specific address in order to restore the original code; after unpacking, the restored code is executed at the original entry point. Namely, the unpacking time is the moment when written code is executed. Therefore, we monitor the packed malware using the single step phase, analyzing every instruction, memory access, and API call. In particular, we trace memory write operations. When there is a memory write operation, we record the corresponding values of EIP, the other registers, and the instruction. Then, when an instruction at a logged address is executed, the unpacking time is finally found. In this way, MAS supports an efficient malware analysis environment.
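The write-then-execute heuristic for locating the unpacking moment can be expressed on top of the per-instruction trace: log every memory-write target, and report the first instruction pointer that lands on a previously written address. A minimal sketch over an illustrative trace format (the (eip, writes) tuples are an assumption, not MAS's actual log format):

def find_unpack_point(trace):
    """Return the first instruction pointer that executes freshly written
    memory, i.e., the presumed unpacking time / original entry point.
    `trace` yields (eip, written_addresses) per executed instruction."""
    written = set()
    for eip, writes in trace:
        if eip in written:          # executing code the program itself wrote
            return eip
        written.update(writes)      # record memory-write targets
    return None

# Toy trace: the instruction at 0x1000 writes 0x2000, which then executes.
trace = [(0x1000, [0x2000]), (0x1001, []), (0x2000, [])]
assert find_unpack_point(trace) == 0x2000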
4.2 Performance Evaluation
Fig. 4. Performance comparison of the system call phase: normalized execution time of the analyzed (system call) mode relative to the baseline; per-benchmark values lie between 1.002 and 1.057
The purposes of our experiments are to evaluate the overhead of the system call phase and to demonstrate the effectiveness of the ring level detection scheme in the single step phase. In order to evaluate the performance overhead, we utilize the SPEC CPU 2006 benchmark, a CPU-intensive benchmark suite. We use 10 of the 29 benchmarks, which consist of 12 integer benchmarks and 17 floating point benchmarks. Fig. 4 shows the performance of the system call phase in the proposed system. The baseline is the normalized running time when no analysis phase is applied. The total overhead over the whole benchmark set is small, 1.33%, and even the higher overheads of cases that issue many system calls are lower than 6%.
In order to demonstrate the effectiveness of the ring level detection scheme in the single step phase, we illustrate the normalized execution time of representative GNU tools. Fig. 5 shows the performance of the single step phase in our system. The baseline is the general single step phase, in which the overall execution context, including both user context and kernel context, is analyzed. User-only is the analysis mode which analyzes only the user context through the ring level detection scheme. The execution times of tar, md5sum, gzip, and wget are reduced significantly; their performance gains over the baseline are 48%, 32%, 14%, and 42%, respectively. Overall, the proposed system achieves a 21% performance improvement.
Fig. 5. Performance comparison of the single step phase: normalized execution time of tar, md5sum, gzip, and wget under the baseline (User & Kernel) and User-only modes; the User-only values are 0.52, 0.68, 0.86, and 0.58, respectively
5 Conclusion
In this paper, we introduced an enhanced malware analysis system based on hardware-assisted virtualization technology. Our system provides two analysis phases: a single step phase and a system call phase. In the single step phase, detailed and various analyses are supported, such as instruction tracing, memory tracing, and API call tracing; the ring level detection scheme, in particular, strengthens the efficiency of this phase. The system call phase provides an even more efficient analysis environment. We implemented our system on the KVM hypervisor using Intel VT-x. Through typical malware analyses, we showed that our system provides an appropriate and efficient analysis environment. Our performance evaluations show that the system call phase of MAS has a small performance overhead, and that the single step phase based on the ring level detection scheme significantly improves analysis performance by excluding analysis of the kernel context.
Acknowledgement
This work is financially supported by the Ministry of Education, Science and Technology (MEST) and the Ministry of Knowledge Economy (MKE) through the fostering project of HUNIC.
References
1. Idika, N., Mathur, A.P.: A Survey of Malware Detection Techniques. Technical Report, Dept. of Computer Science, Purdue Univ. (2007)
2. Carvey, H.: Malware analysis for Windows administrators. Digital Investigation 2, 19–22 (2005)
3. Pfleeger, C.P., Pfleeger, S.L.: Security in Computing. Prentice Hall, Englewood Cliffs (2003)
4. Garfinkel, T., Adams, K., Warfield, A., Franklin, J.: Compatibility is Not Transparency: VMM Detection Myths and Realities. In: Proc. of the 11th Workshop on Hot Topics in Operating Systems (2007)
5. Ferrie, P.: Anti-unpacker tricks. In: CARO Workshop (2008)
6. Ferrie, P.: Attacks on Virtual Machines. In: AVAR Conf., pp. 128–143 (2006)
7. Liston, T., Skoudis, E.: On the Cutting Edge: Thwarting Virtual Machine Detection. SANS Internet Storm Center (2006)
8. Chen, X., Andersen, J., Mao, Z.M., Bailey, M., Nazario, J.: Towards an Understanding of Anti-virtualization and Anti-debugging Behavior in Modern Malware. In: DSN 2008, pp. 177–186 (2008)
9. Xu, M., Malyugin, V., Sheldon, J., Venkitachalam, G., Weissman, B.: ReTrace: Collecting Execution Trace with Virtual Machine Deterministic Replay. In: Proc. of the 2007 Workshop on Modeling, Benchmarking and Simulation (2007)
10. BitBlaze Binary Analysis Platform, http://bitblaze.cs.berkeley.edu
11. Kang, M.G., Poosankam, P., Yin, H.: Renovo: A Hidden Code Extractor for Packed Executables. In: Proc. of WORM (2007)
12. Jiang, X., Wang, X., Xu, D.: Stealthy Malware Detection Through VMM-Based Out-of-the-Box Semantic View Reconstruction. In: Proc. of CCS, pp. 128–138 (2007)
13. Bayer, U., Kruegel, C., Kirda, E.: TTAnalyze: A Tool for Analyzing Malware. In: Proc. of EICAR, pp. 180–192 (2006)
14. Valgrind: Instrumentation Framework for Building Dynamic Analysis Tools, http://valgrind.org
15. Pin: A Dynamic Binary Instrumentation Tool, http://pintool.org
16. Dinaburg, A., Royal, P., Sharif, M., Lee, W.: Ether: Malware Analysis via Hardware Virtualization Extensions. In: Proc. of ACM CCS (2008)
Abstract. In order to decrease information security threats caused by human-related vulnerabilities, an increased concentration on information security awareness and training is necessary. There are numerous information security awareness training delivery methods. The purpose of this study was to determine which delivery method is most successful in providing security awareness training. We conducted security awareness training using various delivery methods, such as text-based, game-based, and short video presentations, with the aim of determining users' preferred delivery methods. Our study suggests that a combination of delivery methods is better than any individual security awareness delivery method.
1 Introduction
As organisations of all sizes continue to depend on information technology to reduce costs and improve services, so has the likelihood of security risks that could damage or disable systems. Generally, organisations realise that information security is a critical facet of maintaining profitability and a competitive edge. However, organisations tend to be more concerned about vulnerability to external threats. As a result, there has been increased spending on information security specialists and technology-based solutions. However, recent research suggests that a substantial proportion of security incidents originate from inside the organisation [1]. Many insider problems stem from ignorance rather than malicious motivation, but this is equally dangerous because accidental failures can have large impacts. Thus, as organisations become more reliant on technology to achieve their business objectives, these mistakes become more critical and more costly.
The layers of technological defences can be as strong as possible, but information security is only as strong as its weakest link. Undeniably, technology-based information security solutions (e.g., firewalls, antivirus software, and intrusion detection systems) are a very important part of information security programs. Equally important, though, are human factors: user awareness and operator carelessness are as important as technical solutions [6]. It takes only a minor mishap from the people who access, use, administer, and maintain information resources to
Their study also suggests that while people can protect themselves from familiar risks, they tend to have difficulty generalizing what they know to unfamiliar risks.
Phishing attacks exploit the fact that users tend to trust email messages and web sites based on cues that actually provide little or no meaningful trust information. They tend to target the most common activities (email and web) on which the majority of users spend substantial time. Also, phishers are increasingly setting their sights on social networking sites such as LinkedIn, Myspace, Facebook, and Twitter. There is a high level of user unfamiliarity with common phishing attacks, which suggests the need to educate users about online safety practices [2]. Lack of awareness of the dangers of reverse social engineering attacks can result in an unsuspecting employee disclosing confidential company information. Although automated systems can be used to identify some fraudulent emails and web sites, these systems are not completely accurate in detecting phishing attacks [3].
legitimate sites [3]. Game-based delivery methods are highly interactive and can be used to support organisational security training objectives while engaging typical users. It is believed that the game-based awareness training delivery method offers an effective alternative to, or supplement for, more traditional modes of education [1].
Simulation-based information security awareness and training delivery methods have also been receiving some attention. In the simulation-based delivery model, users are sent simulated phishing emails by the experimenters to test their vulnerability to phishing attacks. At the end of the study, users are given materials that inform them about phishing attacks. A study that compared simulation-based and pamphlet-based delivery models concluded that users who were sent the simulated phishing emails and follow-up notification were better able to avoid subsequent phishing attacks than those who were given a pamphlet containing information on how to combat phishing [11]. A similar approach, called embedded training, which teaches users about phishing during their regular use of email, is described in [4].
Information security awareness is an absolute necessity to ensure that employees discharge their duties in the most secure way possible to prevent information security incidents. A major challenge with security awareness programs is the lack of a fully developed methodology for delivering them [10].
4 Methodology
The aim of the study was not to generalise, but to interpret some users' experiences of information security delivery methods. The design and analysis of this study draw on methodological experience from [3]. A total of 30 voluntary participants were involved in this study. The participants were chosen in such a way that we had a wide range of demographics. Each respondent completed a pre-knowledge questionnaire. All the participants were end users, and 25% of them had received formal security training. Also, 62% of the participants indicated that they enjoy playing games, at least when playing solely for entertainment purposes. Furthermore, over 70% of the participants reported currently working in either a full-time or a part-time job.
We tested our participants' ability to identify phishing using video-based, game-based (i.e., Anti-Phishing Phil) and text-based delivery models. The criterion we used for selecting appropriate awareness delivery materials was that they should be easy for non-technical users to comprehend while not causing undue frustration for technical users. Most importantly, we were looking for materials that were short, to the point and easily accessible via a web browser, as we did not want to consume too much of people's time by making them read a lot of information in order to participate in this study.
We first asked the participants if they knew what phishing was. We then administered the game-based awareness training, followed by the text-based delivery model, then had participants watch a short video, and finally had them play the game again. After each awareness training method, we collected data to see if that method improved the participants' knowledge about phishing. At the end of the study, the participants were asked to complete an exit question asking which training method they felt informed them the most about phishing attacks.
Note that the game-based delivery method consists of four rounds, and each round focuses on a different class of phishing URLs. Also, each round is timed to limit how long one can take to consider the URLs. Ideally, an awareness program must influence behaviour changes that deliver measurable benefits. In order to see how much the text-based and video-based awareness delivery methods increased the participants' knowledge about phishing, we administered the game-based awareness model one last time. This also helped separate participants' personal impressions from what they had actually learnt.
Figure 2 shows the outcome of the experiment before and after the text-based and video-based awareness activities were administered to the participants. From the graph, it is clear that both the video-based and the text-based delivery methods improved the participants' knowledge about phishing attacks. The results for each round improved by approximately 50% on the second attempt of the game. Whilst these statistics present strong support for video- and text-based security awareness training, one cannot overlook the possibility that participants learnt from their mistakes after the first attempt and self-corrected their answers on the second attempt. Also, a broadening of knowledge of anti-phishing techniques was certainly evident through the text-based delivery model. Whereas the game-based delivery method provided knowledge of what to look for in URLs, the text-based and video-based awareness methods added the knowledge that emails are the main carrier of phishing attacks.
6 Conclusions
Despite the fact that people are the weakest link in the information security chain, organisations often focus on technical solutions. In this paper, we looked at several information security awareness delivery methods in terms of their effectiveness in providing security awareness training. Our results suggest that all information security awareness training delivery methods are powerful means of empowering people with knowledge on focused topics. Further, our investigation suggests that combining methods regularly improves the success of a security awareness campaign and helps keep the target audience interested.
References
1. Cone, B.D., Thompson, M.F., Irvine, C.E., Nguyen, T.D.: Cyber Security Training and Awareness Through Game Play. In: Security and Privacy in Dynamic Environments, IFIP International Federation for Information Processing, vol. 201, pp. 431-436 (2006)
2. Wu, M., Miller, R.C., Garfinkel, S.L.: Do Security Toolbars Actually Prevent Phishing Attacks? In: Grinter, R., Rodden, T., Aoki, P., Cutrell, E., Jeffries, R., Olson, G. (eds.) Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI, Montréal, Québec, Canada, April 22-27, pp. 601-610. ACM Press, New York (2006)
3. Sheng, S., Magnien, B., Kumaraguru, P., Acquisti, A., Cranor, L.F., Hong, J., et al.: Anti-Phishing Phil: The Design and Evaluation of a Game That Teaches People Not to Fall for Phish. In: Symposium On Usable Privacy and Security (SOUPS) 2007, Pittsburgh, PA, USA, July 18-20 (2007)
4. Kumaraguru, P., Rhee, Y., Acquisti, A., Cranor, L., Hong, J., Nunge, E.: Protecting People from Phishing: The Design and Evaluation of an Embedded Training Email System. In: Proceedings of the 2007 Conference on Human Factors in Computing Systems, CHI (2007)
5. Albrechtsen, E.: A qualitative study of users' view on information security. Computers and Security 26(4), 276-289 (2007)
6. Abawajy, J.H., Thatcher, K., Kim, T.-h.: Investigation of Stakeholders' Commitment to Information Security Awareness Programs. In: 2008 International Conference on Information Security and Assurance (ISA 2008), pp. 472-476 (2008)
7. Downs, J., Holbrook, M., Cranor, L.: Decision strategies and susceptibility to phishing. In: Proceedings of the Second Symposium on Usable Privacy and Security (SOUPS 2006), vol. 149 (2006)
8. Prensky, M.: Digital game-based learning. McGraw-Hill, New York (2001); Gredler, M.E.: Games and simulations and their relationships to learning. In: Handbook of Research on Educational Communications and Technology, 2nd edn., pp. 571-581. Lawrence Erlbaum Associates, Mahwah (2004)
9. Shaw, R.S., Chen, C.C., Harris, A.L., Huang, H.-J.: The impact of information richness on information security awareness training effectiveness. Computers & Education 52, 92-100 (2009)
10. Valentine, J.A.: Enhancing the Employee Security Awareness Model. Computer Fraud & Security (6), 17-19 (2006)
11. New York State Office of Cyber Security & Critical Infrastructure Coordination: Gone Phishin', A Briefing on the Anti-Phishing Exercise Initiative for New York State Government. Aggregate Exercise Results for public release
Abstract. We propose a new definition for a searchable proxy re-encryption scheme (Re-PEKS), define the first known searchable proxy re-encryption scheme with a designated tester (Re-dPEKS), and then give concrete constructions of both Re-PEKS and Re-dPEKS schemes that are secure in the random oracle model.
Keywords: Searchable proxy re-encryption, public key encryption with keyword search, random oracle model.
1 Introduction
Public key encryption with keyword search (PEKS) schemes enable searching for keywords within encrypted messages. These schemes are desirable for mobile devices such as smartphones that selectively download encrypted messages from gateways, e.g., accessing email over the mobile Internet. Consider an e-mail system that consists of three entities, namely a sender (Bob), a receiver (Alice), and a server (email gateway). Bob sends an encrypted message M, appended with some encrypted keywords w_1, ..., w_n associated with the message, to the email gateway in the following format:

PKE(pk_A, M) || PEKS(pk_A, w_1) || ... || PEKS(pk_A, w_n),

where PKE is a standard public key encryption scheme and pk_A is the public key of Alice. Alice can give the email gateway a trapdoor associated with a search keyword w. The PEKS scheme enables the gateway to test whether w is a keyword associated with the email, but the gateway learns nothing else about the email.
The first PEKS scheme was proposed by Boneh et al. in 2004 [6]. Baek et al. [4] proposed a Secure Channel Free Public Key Encryption with Keyword Search
2 Preliminaries
3.1 Definition
3.2 Security Model
The security of a Re-PEKS scheme requires that the adversary should not be
able to distinguish which keyword corresponds to a given ciphertext without the
trapdoor from the target receiver or a delegatee.
1 This algorithm becomes non-interactive if the input sk_j is replaced by the public key pk_j, in which case the delegatee is not involved in the generation of the re-encryption key.
3.3 Construction
User X with private key x can delegate to user Y with private key y as follows: X selects a random r ∈ Z_q and sends rx mod q to Y, as well as r to the proxy. Y sends y/(rx) mod q to the proxy. The proxy computes the re-encryption key rk_{X↔Y} = (r)(y/(rx)) mod q = y/x mod q. We assume that communications among the proxy and users take place over a secure channel. In addition, the scheme makes no security guarantee if the proxy colludes with either party [8].
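As an illustration, the following minimal Python sketch walks through this blinded key agreement with toy parameters (the prime q, the use of Fermat inverses, and all variable names are our assumptions for illustration; the actual scheme operates over the group order described above):

    import secrets

    q = 2**127 - 1  # toy prime modulus; a deployment would use the scheme's group order

    def inv(a, q):
        # modular inverse via Fermat's little theorem (q is prime)
        return pow(a, q - 2, q)

    x = secrets.randbelow(q - 1) + 1   # X's private key
    y = secrets.randbelow(q - 1) + 1   # Y's private key
    r = secrets.randbelow(q - 1) + 1   # X's random blinding value

    rx = (r * x) % q                   # X -> Y: rx mod q (X also sends r to the proxy)
    y_over_rx = (y * inv(rx, q)) % q   # Y -> proxy: y/(rx) mod q
    rk = (r * y_over_rx) % q           # proxy: rk = r * y/(rx) = y/x mod q

    assert rk == (y * inv(x, q)) % q   # proxy never sees x or y individually

Note how the blinding value r hides x and y from the proxy; this also makes the collusion caveat concrete, since the proxy and Y together hold r, y/(rx), and y, from which x can be recovered.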
Correctness:
Assume the PEKS ciphertext of keyword w' is [A, B] = [g^{xr}, H_2(e(g^r, H_1(w')))] and the trapdoor associated with keyword w is T_{X,w} = H_1(w)^{1/x}. If w = w', then

B = H_2(e(g^r, H_1(w'))) = H_2(e(g^{xr}, H_1(w')^{1/x})) = H_2(e(A, T_{X,w})).

It is easy to verify that the correctness of the equation also holds for the multi-test, as a re-encrypted ciphertext has the same form as the original ciphertext.
3.4 Security Analysis
⟨i, pk_i, x_i⟩ in L_H.
Corrupted Receiver Key Generation: On input an index i, B selects a random x_i ∈ Z_q, and outputs pk_i = g^{x_i} and sk_i = x_i. It adds the tuple ⟨i, pk_i, x_i⟩ to L_C.
2. Hash Function Queries: A can query the random oracles H_1 and H_2 at any time.
H_1-query O_{H_1}: B maintains an H_1-list with tuples ⟨w_n, h_n, d_n, c_n⟩, which is initially empty. On input w_i, B responds as follows:
- If the query w_i is found in the H_1-list with an entry ⟨w_i, h_i, d_i, c_i⟩, output H_1(w_i) = h_i.
- Otherwise, B selects a random d_i ∈ Z_q and generates a random coin c_i ∈ {0, 1} so that Pr[c_i = 0] = 1/(q_td + 1). If c_i = 1, B computes h_i = u_3^{d_i}; if c_i = 0, B computes h_i = u_1^{d_i}. B adds the tuple ⟨w_i, h_i, d_i, c_i⟩ to the H_1-list and returns H_1(w_i) = h_i.
H_2-query O_{H_2}: Similarly, B maintains an H_2-list with tuples ⟨t, M⟩, which is initially empty. On input t ∈ G_T, B responds with H_2(t) = M. For each new t, B responds to the H_2(t) query by selecting a new random value M ∈ {0, 1}^k and setting H_2(t) = M. B then adds the tuple ⟨t, M⟩ to the H_2-list.
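The bookkeeping of such a lazily-sampled random oracle with a biased coin can be sketched in Python as follows (the toy modulus P and bases U1, U3 are stand-ins we introduce for illustration; the real reduction programs them from its problem instance):

    import secrets

    P = 2**127 - 1        # toy prime standing in for the group order (assumption)
    U1, U3 = 3, 5         # toy bases; the reduction derives these from its challenge

    class H1Oracle:
        def __init__(self, q_td):
            self.q_td = q_td
            self.table = {}   # w -> (h, d, c), the H1-list

        def query(self, w):
            if w in self.table:                  # answer repeated queries consistently
                return self.table[w][0]
            d = secrets.randbelow(P - 2) + 1
            c = 0 if secrets.randbelow(self.q_td + 1) == 0 else 1  # Pr[c=0] = 1/(q_td+1)
            h = pow(U3 if c == 1 else U1, d, P)                    # h = u3^d or u1^d
            self.table[w] = (h, d, c)
            return h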
4.1 Definition
We define a searchable proxy re-encryption scheme with a designated tester (Re-dPEKS) and only consider bidirectional, multi-use Re-dPEKS.

Definition 3 (Bidirectional, Multi-use Re-dPEKS). A bidirectional, multi-use, searchable proxy re-encryption with a designated tester (Re-dPEKS) scheme consists of the following algorithms:
- GlobalSetup(1^k): On input a security parameter 1^k, it returns a global parameter GP.
- KeyGen_S(GP): On input GP, it returns a public-private key pair [pk_S, sk_S] of the server S.
- KeyGen_R(GP): On input GP, it returns a public-private key pair [pk_R, sk_R] of the receiver R.
- ReKeyGen(GP, sk_{R_i}, sk_{R_j}): On input GP, a private key sk_{R_i}, and a private key sk_{R_j}, where i ≠ j, it returns a re-encryption key rk_{R_i↔R_j}.
- dPEKS(GP, pk_{R_i}, pk_S, w): On input GP, pk_{R_i}, pk_S, and a keyword w, it returns a dPEKS ciphertext C_{i,w} of w.
- RedPEKS(GP, rk_{R_i↔R_j}, C_{i,w}): On input GP, rk_{R_i↔R_j}, pk_S, and an original dPEKS ciphertext C_{i,w}, it returns a re-encrypted dPEKS ciphertext C_{j,w} of w for receiver R_j.
- dTrapdoor(GP, pk_S, sk_{R_i}, w): On input GP, pk_S, sk_{R_i}, and a keyword w, it returns a trapdoor T_{i,w}.
- dTest(GP, sk_S, C, T_w): On input GP, sk_S, a dPEKS ciphertext C = dPEKS(GP, pk_R, pk_S, w'), and a trapdoor T_w = dTrapdoor(GP, pk_S, sk_R, w), it returns 1 if w = w' and 0 otherwise.
Correctness. Let key pairs [pk_{R_i}, sk_{R_i}] and [pk_{R_j}, sk_{R_j}] ← KeyGen_R(GP), rk_{R_i↔R_j} ← ReKeyGen(GP, sk_{R_i}, sk_{R_j}), C_{i,w'} ← dPEKS(GP, pk_{R_i}, pk_S, w'), T_{i,w} ← dTrapdoor(GP, pk_S, sk_{R_i}, w), and T_{j,w} ← dTrapdoor(GP, pk_S, sk_{R_j}, w), for all w, w' ∈ keyword space KW. It holds that:
- dTest(GP, sk_S, C_{i,w'}, T_{i,w}) = 1 if w = w', and 0 otherwise;
- dTest(GP, sk_S, RedPEKS(GP, rk_{R_i↔R_j}, C_{i,w'}), T_{j,w}) = 1 if w = w', and 0 otherwise.
We note that our searchable proxy re-encryption scheme with a designated tester (Re-dPEKS) is an extension of a searchable public key encryption scheme with a designated tester (dPEKS). In particular, we can add two algorithms, i.e., ReKeyGen and RedPEKS, to a dPEKS scheme to form a Re-dPEKS scheme.
4.2 Security Model
4.3 Construction
We have T = H_1(w)^{1/x} · (g^a)^r / (g^r)^a = H_1(w)^{1/x}. If w = w', then B = H_2(e(g^a, H_1(w')^r)) = H_2(e(g^{xr}, H_1(w')^{a/x})) = H_2(e(A, T^a)). It is easy to verify that the correctness of the equation also holds for the multi-test.
References
1. Abdalla, M., Bellare, M., Catalano, D., Kiltz, E., Kohno, T., Lange, T., Malone-Lee, J., Neven, G., Paillier, P., Shi, H.: Searchable encryption revisited: Consistency properties, relation to anonymous IBE, and extensions. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 205-222. Springer, Heidelberg (2005)
2. Ateniese, G., Fu, K., Green, M., Hohenberger, S.: Improved proxy re-encryption schemes with applications to secure distributed storage. ACM Trans. Inf. Syst. Secur. 9(1), 1-30 (2006)
3. Baek, J., Safavi-Naini, R., Susilo, W.: On the integration of public key data encryption and public key encryption with keyword search. In: Katsikas, S.K., López, J., Backes, M., Gritzalis, S., Preneel, B. (eds.) ISC 2006. LNCS, vol. 4176, pp. 217-232. Springer, Heidelberg (2006)
4. Baek, J., Safavi-Naini, R., Susilo, W.: Public key encryption with keyword search revisited. In: Gervasi, O., Murgante, B., Laganà, A., Taniar, D., Mun, Y., Gavrilova, M.L. (eds.) ICCSA 2008, Part I. LNCS, vol. 5072, pp. 1249-1259. Springer, Heidelberg (2008)
5. Blaze, M., Bleumer, G., Strauss, M.: Divertible protocols and atomic proxy cryptography. In: Nyberg, K. (ed.) EUROCRYPT 1998. LNCS, vol. 1403, pp. 127-144. Springer, Heidelberg (1998)
6. Boneh, D., Crescenzo, G.D., Ostrovsky, R., Persiano, G.: Public key encryption with keyword search. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 506-522. Springer, Heidelberg (2004)
7. Boneh, D., Franklin, M.K.: Identity-based encryption from the Weil pairing. In: Kilian, J. (ed.) CRYPTO 2001. LNCS, vol. 2139, pp. 213-229. Springer, Heidelberg (2001)
8. Canetti, R., Hohenberger, S.: Chosen-ciphertext secure proxy re-encryption. In: Ning, P., di Vimercati, S.D.C., Syverson, P.F. (eds.) ACM Conference on Computer and Communications Security, pp. 185-194. ACM, New York (2007)
9. Crescenzo, G.D., Saraswat, V.: Public key encryption with searchable keywords based on Jacobi symbols. In: Srinathan, K., Rangan, C.P., Yung, M. (eds.) INDOCRYPT 2007. LNCS, vol. 4859, pp. 282-296. Springer, Heidelberg (2007)
10. Gu, C., Zhu, Y., Pan, H.: Efficient public key encryption with keyword search schemes from pairings. In: Pei, D., Yung, M., Lin, D., Wu, C. (eds.) Inscrypt 2007. LNCS, vol. 4990, pp. 372-383. Springer, Heidelberg (2008)
11. Hwang, Y.H., Lee, P.J.: Public key encryption with conjunctive keyword search and its extension to a multi-user system. In: Takagi, T., Okamoto, T., Okamoto, E., Okamoto, T. (eds.) Pairing 2007. LNCS, vol. 4575, pp. 2-22. Springer, Heidelberg (2007)
12. Park, D.J., Kim, K., Lee, P.J.: Public key encryption with conjunctive field keyword search. In: Lim, C.H., Yung, M. (eds.) WISA 2004. LNCS, vol. 3325, pp. 73-86. Springer, Heidelberg (2005)
13. Rhee, H.S., Park, J.H., Susilo, W., Lee, D.H.: Improved searchable public key encryption with designated tester. In: Li, W., Susilo, W., Tupakula, U.K., Safavi-Naini, R., Varadharajan, V. (eds.) ASIACCS, pp. 376-379. ACM, New York (2009)
14. Rhee, H.S., Park, J.H., Susilo, W., Lee, D.H.: Trapdoor security in a searchable public-key encryption scheme with a designated tester. Journal of Systems and Software 83(5), 763-771 (2010)
15. Rhee, H.S., Susilo, W., Kim, H.J.: Secure searchable public key encryption scheme against keyword guessing attacks. IEICE Electronics Express 6(5), 237-243 (2009)
16. Sahai, A., Waters, B.: Fuzzy identity-based encryption. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 457-473. Springer, Heidelberg (2005)
17. Shao, J., Cao, Z., Liang, X., Lin, H.: Proxy re-encryption with keyword search. Inf. Sci. 180(13), 2576-2587 (2010)
18. Zhang, R., Imai, H.: Generic combination of public key encryption with keyword search and public key encryption. In: Bao, F., Ling, S., Okamoto, T., Wang, H., Xing, C. (eds.) CANS 2007. LNCS, vol. 4856, pp. 159-174. Springer, Heidelberg (2007)
1 Center of Excellence in Information Assurance (CoEIA), King Saud University, Saudi Arabia
2 Information Systems Department, College of Computer and Information Sciences, King Saud University, Saudi Arabia
{meldefrawy,mkhurram,kalghathbar}@ksu.edu.sa
Abstract. Hash chains have been used as OTP generators. Lamport hashes have an intensive computation cost and a chain length restriction. A solution based on signature chains addressed this by involving public key techniques, which increased the average computation cost. Although a later idea reduced the user computation by sharing it with the host, it could not overcome the length limitation. The scheme proposed by Chefranov to eliminate the length restriction had a deficiency in its communication cost overhead. We present here an algorithm that overcomes all of these shortcomings by involving two different nested hash chains: one dedicated to seed updating and the other used for OTP production. Our algorithm provides forward and non-restricted OTP generation, and we propose a random challenge-response operation mode. We analyze our proposal from the viewpoint of security and performance compared with the other algorithms.
Keywords: One Time Password, Lamport Hashing, Nested Hash Chains, Authentication Factors.
1 Introduction
Authentication is used by a system to determine whether or not a given user is who they claim to be. Authentication is the cornerstone of information security, since a weak authentication mechanism makes the rest of the security fragile. It is widely accepted that authentication uses one or more of the following four factors [1]:
Password-based authentication is the most widely used of the above four methods because of its simplicity and low cost. A one-time password mechanism solves the password security problem that could result from reusing the same password multiple times.
The idea of one-way function chains, or hash chains, was first proposed by Lamport [2]. This method is also referred to as S/Key OTP [3]. It is computationally intensive
on the user end; hence it uses backward and finite hash chains. The idea of using hash chains was proposed to enhance OTP production compared to algorithms that require time stamping with accurate time synchronization, e.g., RSA SecurID [4]. Since OTPs have been employed in a wide range of applications [5], it would be very beneficial to overcome their insufficiencies. We propose a new algorithm that uses two different types of hash functions arranged in a nested chain. The resulting chain provides forwardness and infiniteness. We have also reduced the number of exchanged messages between the user and host. The proposed protocol is compared with others in terms of its computational cost and security properties. The rest of this paper is organized as follows: Section 2 discusses the related work, Section 3 proposes our new algorithm, Section 4 analyzes the security attributes, and finally Section 5 concludes the paper.
2 Related Work
Hash functions h() are useful in the construction of OTPs and are defined straightforwardly as one-way hash functions (OWHFs) such that, given an input string x, it is easy to compute y = h(x) and, conversely, it is computationally infeasible to recover x from a given output y.

2.1 Lamport's Scheme

The idea of hash chains was first proposed by Lamport [2]. Later, it was implemented to develop the S/Key OTP system [3]. It involves applying a hash function h() N times to a seed s to form a hash chain of length N:

h^N(s) = h(h^{N-1}(s)), with h^0(s) = s.    (1)

For the t-th authentication, the host sends the challenge N - t to the user,    (2)

then the user calculates the t-th OTP according to this challenge:

OTP_t(s) = h^{N-t}(s),    (3)

and the host authenticates the user by checking that the following equality holds:

h(OTP_t(s)) = h^{N-t+1}(s),    (4)
where the value h^{N-t+1}(s) is already saved in the host system's password file from the previous, (t-1)-th, authentication. After any successful authentication, the system password file is updated with the OTP received from the user, i.e., h^{N-t}(s). The host then increments t by one and sends a new challenge to the user for the next authentication. This scheme has a limitation on the number of authentications: after reaching N authentications, a process restart is required. In addition, it has a vulnerability: an opponent impersonating the host can send a challenge with a small value to the user, who responds with the hash chain's initial values, which allows the intruder to calculate further OTPs [6]. This attack can be referred to as a small challenge attack. In addition, the user's computational requirements are high, which makes the system unsuitable for devices with limited resources.
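For concreteness, a minimal Python sketch of this chain-based verification (assuming SHA-256 as the generic hash h(); N, the seed, and variable names are toy choices of ours) looks as follows:

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def h_n(seed: bytes, n: int) -> bytes:
        # apply h() n times to the seed
        out = seed
        for _ in range(n):
            out = h(out)
        return out

    N, s = 1000, b"shared-seed"      # toy parameters
    stored = h_n(s, N)               # host stores h^N(s) at registration

    t = 1                            # t-th authentication
    otp = h_n(s, N - t)              # user sends OTP_t = h^(N-t)(s), Eq. (3)
    assert h(otp) == stored          # host checks Eq. (4)
    stored = otp                     # host updates its password file with h^(N-t)(s)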
2.2 Goyal et al.'s Scheme
In order to decrease the computational cost, Goyal et al. [7] proposed the idea of dividing the large value N into N/R sub-periods of length R, sharing this cost with the host itself. We will consider N to be a multiple of R to simplify the scheme's formula:

OTP_{N-t}(s) = OTP_{k+n}(s) = h^n(OTP_k(s) + δ·s), where δ = 1 if k mod R = 0 and δ = 0 otherwise.    (5)

Suppose the user wishes to authenticate himself to the host for the t-th time. Then

n = (N - t) mod R, n ≠ 0, and k = N - t - n.    (6)
The user calculates OTP_{k+n}(s) and sends it back to the host as an OTP equal to h^n(OTP_k(s) + s). The host hashes the received OTP and compares it with the stored last OTP. The next time the user wants to log in, he will be prompted with the value t + 1.
The user must know s during every login, which makes it essential to encrypt the seed s. Re-initialization after N authentications is still necessary, as in Lamport's scheme.
2.3 Bicakci et al.'s Scheme
The infinite length hash chain (ILHC) proposed in [8] uses a public-key algorithm A to produce a forward and infinite one-way function (OWF). This OWF is the OTP production core. Bicakci et al. proposed a protocol using RSA [9], where d is the private key and e is the public key. The OTP originating from an initial input s using the RSA public-key algorithm for the t-th authentication is:

OTP_t(s) = A^t(s, d),    (7)

and the verification of the t-th OTP is done by decrypting OTP_t(s) using e:

A(OTP_t(s), e) = OTP_{t-1}(s).    (8)
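A minimal sketch of this signature-chain idea with textbook RSA toy parameters (our own illustrative numbers; a real deployment would use full-size keys and padding):

    # textbook RSA toy parameters: n = 61 * 53, with e*d = 1 mod phi(n)
    n, e, d = 3233, 17, 2753

    def A(m: int, key: int) -> int:
        # one RSA exponentiation, as used in Eqs. (7) and (8)
        return pow(m, key, n)

    s = 42                       # initial input
    otp_prev = s
    otp_t = A(otp_prev, d)       # OTP_1 = A(s, d); iterating t times gives A^t(s, d)

    assert A(otp_t, e) == otp_prev   # host verifies via Eq. (8), then stores otp_t

Because each new OTP is produced by going forward (another private-key operation) rather than consuming a precomputed backward chain, the chain never runs out, at the price of a public-key operation per login.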
2.4 Yeh et al.'s Scheme
Yeh et al.'s scheme [11] is divided into three phases: registration, login, and verification. In the registration phase, a user and a host set up a unique seed value s and the number of login times, N. After setting up s and N, the user computes an initial password OTP_0 = h^N(K ⊕ s), where h() is a hash function, K is a pass-phrase of the user, and ⊕ is the bitwise XOR operation. The steady-state authentication for the t-th login is shown in Table 1.
Registration Phase
  Host → User : N, s ⊕ D_0, h(D_0)
  User → Host : OTP_0 ⊕ D_0
Login Phase
  Host → User : (S_t.1, S_t.2, S_t.3) = (N - t, s ⊕ D_t, h(D_t) ⊕ OTP_{t-1})
  User → Host : U_t = h^{N-t}(K ⊕ s) ⊕ D_t = OTP_t ⊕ D_t
Verification Phase
  Host : checks h((OTP_t ⊕ D_t) ⊕ D_t) = OTP_{t-1}
For the t-th login, the host sends a challenge to the user containing a random number D_t, its hash value h(D_t), the shared secret s, and a stored value OTP_{t-1}. After validating the challenge, the user responds with OTP_t ⊕ D_t. Finally, the host checks this response and replaces the stored value OTP_{t-1} with OTP_t. This scheme is vulnerable to a pre-play attack [12] because password information is transferred in both directions between the host and user. An attacker potentially has the ability to impersonate the host to the user (by sending the user a forged challenge) and afterwards to impersonate the user to the host using the new valid password-authenticator previously sent to him by the user [12]. In addition, [6] showed that this scheme is practically the same as that in [2], but hides the sensitive parts of transmitted messages with the help of the XOR operation, a hash function, and a random nonce. Finally, the number of hash function calculations for the t-th login equals N - t on the user side, which shows the algorithm's computational cost. Again, re-initialization after N authentications is still necessary.
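The verification step from Table 1 can be sketched as follows (SHA-256 and the byte-level XOR helper are our illustrative choices; the scheme leaves h() generic):

    import hashlib, secrets

    def h(b: bytes) -> bytes:
        return hashlib.sha256(b).digest()

    def h_n(b: bytes, n: int) -> bytes:
        for _ in range(n):
            b = h(b)
        return b

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(p ^ q for p, q in zip(a, b))

    N = 100
    K = h(b"pass-phrase")                  # pass-phrase, hashed to a fixed length
    s = secrets.token_bytes(32)            # shared seed
    D_t = secrets.token_bytes(32)          # host's random number for login t
    t = 1

    otp_prev = h_n(xor(K, s), N - t + 1)   # host-stored OTP_{t-1} = h^(N-t+1)(K xor s)
    U_t = xor(h_n(xor(K, s), N - t), D_t)  # user's response OTP_t xor D_t

    assert h(xor(U_t, D_t)) == otp_prev    # host: h((OTP_t xor D_t) xor D_t) = OTP_{t-1}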
2.5 Chefranov's Scheme
The scheme proposed in [6] is divided into two complicated phases: the registration phase and the login and authentication phase. In the following table, we show the procedure for the first run.
Table 2. Chefranov's scheme

Registration Phase
  Host : sets n_H = 0, generates a random seed s_H
  User : generates a random seed s_U
  Host → User : s_H
  User → Host : s_U
  User calculates : α = h(K ⊕ s_H), α_1 = h(α), p_1 = h(α_1)
  User → Host : S_1 = (S_1.1, S_1.2, S_1.3, S_1.4) = (α_1 ⊕ h(D_1U), h(p_1) ⊕ h(D_1U), h^2(p_1) ⊕ D_1U, s_U ⊕ D_1U)
  Host checks : h(S_1.1 ⊕ h(s_U ⊕ S_1.4)) = p_1
  Host calculates : D_2U = D_1U + 1, p_1 = S_1.2 ⊕ h(D_1U)
  Host → User : n_H = n_H + 1
  User checks and updates : α_2 = h(α_1), n_H = n_H + 1, p_2 = h(α_2)
  User → Host : p_2 to start the second session
from the user side, and p_t = h(α_t) from the host side. The author of this algorithm claimed that the user performs only four hash function calculations, but actually the user has to do more than this. Considering a steady-state session, in the second phase the S_t vector itself requires three hash operations, h(D_tU), h(p_t), and h^2(p_t). Two additional hash operations are required for the user's updating of OTPs, α_{t+1} = h(α_t) and p_{t+1} = h(α_{t+1}), which must be done by the user itself. After that, the user starts the (t+1)-th session by sending p_{t+1} to the host in the registration phase.
3 Proposed Scheme
The idea behind our proposal is to expand Lamport's scheme with modifications that provide infiniteness and forwardness while avoiding the use of public key cryptography.
Fig. 1. One-time password production considering a nested hash chain using two different hashes
The absence of these two properties, infiniteness and forwardness, causes the vulnerabilities shown with respect to the previous work. Thus, we extend Lamport's scheme using two different one-way hash functions, h_A() and h_B(), one for the seed chain and the other for OTP production, as shown in Fig. 1:

h_B^y(h_A^x(s)), x: 1 → ∞, y: 1 → ∞.    (9)
Our algorithm can operate in two modes: offline authentication and challenge-response authentication. We consider the registration stage to be established manually between user and host. The establishment of the algorithms and seeds requires manual intervention; e.g., the user should go personally to the host administrator to establish the system.
Table 3. The proposed scheme notation

h_A(): represents the first hash function
h_B(): represents the second hash function
s^Auth_{2t-1}: the authentication seed number 2t-1 for the t-th authentication
s^Auth_{2t}: the authentication seed number 2t for the t-th authentication
s^OTP_t: the OTP seed number t for the t-th authentication
OTP_t: the OTP number t for the t-th authentication
(v_t.1, v_t.2): the challenge vector for the t-th authentication, sent by host to user
w_t: the response vector for the t-th authentication, sent by user to host
(x_t, y_t): the nested hashing progress values for the t-th authentication
3.1 Registration Phase
The user gets the two different hash functions established on his token, plus two different seeds, one for OTP production, s^OTP_1, and the other for the authentication procedure, s^Auth_1, which are also installed on his token. To ensure that this information is completely shared with the service provider, these two seeds are produced from shared and unique parameters of the host and user.
3.2 Login and Authentication Phase
For the first authentication, the host sends the user a challenge of

(v_1.1, v_1.2) = ((x_1, y_1) ⊕ h_B(h_A^2(s^Auth_1)), s^Auth_1 ⊕ h_B(h_A(s^Auth_1))).    (10)
This challenge is computed using the two hash functions, h_A() and h_B(), and the authentication seed s^Auth_1. Upon calculating h_B(h_A(s^Auth_1)), the user can extract s^Auth_1 from v_1.2 and compare it with his own. Unless a positive result is obtained, the authentication process fails and the user terminates the procedure. Otherwise, the user calculates h_B(h_A^2(s^Auth_1)) to extract (x_1, y_1) from v_1.1 and uses these values to compute OTP_1 = h_B^{y_1}(h_A^{x_1}(s^OTP_1)). Once the user reaches the session OTP_1, the authentication seed is updated to s^Auth_2 = h_A^2(s^Auth_1) by both the user and host.
Now, the user has to respond to this challenge with his OTP_1 in the following form:

w_1 = OTP_1 ⊕ h_B(h_A(s^Auth_2)).    (11)
The host calculates h_B(h_A(s^Auth_2)) to extract the received OTP_1 and compares it with the OTP_1 he has calculated himself. Then, both the user and host update the OTP seed, and the authentication seed is updated to s^Auth_3 = h_A(s^Auth_2). The two values s^OTP_2 and s^Auth_3 are the next session's initial seeds. This is the procedure for a complete session; the general steps are shown in Table 4.
Registration Phase
  User ↔ Host : manual establishment of h_A(), h_B(), s^OTP_1, and s^Auth_1
Login and Authentication Phase (t-th authentication)
  Host → User : (v_t.1, v_t.2) = ((x_t, y_t) ⊕ h_B(h_A^2(s^Auth_{2t-1})), s^Auth_{2t-1} ⊕ h_B(h_A(s^Auth_{2t-1})))
  User : checks v_t.2 ⊕ h_B(h_A(s^Auth_{2t-1})) = s^Auth_{2t-1}; extracts (x_t, y_t) = v_t.1 ⊕ h_B(h_A^2(s^Auth_{2t-1})); calculates OTP_t = h_B^{y_t}(h_A^{x_t}(s^OTP_t)); updates s^Auth_{2t} = h_A^2(s^Auth_{2t-1}), s^OTP_{t+1} = h_A^{x_t}(s^OTP_t)
  User → Host : w_t = OTP_t ⊕ h_B(h_A(s^Auth_{2t}))
  Host : checks the received OTP_t against his own; updates s^Auth_{2t+1} = h_A(s^Auth_{2t})
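A minimal sketch of the nested-chain OTP computation of Eq. (9) and the forward seed update (SHA-256 and SHA-512 are our illustrative stand-ins for h_A() and h_B(); the paper leaves the concrete hashes open):

    import hashlib

    def h_A(b: bytes) -> bytes:
        return hashlib.sha256(b).digest()

    def h_B(b: bytes) -> bytes:
        return hashlib.sha512(b).digest()

    def iterate(h, b: bytes, n: int) -> bytes:
        for _ in range(n):
            b = h(b)
        return b

    def otp(seed_otp: bytes, x: int, y: int) -> bytes:
        # OTP_t = h_B^y(h_A^x(s_t^OTP)), Eq. (9)
        return iterate(h_B, iterate(h_A, seed_otp, x), y)

    s_otp = b"toy-otp-seed"
    x_t, y_t = 3, 2                        # progress values taken from the challenge
    otp_t = otp(s_otp, x_t, y_t)
    s_otp_next = iterate(h_A, s_otp, x_t)  # forward seed update for the next session

Because each session moves the seeds forward rather than walking backward along a precomputed chain, no chain length N has to be fixed upfront, which is what provides the infiniteness property.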
4 Security Analysis
Naturally, the proposed scheme can resist an off-line guessing attack because it uses strong passwords produced by strong hash functions. Moreover, gaining unauthorized access by replaying reusable passwords is prevented, because the encoded passwords are used only once. Also, our proposal does not require time synchronization [13] against a common reference between the user and host, which is not easy to achieve in many applications, e.g., mobile phones, which suffer synchronization leaks when roaming across different time zones. It is necessary to prevent a second token from acting as the OTP generator for a certain account if that account already has an active generator [14]; a manual process should handle this situation. In this section, we briefly give a security assessment of our proposed scheme [15], [16], [17].
4.1 Pre-play Attack
increases the robustness against this type of attack. The two vectors exchanged between user and host are transferred in encrypted form.
4.2 Insider Attack
If a host insider tries to impersonate the user to access other hosts using the OTPs shared between them, s/he will not be able to do so, because the OTP seeds are fabricated from parameters unique to this user and each individual host. Furthermore, as OTP production uses two different types of strong hashes, h_A() and h_B(), the host insider cannot derive those OTPs by performing an off-line guessing attack on what he has received.
4.3 Small Challenge Attack
5 Conclusions
A new one-time password scheme based on forward hashing using two different nested hashes has been presented. These two hashes provide the authentication seed updating and the OTP production. This scheme achieves better characteristics than the other schemes discussed. Our proposal is not limited to a certain number of authentications, unlike the hash-based OTP schemes mentioned, and it does not involve computationally expensive techniques to provide this infiniteness. A security analysis was also performed, covering several types of attacks that could affect our scheme.
References
1. Kim, H., Lee, H., Lee, K., Jun, M.: A Design of One-Time Password Mechanism Using Public Key Infrastructure. In: Networked Computing and Advanced Information Management, vol. 1, pp. 18-24 (2008)
2. Lamport, L.: Password Authentication with Insecure Communication. Comm. ACM 24(11), 770-772 (1981)
3. Haller, N.: The S/KEY One-Time Password System. In: Proceedings of the ISOC Symposium on Network and Distributed System Security, pp. 151-157 (1994)
4. RSA SecurID, http://www.rsa.com/node.aspx?id=1156 (accessed May 04, 2010)
5. Rivest, R., Shamir, A.: PayWord and MicroMint: Two simple micropayment schemes, pp. 7-11 (1996)
6. Chefranov, A.: One-Time Password Authentication with Infinite Hash Chains. In: Novel Algorithms and Techniques in Tele-Communications, Automation and Industrial Electronics, pp. 283-286 (2008)
7. Goyal, V., Abraham, A., Sanyal, S., Han, S.: The N/R one time password system. In: Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC 2005), vol. 1, pp. 733-738 (2005)
8. Bicakci, K., Baykal, N.: Infinite length hash chains and their applications. In: Proceedings of the 11th IEEE Int. Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE 2002), pp. 57-61 (2002)
9. Rivest, R., Shamir, A., Adleman, L.: A method for obtaining digital signatures and public key cryptosystems. Communications of the ACM (1978)
10. Khan, M., Alghathbar, K.: Cryptanalysis and Security Improvements of Two-Factor User Authentication in Wireless Sensor Networks. Sensors 10(3), 2450-2459 (2010)
11. Yeh, T., Shen, H., Hwang, J.: A secure one-time password authentication scheme using smart cards. IEICE Trans. Commun. E85-B(11), 2515-2518 (2002)
12. Yum, D., Lee, P.: Cryptanalysis of Yeh-Shen-Hwang's one-time password authentication scheme. IEICE Trans. Commun. E88-B(4), 1647-1648 (2005)
13. Aloul, F., Zahidi, S., El-Hajj, W.: Two factor authentication using mobile phones. In: IEEE/ACS International Conference on Computer Systems and Applications, pp. 641-644 (2009)
14. Raddum, H., Nestås, L., Hole, K.: Security Analysis of Mobile Phones Used as OTP Generators. In: IFIP International Federation for Information Processing, pp. 324-331 (2010)
15. Khan, M.K.: Fingerprint Biometric-based Self and Deniable Authentication Schemes for the Electronic World. IETE Technical Review 26(3), 191-195 (2009)
16. Khan, M.K., Zhang, J.: Improving the Security of a Flexible Biometrics Remote User Authentication Scheme. Computer Standards and Interfaces (CSI) 29(1), 84-87. Elsevier Science, UK (2007)
17. Eldefrawy, M.H., Khan, M.K., Alghathbar, K., Cho, E.-S.: Broadcast Authentication for Wireless Sensor Networks Using Nested Hashing and the Chinese Remainder Theorem. Sensors 10(9), 8683-8695 (2010)
18. Mitchell, C., Chen, L.: Comments on the S/KEY user authentication scheme. ACM Operating Systems Review 30(4), 12-16 (1996)
1 Introduction
A Cyber-Physical System (CPS) is a system of computer systems that typically interacts with the physical world in real time. Such systems have to sense constraints imposed by a dynamically changing environment and react predictably to these changes; a CPS comprises a physical world and a logical world [1]. The Optimal Intelligent Supervisory Control System (OISCS) studied here offers location privacy through hash functions when users employ cyber-physical intelligence. OISCS has three layers: the Smart Construction Environment (SCE) is the bottom layer, the IT layer (ITL) is the middle layer between the bottom and upper layers, and the upper layer contains the User Devices (UD). The SCE defines the SCADA system; the ModBUS protocol is used to exchange information between devices [2]. The ITL mediates the interaction of all devices in the SCE, and ModBUS processing is also based on the ITL. In the UD layer, the user monitors the data flowing in the ITL. In this case, because users monitor remotely from outside the home/office, their location can be exposed. Therefore, location privacy has to be studied [3][4]. The proposed OISCS provides location privacy through hash functions when users monitor and control the SCADA system via the supervisory control system. This paper is organized as follows: Section 2 explains the OISCS model; Section 3 explains location privacy; Section 4 presents the simulation of OISCS; Section 5 contains the discussion; finally, Section 6 concludes the paper.
2.1 Overview
A Cyber-Physical System (CPS) is a system of computer systems that typically interacts with the physical world in real time. It has a physical world and a logical world: the physical world comprises buildings, urban spaces, network devices, etc., and the logical (or cyber) world comprises the computational services in the ITL [1]. Electronic information processed by SCADA, which operates in the physical world, is normally transmitted to users through the ITL; for data transmission between devices in SCADA, ModBUS is used [2]. Figure 1 shows the internal structure of OISCS; the model is set by the organic processing of all objects in the service information.
Monitor: All energy flows processed in the home/office are exchanged via ModBUS, which passes information between electronic devices. The data is sent to the Content Management (CM) module and regularly forwarded to the UD through the System Task Management (STM) module in the upper layer.
Control: The Smart Object (SO) handles all intelligent processes; it can act autonomously according to user information, movement, or changes in the network or environment around users. After a user begins monitoring, the signals are detected by the Human Agent (HA) and recognized by the SO, which forwards them to the CM through the Service Information (SI) module. The request passes through the CPS environment and is sent to the Smart Energy System (SES).
2.2 Location Privacy
When users access OISCS remotely, exposing their current location makes them targets for attackers. Therefore, location privacy is necessary [2][4]. At present, users typically use a mobile device and a network that applies IP policy in OISCS. A user's device usually holds id, pd, k, ip, x, y, h, t, s. That is, the information saved on the UD is the following:

Ud_1 = {id, pd, sk, ip, x, y, h, t, s} → S: {|id|, |pd|, |k|, |ip|, t, s}

However, if the user sends the above message without any protection, an attacker can steal it and obtain Ud_1. Therefore, it is necessary to send it encrypted:

Ud_1 → S: {E_k(id, pd, sk, ip, t, s)}
Ud_1: D_k{E_k(id, pd, sk, ip, t, s)}

This encryption method provides data security. It also offers a degree of location privacy, since attackers do not learn the exact owner of the captured information.
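A minimal sketch of this E_k/D_k protection with a symmetric cipher (the cryptography package's Fernet is our illustrative choice; the paper does not fix a cipher, and the field values are made up):

    import json
    from cryptography.fernet import Fernet  # pip install cryptography

    k = Fernet.generate_key()   # session key k, assumed pre-shared with S
    f = Fernet(k)

    # note: x, y, h (the physical location) are deliberately absent from the message
    ud1 = {"id": "user01", "pd": "pw-digest", "sk": "sess", "ip": "192.0.2.7",
           "t": 1291190400, "s": 17}
    token = f.encrypt(json.dumps(ud1).encode())   # E_k(id, pd, sk, ip, t, s) sent to S
    assert json.loads(f.decrypt(token)) == ud1    # D_k recovers the message at S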
User logical location privacy: This means protection for users' devices; the service is provided by encrypting id, pd, and ip.
User physical location privacy: Physical location means the real location of users/devices given by x, y, h. To protect the user's physical location, sending x, y, h is forbidden in OISCS.
Table 1. Notation
Human Agent: It acts as the user gateway, building the user's profile in the CPS; it defines real-time processes for the user in the system, such as the amount of energy used [6].
ER (Energy Resource) Agent: It defines information about the ER, e.g., ER ID #, energy type (solar cells, micro turbines, fuel cells, etc.), power rating (kW), a region fuel
Notation: Ud, E/D, h, id, pd, k, ip, x, y, h, t, sn
Apart from the Table 2 analyses, the SCADA security system checks the areas of control, evaluation, and monitoring.
3.1 Security
Key Exchange Module
Because the session key (k) is static, it is at risk of being exposed by attacks. Therefore, a dynamic key module is needed; the key is shared with the other party as in the notation below. First, a hash function processes id, password, and t, which denotes the request time. If a stronger hash value is needed, the device's ip is included in the hash:

HV = S{H(id || pd || t || (ip))}
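A sketch of this dynamic hash-value generation (SHA-256 is an assumed choice of H, and all argument values are made up for illustration):

    import hashlib, time

    def hv(user_id: str, pd: str, ip: str = "") -> bytes:
        # HV = H(id || pd || t || (ip)); ip is appended only when a
        # stronger hash value is needed
        t = str(int(time.time()))     # request time
        return hashlib.sha256((user_id + pd + t + ip).encode()).digest()

    key = hv("user01", "pw-digest", ip="192.0.2.7")  # fresh dynamic key material

Because t changes per request, the derived value is dynamic even though id and pd are fixed, which addresses the static-session-key concern above.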
Fig. 4. OISCS modules: (a) Home, (b) Room, (c) Monitor
Figure 4 shows the OISCS modules. (a) Home shows the whole home area. If a user clicks (or touches) an area, the user sees the next step, (b). In (b) Room, the user has selected a room containing some electronic devices, and the user monitors the status of all devices in the room. For each area, the user clicks or touches the area and the device shows the next step. (c) Monitor shows the energy flows; users can see the amount of energy used in real time through this view.
4 Conclusion
OISCS is a system model that can detect, control, and monitor all devices in a SCADA home or SCADA office from user devices in a remote area. To secure users and devices, hash functions were added to OISCS to generate dynamic encryption keys, and encryption/decryption with dynamically generated hash values was proposed for transferring data. In future work, a full implementation based on OISCS is needed.
Acknowledgement
This work is partially supported by the Portuguese Foundation for Science and Technology (FCT) under the Ciência 2007 programme for the hiring of post-PhD researchers.
References
1. Lai, C.-F., Ma, Y.-W., Chang, S.-Y., Chao, H.-C., Huang, Y.-M.: OSGi-based services architecture for Cyber-Physical Home Control Systems. Computer Communications, pp. 1-8, in press, corrected proof, available online April 2, 2010 (2010)
2. Cagalaban, G.A., So, Y., Kim, S.: SCADA Network Insecurity: Securing Critical Infrastructures through SCADA Security Exploitation. Journal of Security Engineering 6(6), 473-482 (2008)
3. Ko, H., Freitas, C., Goreti, M., Ramos, C.: A study on Users' Authentication Methods for Safety Group Decision System in Dynamic Small Group. Journal of Convergence Information Technology 4(4), 68-76 (2009)
4. Krumm, J.: A survey of computational location privacy. Personal and Ubiquitous Computing 13, 391-399 (2009)
5. Bradley, N.A., Dunlop, M.D.: Toward a Multidisciplinary Model of Context to Support Context-Aware Computing. Human-Computer Interaction 20, 403-446 (2005)
6. Pipattanasomporn, M., Feroze, H., Rahman, S.: Multi-Agent Systems in a Distributed Smart Grid: Design and Implementation. In: IEEE PES 2009 Power Systems Conference and Exposition (PSCE 2009), Seattle, Washington, USA, pp. 1-7 (March 2009)
7. Mukherjee, T., Banerjee, A., Varsamopoulos, G., Gupta, S.K.S., Rungta, S.: Spatio-temporal thermal-aware job scheduling to minimize energy consumption in virtualized heterogeneous data centers. Computer Networks 53(17), 2888-2904 (2009)
Abstract. In today's society the demand for reliable verification of a user's identity is increasing. Although biometric technologies based on fingerprint or iris can provide accurate and reliable recognition performance, they are inconvenient for periodic or frequent re-verification. In this paper we propose a hip-based user recognition method which is suitable for implicit and periodic re-verification of identity. In our approach we use a wearable accelerometer sensor attached to the hip of the person, and the measured hip motion signal is analysed for identity verification purposes. The main analysis steps consist of detecting gait cycles in the signal and matching two sets of detected gait cycles. Evaluating the approach on a hip data set consisting of 400 gait sequences (samples) from 100 subjects, we obtained an equal error rate (EER) of 7.5%, and the identification rate at rank 1 was 81.4%. These numbers are improvements of 37.5% and 11.2%, respectively, over a previous study using the same data set.
Keywords: biometric authentication, gait recognition, hip motion, accelerometer sensor, wearable sensor, motion recording sensor.
1 Introduction
Nowadays the demand for reliable verification of user identity is increasing. Although traditional knowledge-based authentication can be easy and cheap to implement, it has usability limitations. For instance, it is difficult to remember long and random passwords/PIN codes and to manage multiple passwords. Moreover, password-based authentication merely verifies that the claiming person knows the secret; it does not verify his/her identity per se. On the contrary, biometric authentication, which is based on physiological or behavioural characteristics of human beings, establishes an explicit link to the identity. Thus it provides more reliable user authentication compared to password-based mechanisms. Conventional biometric modalities like fingerprint or iris can provide reliable and accurate user authentication. However, in some application scenarios they can be inconvenient, for instance for periodic or frequent re-verification of identity. We define identity re-verification as the process of assuring that the user of the system is the same person who was previously authenticated. It is worth noting that the initial authentication mechanism can be different from the re-verification method. For example, in a mobile phone use case, for the first authentication the user can use a
fingerprint or PIN code, but for subsequent re-verification implicit methods can be applied, e.g. speaker recognition (when talking on the phone) or gait recognition (when walking).
Gait biometrics, or gait recognition, refers to verification and identification of individuals based on their walking style. The advantage of gait biometrics is that it can provide an unobtrusive and implicit way of recognizing a person, which can be very suitable for periodic re-verification. From a technological perspective (i.e., the way gait is collected), gait recognition can be categorized into three approaches [1]:
Video Sensor (VS) based,
Floor Sensor (FS) based,
Wearable Sensor (WS) based.
In the VS-based gait approach, gait is captured using a video camera, and then image/video processing techniques are applied to extract gait features for recognition [2]. In the FS-based approach, a set of sensors is installed in the floor, and gait-related features are measured when a person walks on them [3,4,5]. In the WS-based approach, motion recording sensors (e.g., accelerometers, gyro sensors, etc.) are worn or attached to various locations on the body, such as the shoe [6], waist [7], or arm [8]. The motion signal recorded by the sensors is then utilized for person recognition purposes.
Most of the research in the area of gait recognition is focused on VS-based gait recognition [2]. Recently, the WS-based approach has also been gaining research focus [1,9,10]. Nevertheless, most current WS-based studies are based on relatively small data sets. In this paper we present a WS-based gait recognition method based on a relatively large data set (100 subjects in the experiment). In our approach we use an accelerometer sensor attached to the hip of the person, and the measured hip motion signal is analysed for identity verification purposes. The rest of the paper is organized as follows. Section 2 describes the experiments and data collection technology. Section 3 presents the gait recognition method. Section 4 contains the results of the performance evaluation. Section 5 concludes the paper.
Fig. 1. (a) The motion recording sensor (MRS)
3 Recognition Method
Our gait recognition method consists of several steps which essentially include: a) preprocessing of the acceleration signal, b) detecting cycles in the signal and c) matching
detected cycles.
Pre-processing and cycle detection: The MRS outputs acceleration in three directions, namely up-down, forward-backward and sideways. From the three accelerations we compute a resultant signal and use it for the analysis. The resultant acceleration is calculated as follows:

r_i = sqrt(x_i^2 + y_i^2 + z_i^2), i = 1, ..., k,    (1)

where r_i, x_i, y_i and z_i are the magnitudes of the resultant, up-down, forward-backward and sideways accelerations at observation point i, respectively, and k is the number of recorded observations in the signal. Then, after some pre-processing of the signal (i.e., time normalization and noise reduction), we search for gait cycles in the signal. Figure 2 presents an example of an acceleration signal with detected cycles. A few cycles at the beginning and end of the signal are omitted, since they may not represent the natural gait of the person [11]. More information on the pre-processing and cycle detection steps can be found in [12].
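The resultant computation of Eq. (1) is a one-liner with NumPy (our illustrative choice of library):

    import numpy as np

    def resultant(x: np.ndarray, y: np.ndarray, z: np.ndarray) -> np.ndarray:
        # r_i = sqrt(x_i^2 + y_i^2 + z_i^2) for every observation point i, Eq. (1)
        return np.sqrt(x**2 + y**2 + z**2)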
Matching cycles: Once the gait cycles have been identified, we conduct a cycle comparison process. In our previous study of the hip data set, we used an average cycle as a feature vector, computed by combining the normalized cycles into one [12]. In this paper, instead of computing an average cycle, we conduct a cross comparison between two sets of cycles to find the best matching cycle pair. Assume the two sets of detected cycles are C^E = {C^E_1, ..., C^E_M} and C^F = {C^F_1, ..., C^F_N}, where each cycle in a set consists of 100 acceleration values, e.g. C^E_1 = {C^E_1(1), C^E_1(2), ..., C^E_1(99), C^E_1(100)}.
If these two sets are from the same person's hip motion, the comparison is referred to as genuine matching; otherwise (i.e., different persons), the comparison is referred to as impostor matching. We compare each cycle in set C^E to every cycle in set C^F by calculating their similarity using the Euclidean distance, as follows:

SimScore(C^E_k, C^F_p) = sqrt( sum_{i=1}^{100} (C^E_k(i) - C^F_p(i))^2 ).    (2)
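A sketch of this cross comparison (NumPy again; taking the minimum distance as the "best matching cycle pair" is our reading of the text above):

    import numpy as np

    def sim_score(ce: np.ndarray, cf: np.ndarray) -> float:
        # Euclidean distance between two 100-sample cycles, Eq. (2)
        return float(np.sqrt(np.sum((ce - cf) ** 2)))

    def best_match(set_e: list, set_f: list) -> float:
        # cross-compare every pair (C_k^E, C_p^F) and keep the best (smallest) score
        return min(sim_score(ce, cf) for ce in set_e for cf in set_f)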
Fig. 2. Example of an acceleration signal (acceleration in g vs. time in seconds) with detected cycles
The resulting DET and CMC curves using the cycle matching (this paper) and average cycle (described in [12]) methods are shown in Figure 3. In addition, Table 1 presents the identification rates at ranks 1-5 and the equal error rate (EER) of the methods, together with the obtained improvements. The EER is the point in the DET curve where FAR = FRR.
Fig. 3. Recognition performances on verification (DET curve) and identification (CMC curve)
modes
Table 1. Single-value performance metrics for our previous method in [12] and the proposed method in this paper. Numbers are given in %.

Performance metric            | Method in [12] | This paper | Improvement
Identification rate at rank 1 | 73.2           | 81.4       | 11.2
Identification rate at rank 2 | 78.1           | 87.9       | 12.5
Identification rate at rank 3 | 80.3           | 90.4       | 12.6
Identification rate at rank 4 | 82.1           | 91.9       | 11.9
Identification rate at rank 5 | 83.3           | 92.8       | 11.4
Equal Error Rate (EER)        | 12             | 7.5        | 37.5
MRS placement    | Performance, %           | #S
shoe             | Rr = 97.4                | 10
shoe             | Rr = 96.93               | 9
shoe             | FRR = 6.9 at FAR = 0.02  |
foot             | EER = {5-18.3}           | 30
pocket           | EER = {7.3-20}           | 50
wrist            | EER = {10-15}            | 30
waist            | EER = 6.4                | 36
waist            | EER = {7-19}             | 36
waist            | EER = 6.7                | 35
waist            | EER = {5.6, 21.1}        | 21
hip              | EER = 1.6                | 60
hip              | Rr = 93.1                | 6
hip (this paper) | EER = 7.5, P1 = 81.4     | 100
accuracies can be achieved by combining WS-based gait recognition with other biometric modalities, as in Vildjiounaite et al. [14], or by combining motion information from different body locations, as in Pan et al. [9].
5 Conclusion
In this paper we presented a hip-based individual recognition method. Hip motion is collected using an accelerometer attached to the belt of the subjects. The recorded hip motion is then analyzed for person recognition purposes. Using 400 hip motion samples from 100 subjects, we obtained an EER and an identification rate at rank 1 of 7.5% and 81.4%, respectively. This type of user authentication can be suitable for periodic or continuous re-verification of identity thanks to its unobtrusiveness and implicitness. Although the obtained results are promising, further work is required to improve recognition accuracy under the influence of challenging factors in gait recognition, such as walking on different surfaces, at different speeds, with different shoe types, etc.
Acknowledgment
We would like to thank Professor Einar Snekkenes for providing the motion recording sensor for our data collection.
References
1. Gafurov, D., Snekkenes, E.: Gait recognition using wearable motion recording sensors.
EURASIP Journal on Advances in Signal Processing (2009); Special Issue on Recent Advances in Biometric Systems: A Signal Processing Perspective
185
2. Nixon, M.S., Tan, T.N., Chellappa, R.: Human Identification Based on Gait. Springer, Heidelberg (2006)
3. Middleton, L., Buss, A.A., Bazin, A., Nixon, M.S.: A floor sensor system for gait recognition.
In: Fourth IEEE Workshop on Automatic Identification Advanced Technologies (AutoID
2005), pp. 171176 (2005)
4. Suutala, J., Roning, J.: Towards the adaptive identification of walkers: Automated feature selection of footsteps using distinction sensitive LVQ. In: Int. Workshop on Processing Sensory
Information for Proactive Systems (PSIPS 2004), June 14-15 (2004)
5. Jenkins, J., Ellis, C.S.: Using ground reaction forces from gait analysis: Body mass as a weak
biometric. In: International Conference on Pervasive Computing (2007)
6. Morris, S.J.: A shoe-integrated sensor system for wireless gait analysis and real-time therapeutic feedback. PhD thesis, Harvard UniversityMIT Division of Health Sciences and Technology (2004), http://hdl.handle.net/1721.1/28601
7. Ailisto, H.J., Lindholm, M., Mantyjarvi, J., Vildjiounaite, E., Makela, S.-M.: Identifying people from gait pattern with accelerometers. In: Proceedings of SPIE. Biometric Technology
for Human Identification II, vol. 5779, pp. 714 (2005)
8. Gafurov, D., Snekkenes, E.: Arm swing as a weak biometric for unobtrusive user authentication. In: IEEE International Conference on Intelligent Information Hiding and Multimedia
Signal Processing (2008)
9. Pan, G., Zhang, Y., Wu, Z.: Accelerometer-based gait recognition via voting by signature
points. Electronics Letters (2009)
10. Sprager, S., Zazula, D.: Gait identification using cumulants of accelerometer data. In: 2nd
WSEAS International Conference on Sensors, and Signals and Visualization, Imaging and
Simulation and Materials Science (2009)
11. Alvarez, D., Gonzalez, R.C., Lopez, A., Alvarez, J.C.: Comparison of step length estimators
from weareable accelerometer devices. In: 28th Annual International Conference of the IEEE
on Engineering in Medicine and Biology Society (EMBS), pp. 59645967 (August 2006)
12. Gafurov, D., Snekkenes, E., Bours, P.: Spoof attacks on gait authentication system. IEEE
Transactions on Information Forensics and Security 2(3) (2007); Special Issue on Human
Detection and Recognition
13. Phillips, P.J., Moon, H., Rizvi, S.A., Rauss, P.J.: The FERET evaluation methodology for face-recognition algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(10), 1090–1104 (2000)
14. Vildjiounaite, E., Makela, S.-M., Lindholm, M., Riihimaki, R., Kyllonen, V., Mantyjarvi, J., Ailisto, H.: Unobtrusive multimodal biometrics for ensuring privacy and information security with personal devices. In: Fishkin, K.P., Schiele, B., Nixon, P., Quigley, A. (eds.) PERVASIVE 2006. LNCS, vol. 3968, pp. 187–201. Springer, Heidelberg (2006)
15. Bufu, H., Chen, M., Huang, P., Xu, Y.: Gait modeling for human identification. In: IEEE International Conference on Robotics and Automation (2007)
16. Yamakawa, T., Taniguchi, K., Asari, K., Kobashi, S., Hata, Y.: Biometric personal identification based on gait pattern using both feet pressure change. In: World Automation Congress
(2008)
17. Gafurov, D., Snekkenes, E.: Towards understanding the uniqueness of gait biometric. In: IEEE International Conference on Automatic Face and Gesture Recognition (2008)
18. Gafurov, D., Snekkenes, E., Bours, P.: Gait authentication and identification using wearable accelerometer sensor. In: 5th IEEE Workshop on Automatic Identification Advanced Technologies (AutoID), Alghero, Italy, June 7-8, pp. 220–225 (2007)
19. Mantyjarvi, J., Lindholm, M., Vildjiounaite, E., Makela, S.-M., Ailisto, H.J.: Identifying
users of portable devices from gait pattern with accelerometers. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (2005)
20. Rong, L., Zhiguo, D., Jianzhong, Z., Ming, L.: Identification of individual walking patterns
using gait acceleration. In: 1st International Conference on Bioinformatics and Biomedical
Engineering (2007)
21. Rong, L., Jianzhong, Z., Ming, L., Xiangfeng, H.: A wearable acceleration sensor system
for gait recognition. In: 2nd IEEE Conference on Industrial Electronics and Applications
(ICIEA) (2007)
22. Bours, P., Shrestha, R.: Eigenstep: A giant leap for gait recognition. In: 2nd International
Workshop on Security and Communication Networks (2010)
Abstract. This paper presents a robust and efficient face recognition technique. It consists of six major steps. Initially, the proposed technique extracts salient facial landmarks (eyes, mouth, and nose) automatically from each face image. The SIFT descriptor is used to determine all facial keypoints from each landmark. These feature points are used to build a relaxation graph. Given two relaxation graphs from a pair of faces, matching scores between corresponding feature points are determined with the help of iterative graph relaxation cycles. Dempster-Shafer decision theory is used to fuse all these matching scores into the final decision. The proposed technique has been tested against three databases, namely the FERET, ORL and IITK face databases. The experimental results exhibit the robustness of the proposed face recognition system.
Keywords: Face recognition, SIFT features, Graph relaxation, Dempster-Shafer
theory.
1 Introduction
Face recognition can be considered one of the most dynamic and complex research areas in machine vision and pattern recognition [1, 2] due to the variable appearance of face images. Changes in appearance may occur due to many factors, such as the compatibility complexity of facial attributes, the motion of face parts, facial expression, pose, illumination and partial occlusion. As a result, the face recognition problem becomes ill-posed.
There exist several techniques to find similar faces with identical characteristics from a set of faces and to accelerate the face matching strategy. These techniques can broadly
$$v_i^R = \arg\max_{j,\; v_j^Q \in V^Q} P\left(v_j^Q \mid K^R, R, K^Q, Q\right) \qquad (1)$$
For efficient searching of matching probabilities from the query sample, we use a relaxation technique that simplifies the solution of the matching problem. Let $P_{ij}^v$ denote the matching probability for vertices $v_i^R \in V^R$ and $v_j^Q \in V^Q$. Now, by reformulating Equation (1) one gets
$$v_i^R = \arg\max_{j,\; v_j^Q \in V^Q} P_{ij}^v \qquad (2)$$
Equation (2) can be considered an iterative relaxation algorithm for searching the best label for $v_i^R$. This can be achieved by assigning a prior probability $\bar{P}_{ij}^v$ proportional to $s_{ij}^v$:
$$P_{ij}^v = \frac{\bar{P}_{ij}^v \cdot Q_{ij}}{\sum_{j,\; v_j^Q \in V^Q} \bar{P}_{ij}^v \cdot Q_{ij}} \qquad (3)$$

$$Q_{ij} = \sum_{v_i \in V^R} \sum_{v_j \in V^Q} e_{ij} \cdot \bar{P}_{ij}^v \qquad (4)$$
In Equation (4), $Q_{ij}$ conveys the support of the neighboring vertices and $P_{ij}^v$ represents the posterior probability. The relaxation cycles are repeated until the difference between the prior probability $\bar{P}_{ij}^v$ and the posterior probability $P_{ij}^v$ becomes smaller than a certain threshold; only when this condition holds is the relaxation process assumed to be stable.
Hence, the best-match graph for the query sample is established by using the posterior probabilities in Equation (3).
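For concreteness, the update loop behind Equations (2)–(4) can be sketched as follows. This is a minimal Python illustration, not the paper's implementation; in particular, the `support` function below is a toy stand-in for the neighbourhood-support term of Equation (4):

```python
import numpy as np

def relax(prior, support, n_iter=50, eps=1e-4):
    """Iterative probabilistic relaxation in the spirit of Equations (2)-(4):
    priors are reweighted by neighbourhood support and renormalized per
    reference vertex until the update becomes stable."""
    P = prior.copy()
    for _ in range(n_iter):
        weighted = prior * support(P)                           # Pbar_ij * Q_ij
        post = weighted / weighted.sum(axis=1, keepdims=True)   # Eq. (3)
        if np.abs(post - P).max() < eps:                        # stability check
            return post
        P = post
    return P

support = lambda P: 0.5 + P        # toy stand-in for the support term of Eq. (4)
prior = np.random.dirichlet(np.ones(4), size=3)  # 3 reference vs. 4 query vertices
posterior = relax(prior, support)
best_labels = posterior.argmax(axis=1)           # Eq. (2): best match per v_i^R
```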
Let $m(C_{\text{left eye}})$, $m(C_{\text{nose}})$, $m(C_{\text{right eye}})$ and $m(C_{\text{mouth}})$ be the individual matching scores obtained from the four different matchings of salient facial landmarks. Now, in order to obtain the combined matching score determined from the four salient landmark pairs, Dempster's combination rule is applied. First, we combine the matching scores obtained from the pair of left-eye and nose landmark features; next, we combine the matching scores obtained from the pair of right-eye and mouth landmark features; finally, we combine the matching scores determined from the first two combinations:

$$m(C_1) = m(C_{\text{left eye}}) \oplus m(C_{\text{nose}}) = \frac{\sum_{A \cap B = C_1} m(C_{\text{left eye}})(A)\, m(C_{\text{nose}})(B)}{1 - \sum_{A \cap B = \emptyset} m(C_{\text{left eye}})(A)\, m(C_{\text{nose}})(B)} \qquad (5)$$

$$m(C_2) = m(C_{\text{right eye}}) \oplus m(C_{\text{mouth}}) = \frac{\sum_{A \cap B = C_2} m(C_{\text{right eye}})(A)\, m(C_{\text{mouth}})(B)}{1 - \sum_{A \cap B = \emptyset} m(C_{\text{right eye}})(A)\, m(C_{\text{mouth}})(B)} \qquad (6)$$

$$m(C) = m(C_1) \oplus m(C_2) = \frac{\sum_{A \cap B = C} m(C_1)(A)\, m(C_2)(B)}{1 - \sum_{A \cap B = \emptyset} m(C_1)(A)\, m(C_2)(B)} \qquad (7)$$

The denominators in Equations (5), (6) and (7) are the normalizing factors, which redistribute the belief probability assignments after discarding the conflicting mass. The fused matching score is therefore

$$m(FMS) = m(C_1) \oplus m(C_2) \qquad (8)$$

where $\oplus$ denotes the Dempster combination rule. The final decision of user acceptance or rejection is established by the following rule, applying a threshold:

$$\text{decision} = \begin{cases} \text{accept}, & \text{if } m(FMS) \geq \text{threshold} \\ \text{reject}, & \text{otherwise} \end{cases} \qquad (9)$$
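As an illustration of how the fusion in Equations (5)–(9) operates, the following minimal Python sketch implements Dempster's rule of combination for two basic belief assignments. The mass values and the two-hypothesis frame are made-up examples, not the paper's actual scores:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two basic belief assignments,
    given as dicts mapping frozenset hypotheses to masses in [0, 1]."""
    combined, conflict = {}, 0.0
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        C = A & B
        if C:
            combined[C] = combined.get(C, 0.0) + a * b
        else:
            conflict += a * b                # mass assigned to the empty set
    k = 1.0 - conflict                       # the normalizing denominator
    return {C: v / k for C, v in combined.items()}

# Made-up landmark matching scores expressed as masses over {genuine}
# versus the full frame {genuine, impostor}; not the paper's actual values.
G = frozenset({"genuine"})
U = frozenset({"genuine", "impostor"})
m_c1 = dempster_combine({G: 0.7, U: 0.3}, {G: 0.6, U: 0.4})  # left eye + nose
m_c2 = dempster_combine({G: 0.8, U: 0.2}, {G: 0.5, U: 0.5})  # right eye + mouth
m_final = dempster_combine(m_c1, m_c2)
print(m_final[G])      # fused score m(FMS); accept if above the threshold
```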
6 Experimental Results
To investigate the effectiveness and robustness of the proposed graph-based face matching strategy, experiments have been carried out on three face databases, namely the FERET [24], ORL [11] and IITK face databases.
6.1 Evaluation with FERET Face Database
The FERET face recognition database [24] is a collection of face images acquired by NIST. For this evaluation, 1,396 face images are considered as the training dataset, of which 200 images are labeled as bk. For the query set we have considered 1,195 images that are labeled as fafb. All these images have been downscaled to 140 × 100 from the original size of 150 × 130. For testing purposes, we take the fa-labeled dataset of 1,195 images and the duplicate I dataset of 722 face images as the probe set. Prior to processing, the faces are well registered to each other and background effects are eliminated. Moreover, only frontal-view face images are used: those with natural facial expressions (fa) and those taken under different lighting conditions.
The results obtained from the FERET dataset are given in the corresponding table; the recognition accuracy of the proposed system when tested on the FERET dataset is found to be 92.34%. Consequently, the proposed method proves to be appropriate under changing illumination and facial expression. In addition, the use of invariant SIFT features along with the graph relaxation topology makes this system truly robust and efficient.
6.2 Evaluation with IIT Kanpur Database
The IITK face database consists of 1,200 face images with four images per person (300 × 4). These images are captured under a controlled environment with up to 20 degrees of head-pose change, almost uniform lighting and illumination conditions, and almost consistent facial expressions. For the face matching, all probe images are matched against all target images.
From the ROC curve in Fig. 2 it has been observed that the recognition accuracy is 93.63%, with a false accept rate (FAR) of 5.82%.
Fig. 2. ROC curves of the proposed face matching strategy
(smile/not smile, open/closed eyes). The original resolution of the images is 92 × 112 pixels. However, for the experiment the resolution is set to 120 × 160 pixels.
From the ROC curve in Fig. 2 it has been observed that the recognition accuracy for the ORL database is 97.33%, with a FAR of about 2.14%. The relative accuracy of the proposed matching strategy on the ORL database increases by about 3% and 5% over the IITK database and the FERET database, respectively.
6.4 Comparison with Other Techniques
In order to verify the effectiveness of the proposed face matching algorithm for recognition and identification, we have compared it with the algorithms discussed in [6, 13, 14, 15]. It has been observed that the proposed algorithm is completely different from the algorithms discussed in [6, 13, 14, 15] in terms of performance and design issues. In [13], a PCA approach is discussed for different views of face images without transformation, and the algorithm achieved 90% recognition accuracy for some specific views of faces. On the other hand, [14] and [15] use Gabor jets for face processing and recognition, where the algorithm in [14] makes use of the Gabor jets without transformation and the latter uses the Gabor jets with geometrical transformation. Both techniques have been tested on the Bochum and FERET databases [24]. These databases are characteristically different from the IITK and ORL face databases [11], and the maximum achieved recognition rates are 94% and 96%, respectively. Further, the graph-based face recognition technique drawn on SIFT features in [6] considers the whole face, whereas the proposed face recognition algorithm not only determines keypoints from the local landmarks, but also combines the local features for robust performance.
7 Conclusion
In this paper, an efficient and robust face recognition technique that considers facial landmarks and uses probabilistic graph relaxation drawn on SIFT feature points has been proposed. During the face recognition process, human faces are characterized on the basis of local salient landmark features. It has been determined that when face matching is accomplished with the whole face region, the global features (whole face) are easy to capture but are generally less discriminative than localized features. In the proposed face recognition method, local facial landmarks are therefore considered for further processing. The optimal face representation using graph relaxation drawn on local landmarks then allows matching the localized facial features efficiently by searching the correspondence of keypoints using iterative relaxation.
References
1. Shakhnarovich, G., Moghaddam, B.: Face Recognition in Subspaces. In: Li, S., Jain, A. (eds.) Handbook of Face Recognition, pp. 141–168. Springer, Heidelberg (2004)
2. Shakhnarovich, G., Fisher, J.W., Darrell, T.: Face Recognition from Long-term Observations. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002. LNCS, vol. 2352, pp. 851–865. Springer, Heidelberg (2002)
3. Wiskott, L., Fellous, J., Kruger, N., Malsburg, C.: Face Recognition by Elastic Bunch Graph Matching. IEEE Trans. on Pattern Analysis and Machine Intelligence 19, 775–779 (1997)
4. Zhang, G., Huang, X., Wang, S.L.Y., Wu, X.: Boosting Local Binary Pattern (LBP)-based Face Recognition. In: Li, S.Z., Lai, J.-H., Tan, T., Feng, G.-C., Wang, Y. (eds.) SINOBIOMETRICS 2004. LNCS, vol. 3338, pp. 179–186. Springer, Heidelberg (2004)
5. Heusch, G., Rodriguez, Y., Marcel, S.: Local Binary Patterns as An Image Preprocessing
for Face Authentication. In: IDIAP-RR 76, IDIAP (2005)
6. Kisku, D.R., Rattani, A., Grosso, E., Tistarelli, M.: Face Identification by SIFT-based Complete Graph Topology. In: IEEE Workshop on Automatic Identification Advanced Technologies, pp. 63–68 (2007)
7. Lowe, D.: Distinctive Image Features from Scale-invariant Keypoints. International Journal of Computer Vision 60(2), 91–110 (2004)
8. Smeraldi, F., Capdevielle, N., Bigun, J.: Facial Features Detection by Saccadic Exploration of the Gabor Decomposition and Support Vector Machines. In: 11th Scandinavian Conference on Image Analysis, pp. 39–44 (1999)
9. Gourier, N., James, D.H., Crowley, L.: Estimating Face Orientation from Robust Detection
of Salient Facial Structures. In: FG Net Workshop on Visual Observation of Deictic
Gestures (2004)
10. Bauer, M.: Approximation Algorithms and Decision-making in the Dempster-Shafer Theory of Evidence: An Empirical Study. International Journal of Approximate Reasoning 17, 217–237 (1997)
11. Samaria, F., Harter, A.: Parameterization of a Stochastic Model for Human Face
Identification. In: IEEE Workshop on Applications of Computer Vision (1994)
12. Yaghi, H., Krim, H.: Probabilistic Graph Matching by Canonical Decomposition. In: IEEE International Conference on Image Processing, pp. 2368–2371 (2008)
13. Moghaddam, B., Pentland, A.: Face Recognition using View-based and Modular
Eigenspaces. In: SPIE Conf. on Automatic Systems for the Identification and Inspection of
Humans. SPIE, vol. 2277, p. 12 (1994)
14. Wiskott, L., Fellous, J.M., Kruger, N., von der Malsburg, C.: Face Recognition by Elastic Bunch Graph Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(7), 775–779 (1997)
15. Maurer, T., von der Malsburg, C.: Linear Feature Transformations to Recognize Faces Rotated in Depth. In: International Conference on Artificial Neural Networks, pp. 353–358 (1995)
16. Lowe, D.G.: Object Recognition from Local Scale-invariant Features. In: International Conference on Computer Vision, pp. 1150–1157 (1999)
17. Wilson, N.: Algorithms for Dempster-Shafer Theory. Oxford Brookes University (1999)
18. Barnett, J.A.: Computational Methods for a Mathematical Theory of Evidence. In: IJCAI, pp. 868–875 (1981)
19. Bauer, M.: Approximation Algorithms and Decision-making in the Dempster-Shafer Theory of Evidence: An Empirical Study. International Journal of Approximate Reasoning 17, 217–237 (1997)
20. Park, U., Pankanti, S., Jain, A.K.: Fingerprint Verification using SIFT Features. In: Vijaya Kumar, B.V.K., Prabhakar, S., Ross, A. (eds.) Biometric Technology for Human Identification V. SPIE, vol. 6944, 69440K (2008)
21. Faggian, N., Paplinski, A., Chin, T.-J.: Face Recognition from Video using Active Appearance Model Segmentation. In: International Conference on Pattern Recognition, pp. 287–290 (2006)
22. Ivan, P.: Active Appearance Models for Face Recognition. Technical Report, Vrije
Universiteit of Amsterdam (2007),
http://www.few.vu.nl/~sbhulai/theses/werkstuk-ivan.pdf
23. Cootes, T.F., Taylor, C.J.: Active Shape Models. In: 3rd British Machine Vision Conference, pp. 266–275 (1992)
24. Phillips, P.J., Moon, H., Rauss, P.J., Rizvi, S.: The FERET Evaluation Methodology for Face Recognition Algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(10), 1090–1104 (2000)
Introduction
In present times, security has become a critical issue in automated authentication systems. Biometrics is the science of identifying a person using their physiological or behavioral characteristics. Biometric traits are difficult to counterfeit and hence result in higher accuracy compared to other methods such as passwords and ID cards. A human physiological and/or behavioral characteristic can be used as a biometric characteristic when it satisfies requirements like Universality, Distinctiveness, Permanence and Collectability. Moreover, one needs to focus on some major issues like Performance, Acceptability and Circumvention [1]. Keeping all these requirements in mind, biometric traits like fingerprints, hand geometry, handwritten signatures [2], retinal patterns, facial images, ear patterns [3], etc., are used extensively in areas that require security access.
Related Work
A brief survey of the related work in the area of identification using ECG waves and the significance of Radon transforms is presented in this section. Biel et al. [7] showed that it is possible to identify individuals based on an ECG signal. The initial work on heartbeat biometric recognition used a standard 12-lead electrocardiogram for recording the data; later, a single-lead ECG was used. Biel et al. [7] used 30 features, like P wave onset, P wave duration, QRS wave onset, etc., for each person. The method SIMCA (Soft Independent Modeling of Class Analogy) [8] was then used to classify persons. Another approach, proposed by Shen et al. [9], uses template matching and a Decision-Based Neural Network (DBNN). The next promising technique for human identification using ECG was from Singh et al. [10]; in their approach, a QRS complex delineator was implemented. All these works address human identification but not authentication. Moreover, the previous works used geometrical features, which tend to be error-prone, as a minute change in features like angle might be ignored during approximation and/or normalization.
Considering a graph of ECG wave signals as an image is a new approach. Swamy et al. [11] showed that ECG wave images are more adaptable, and image processing techniques can be applied to an ECG wave image instead of the raw ECG signals.
The Radon transform holds a distinguished place in the field of biometrics. It is well known for its wide range of imaging applications, and plays a key role in the study of various biometric traits like thyroid tissue [12], face recognition [13], gait recognition [14], iris recognition [15], etc. The Radon transform function is used to detect features within a two-dimensional image. This function can be used to return either the Radon transform, which transforms lines through an image to points in the Radon domain, or the Radon backprojection, where each point in the Radon domain is transformed to a straight line in the image. In this paper, we propose a technique for human authentication by applying the Radon transform to an ECG wave image. The obtained Radon backprojection image is then used to find a feature vector through standardized Euclidean pairwise distances. Authentication is then done based on the correlation coefficient between two feature vectors.
Every individual in an organization that adopts the proposed system for authentication is given an identification number (ID). Keeping a database of the ECG wave signals of all the people in an organization consumes more space and increases system complexity, so we suggest storing only the calculated ECG features against each ID. First, we acquire ECG waves from a 10-second sample of every person. Converting the ECG wave signal into an image improves adaptability, and iterative image processing techniques can then easily be applied [11]. So, the acquired ECG wave is converted into a gray-scale image. Pre-processing techniques are applied to remove possible noise introduced during image conversion. The pre-processed image then undergoes the Radon transform to generate a Radon feature image. The pairwise distance between the columns of the Radon image is computed and the feature vector is generated. This feature vector is stored in the database against a particular ID. During authentication, the person provides his ID and his ECG is captured. The feature vector is generated for the new ECG wave image, and the correlation coefficient between the two feature vectors is used to decide the authenticity of that person.
The architectural diagram for the proposed algorithm is shown in Fig. 1. The steps involved in the proposed technique are explained hereunder.
3.1 Data Acquisition
The proposed algorithm uses one-lead ECG waves. The ECG signals for a specific time duration are captured from a person and plotted as a graph. The plotted graph is then converted into an image for further processing. Fig. 2 shows the ECG wave sample for one of the subjects.
3.2 Pre-processing
The converted ECG image is in RGB format and is first converted into a gray-scale image. Morphological operations like erosion and dilation are applied to the gray-scale image to improve its intensity. Then we apply a median filter, a well-known order-statistic filter, to the image. The image after pre-processing is shown in Fig. 3.
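A minimal sketch of this pre-processing chain in Python, assuming SciPy's ndimage morphology; the 3 × 3 structuring sizes are assumptions, since the paper does not state its parameters:

```python
import numpy as np
from scipy import ndimage

def preprocess(ecg_rgb):
    """Pre-processing of Section 3.2: RGB -> gray-scale conversion,
    morphological erosion and dilation to improve intensity, then a
    median (order-statistic) filter. Sizes are assumed, not the paper's."""
    gray = ecg_rgb.mean(axis=2)                      # simple RGB -> gray-scale
    eroded = ndimage.grey_erosion(gray, size=(3, 3))
    dilated = ndimage.grey_dilation(eroded, size=(3, 3))
    return ndimage.median_filter(dilated, size=3)

ecg_image = np.random.rand(150, 300, 3)   # stand-in for a converted ECG image
clean = preprocess(ecg_image)
```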
3.3 Radon Transform
The Radon transform of an image $f(x, y)$ is given by

$$R(\rho, \theta) = \iint f(x, y)\, \delta(\rho - x\cos\theta - y\sin\theta)\, dx\, dy$$

Here, $\rho$ is the perpendicular distance of a line from the origin and $\theta$ is the angle formed by the distance vector. In the proposed technique we have taken $\theta$ to vary from 0° to 180°. The Radon transform is applied to the pre-processed ECG image; the resulting Radon image is shown in Fig. 4.
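Using scikit-image's radon function, this step might look as follows; the placeholder image and the 1-degree angular step are assumptions:

```python
import numpy as np
from skimage.transform import radon

# Project the pre-processed image at angles from 0 to 180 degrees; each
# column of the result is one projection.
preprocessed = np.random.rand(150, 300)       # stand-in for the cleaned image
theta = np.linspace(0.0, 180.0, num=180, endpoint=False)
radon_image = radon(preprocessed, theta=theta, circle=False)
```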
3.4 Feature Matching
For authentication purposes, we use Karl Pearson's correlation coefficient. Correlation is a method of identifying the degree of relationship between two sets of values; it reveals the dependency or independency between the variables. If X and Y are two vectors of size m × n, then Karl Pearson's correlation coefficient between X and Y is computed using the formula
between X and Y is computed using the formula
m n
Y Y
i=1
j=1 X X
XY =
2 m n
2
m n
i=1
j=1 X X
i=1
j=1 Y Y
Here, $\bar{X}$ and $\bar{Y}$ respectively denote the means of vectors X and Y. The value of the correlation coefficient $\rho_{XY}$ may range from −1 to +1. If the value of the correlation coefficient is −1, the vectors X and Y are inversely related; if the value is 0, the vectors are independent; and if the value is +1, the vectors are completely (positively, directly) related. Thus, a high degree of positive correlation indicates that the values of the vectors are very close to each other. So, if the correlation coefficient between the feature vector of the database image and that of the new image is near +1, the person can be authenticated. Otherwise, the person is rejected.
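Combining the pairwise-distance feature vector with the correlation test, an end-to-end sketch might read as follows. Treating the Radon image's columns as observations is an assumption, and the placeholder arrays stand in for real enrolment and probe data:

```python
import numpy as np
from scipy.spatial.distance import pdist

def feature_vector(radon_image):
    """Standardized Euclidean distances between every pair of columns of
    the Radon image, flattened into a 1-D feature vector."""
    return pdist(radon_image.T, metric='seuclidean')

def pearson(x, y):
    """Karl Pearson's correlation coefficient between two vectors."""
    return np.corrcoef(x, y)[0, 1]

radon_image = np.random.rand(256, 180)     # placeholder Radon image
enrolled = feature_vector(radon_image)     # stored against the user's ID
probe = feature_vector(radon_image + 0.01 * np.random.rand(*radon_image.shape))
decision = "accept" if pearson(enrolled, probe) > 0.90 else "reject"
```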
3.5 Authentication
3. Now, the ECG image undergoes pre-processing to remove possible noise and to increase the intensity.
4. The Radon transform is applied to the pre-processed ECG image to get a Radon image. Then the pairwise distance is calculated for this image to get a feature vector.
5. The correlation coefficient between the feature vector retrieved from the database and the newly created feature vector is computed as explained in Section 3.4.
6. If the computed correlation coefficient is greater than a threshold value of 0.90, the person is authenticated. Otherwise he is rejected.
4 Algorithm
4.1 Problem Definition
4.2 Algorithms
Four major functions are involved in the proposed technique. The first function is for pre-processing the ECG image. The next function applies the Radon transform to get a Radon image. The third function computes the standardized Euclidean distance for each pair of rows in the Radon image. The last function computes the correlation coefficient to check the authenticity of a person. The algorithm for computing the correlation coefficient is given in Table 1.
Table 1. Algorithm for Calculating Correlation Coefficient

//Input: The feature vectors X and Y.
//Output: The correlation coefficient between X and Y.
begin
  Initialize SumX = 0, SumY = 0, SumSqX = 0, SumSqY = 0 and SumXY = 0
  for i = 1 to rows
    for j = 1 to columns
      set SumX = SumX + X(i, j)
      set SumY = SumY + Y(i, j)
      set SumSqX = SumSqX + X(i, j)^2
      set SumSqY = SumSqY + Y(i, j)^2
      set SumXY = SumXY + X(i, j) * Y(i, j)
    end for
  end for
  AvgX = SumX / (rows * cols)
  AvgY = SumY / (rows * cols)
  EXY = SumXY / (rows * cols)
  StdX = sqrt(SumSqX / (rows * cols) - AvgX^2)
  StdY = sqrt(SumSqY / (rows * cols) - AvgY^2)
  Corr = (EXY - AvgX * AvgY) / (StdX * StdY)
end
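A direct, runnable Python transcription of Table 1 might read as follows (vectorized with NumPy; X and Y are assumed to be equally sized 2-D feature arrays):

```python
import numpy as np

def correlation_coefficient(X, Y):
    """Vectorized transcription of Table 1: population-style Pearson
    correlation between two equally sized 2-D arrays."""
    n = X.size                              # rows * cols
    avg_x, avg_y = X.sum() / n, Y.sum() / n
    exy = (X * Y).sum() / n                 # E[XY]
    std_x = np.sqrt((X ** 2).sum() / n - avg_x ** 2)
    std_y = np.sqrt((Y ** 2).sum() / n - avg_y ** 2)
    return (exy - avg_x * avg_y) / (std_x * std_y)
```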
Table 2. Correlation coefficients between the stored feature vector and those of sample test images I1–I4, with the resulting decisions: 0.9983 (Accept), 0.1165 (Reject), 0.1204 (Reject), 0.9106 (Accept), 0.2317 (Reject)
The difficulty of mass storage of ECG data is overcome by storing only the features of the Radon image. Since the Radon transform is used with angles varying from 0° to 180°, projection in all directions has been considered, which improves the performance of the system.
The results obtained for some of the subjects from the Physionet QT database are given in Table 2. The simulation results on the ECG waves of 105 individuals resulted in the confusion matrix shown in Table 3. From the confusion matrix, the false acceptance ratio is found to be 3.19% and the false rejection ratio 0.128%, and the overall performance of the system is found to be 99.85%.
Table 3. Confusion Matrix

                          Actual
  Tested          Genuine    Non-Genuine
  Genuine            91            3
  Non-Genuine        14        10917
Conclusions
In this paper, we propose an efficient way of performing human authentication using ECG images. The proposed technique uses the Radon transform and standardized Euclidean distance to find image features in an ECG image. The features are calculated from the ECG wave of a specific time interval taken from the person who undergoes the authentication process. To infer whether the newly extracted features match the features stored in the database for a particular ID, we compute Karl Pearson's correlation coefficient. The proposed technique is easily adoptable in real-time situations as it is based on image processing techniques rather than signals. The computational complexity found in previous works for extracting geometrical features from wave signals is reduced by considering Radon image features. The proposed algorithm uses single-lead ECG signals for imaging and authentication; even a palm-held ECG device is sufficient to acquire the data. Hence the proposed approach is suitable for real-time application in an organization. The proposed algorithm produced promising results with a FAR of 3.19% and an FRR of 0.128%, and the overall performance of the system is found to be 99.85%.
References
1. Jain, A.K., Ross, A., Prabhakar, S.: An Introduction to Biometric Recognition. IEEE Trans. on Circuits and Systems for Video Technology 14(1), 4–20 (2004)
2. Hegde, C., Manu, S., Deepa Shenoy, P., Venugopal, K.R., Patnaik, L.M.: Secure Authentication using Image Processing and Visual Cryptography for Banking Applications. In: Proc. Int. Conf. on Advanced Computing (ADCOM 2008), pp. 65–72 (December 2008)
3. Hegde, C., Srinath, U.S., Aravind Kumar, R., Rashmi, D.R., Sathish, S., Deepa Shenoy, P., Venugopal, K.R., Patnaik, L.M.: Ear Pattern Recognition using Centroids and Cross-Points for Robust Authentication. In: Proc. Second Int. Conf. on Intelligent Human and Computer Interaction (IHCI 2010), pp. 378–384 (2010)
4. Hegde, C., Rahul Prabhu, H., Sagar, D.S., Vishnu Prasad, K., Deepa Shenoy, P.,
Venugopal, K.R., Patnaik, L.M.: Authentication of Damaged Hand Vein Patterns
by Modularization. In: Proc. of IEEE Region Ten Conference, TENCON 2009
(2009)
5. Su, F., Khalil, I., Hu, J.: ECG-Based Authentication. In: Handbook of Information and Communication Security, pp. 309–331. Springer, Heidelberg (2010)
6. Simon, B.P., Eswaran, C.: An ECG Classifier Designed using Modified Decision Based Neural Network. Computers and Biomedical Research 30, 257–272 (1997)
7. Biel, L., Pettersson, O., Philipson, L., Wide, P.: ECG Analysis: A New Approach in Human Identification. IEEE Trans. on Instrumentation and Measurement 50(3), 808–812 (2001)
8. Esbensen, K., Schonkopf, S., Midtgaard, T.: Multivariate Analysis in Practice, 1st edn., vol. 1 (1997)
9. Shen, T.W., Tompkins, W.J., Hu, Y.H.: One-Lead ECG for Identity Verification. In: Proc. of Second Joint Conf. of IEEE EMBS/BMES, pp. 62–63 (2002)
10. Singh, Y.N., Gupta, P.: Biometrics Method for Human Identification using Electrocardiogram. In: Tistarelli, M., Nixon, M.S. (eds.) ICB 2009. LNCS, vol. 5558, pp. 1270–1279. Springer, Heidelberg (2009)
11. Swamy, P., Jayaraman, S., Girish Chandra, M.: An Improved Method for Digital Time Series Signal Generation from Scanned ECG Records. In: Int. Conf. on Bioinformatics and Biomedical Technology (ICBBT), pp. 400–403 (2010)
12. Jose, C.R.S., Fred, A.L.N.: A Biometric Identification System based on Thyroid Tissue Echo-Morphology. In: Int. Joint Conf. on Biomedical Engineering Systems and Technologies, pp. 186–193 (2009)
13. Chen, B., Chandran, V.: Biometric Based Cryptographic Key Generation from Faces. In: Proc. of the 9th Biennial Conf. of the Australian Pattern Recognition Society on Digital Image Computing Techniques and Applications, pp. 394–401 (2007)
14. Boulgouris, N.V., Chi, Z.X.: Gait Recognition Using Radon Transform and Linear Discriminant Analysis. IEEE Trans. on Image Processing 16(3), 731–740 (2007)
15. Ariyapreechakul, P., Covavisaruch, N.: Personal Verification and Identification via Iris Pattern using Radon Transform. In: Proc. of First National Conf. on Computing and Information Technology, pp. 287–292 (2005)
16. Laguna, P., Mark, R.G., Goldberger, A.L., Moody, G.B.: A Database for Evaluation of Algorithms for Measurement of QT and Other Waveform Intervals in the ECG. Computers in Cardiology, 673–676 (1997)
Abstract. The dilemma of cyber communications insecurity has existed since the beginning of network communications. The problems and concerns of unauthorized access and hacking have existed from the introduction of World Wide Web communication and the Internet's expansion for popular use in the 1990s, and they remain among the most important issues to the present day. Wireless network security is no exception. Serious and continuous efforts of investigation, research and development have been going on for the last several decades to achieve the goal of 100 percent, foolproof security for all the protocols of networking architectures, including wireless networking. Some very reliable and robust strategies have been developed and deployed, which have made network communications more and more secure. However, the most desired goal of complete security has yet to see the light of day. The latest cyber-war scenario, reported in the media, of intrusion and hacking of each other's defense and secret agencies between the two superpowers, the USA and China, has further aggravated the situation. This sort of intrusion by hackers between other countries, such as India and Pakistan, or Israel and Middle East countries, has also been going on and is reported in the media frequently. The paper reviews and critically examines the strategies already in place for wired and wireless network security, and also suggests some directions and strategies for more robust aspects to be researched and deployed.
Keywords: Internet, Network Security, Wireless Network Security, Intrusion,
Hacking, Protocol.
1 Introduction
Wireless technology releases us from copper wires. These days a user can have a
notebook computer, PDA, Pocket PC, Tablet PC, or just a cell phone and stay online
anywhere a wireless signal is available. The basic theory behind wireless technology
is that signals can be carried by electromagnetic waves that are then transmitted to a
signal receiver. But to make two wireless devices understand each other, we need
protocols for communication.
We will discuss the current security problems with wireless networks and the options for dealing with them, and then present methods that can be used to secure wireless networks. However, it is important to mention the ground reality that all the vulnerabilities that exist in a conventional wired network apply to wireless technologies as well [1].
It will be appropriate to discuss some vital concepts about wireless networking [2]. It is easier to understand wireless infrastructures by categorizing them into three layers, as shown below: the device layer, the physical layer, and the application and service (protocol) layer.
Table 1. Different Layers of Wireless Technologies

Layer                      Technologies
Application and service    Wireless applications: WAP, i-mode, messaging, Voice over Wireless network, VoIP, location-based services
Physical                   Wireless standards: 802.11a, 802.11b, 802.11g, AX.25, 3G, CDPD, CDMA, GSM, GPRS, radio, microwave, laser, Bluetooth, 802.15, 802.16, IrDA
Device                     Mobile devices: PDAs, notebooks, cellular phones, pagers, handheld PCs, wearable computers
The device layer (mobile devices) contains gadgets ranging from the smallest cell phone to PDAs and notebook computers. These devices use wireless technologies to communicate with each other. The physical layer contains the different physical encoding mechanisms for wireless communications: Bluetooth, 802.11x, CDMA, GSM, and 3G are different standards that define different methods to physically encode data for transmission across the airwaves. We will focus on networks built upon the 802.11x and Bluetooth standards. The application and service layer, also referred to as ISO layers 2 to 7, contains the protocols that enable wireless devices to process data in an end-to-end manner. Protocols like the Wireless Application Protocol (WAP), Voice over IP (VoIP), and i-mode reside in this layer.
Many wireless networking security problems can be traced back to the end user, as in wired networks. Wireless networks are no exception, and it is typically the IT department's responsibility to protect the end user. Before an enterprise adopts the latest wireless network technologies, it will need to:
their invention, which was highly limited by the availability of both sunlight and good
weather. Similar to free space optical communication, the photo-phone also required a
clear line of sight between its transmitter and its receiver. It would be several decades
before the photo-phone's principles found their first practical applications in military
communications and later in fiber-optic communications.
WPAN: As the name "personal area network" suggests, such a network is small, with a range of about 10 meters (30 feet). Infrared Data Association (IrDA) and Bluetooth are the main WPAN wireless technologies; they exist in the physical layer. The devices that take advantage of a WPAN include PDAs, printers, cameras, cell phones, and access points, to name a few. Support for IrDA enables a user to transfer data between a computer and another IrDA-enabled device for data synchronization, file transfer, or device control. The speed of IrDA is up to 4 Mbps and the distance is usually less than 30 feet in an unobstructed line of sight.
Bluetooth uses radio waves to transmit data and therefore doesn't have the line-of-sight restrictions of IrDA. Bluetooth also supports data transmission rates of up to 3 Mbps (with Enhanced Data Rate) and uses the 2.4 GHz ISM band.
WLAN: The range of a wireless local area network (WLAN) is, of course, greater than that of a WPAN. For example, most 802.11b implementations will have a speed of 1 Mbps at a range of about 500 meters (1500 feet); with closer proximity to the access point (AP), speeds of up to 11 Mbps can be reached. Many systems support the IEEE 802.11b standard; this standard uses Direct Sequence Spread Spectrum (DSSS) to transmit the data in the 2.4 GHz ISM band (Industrial, Scientific and Medical band). Since this band is free for public use, other devices such as cordless phones can cause problems and interference.
3.1 Wireless Network Topology Concerns
The wireless network topology has some shortcomings, which are briefly outlined here; this will aid understanding of the types of intrusions.
Current intrusion detection solutions tend to rely on the relatively static and contained nature of wired networks. Potential 'wired' intruders would need to gain physical access somehow, either through an accessible network jack, or logically enter the network through well-defined pathways. Locating intrusion detection sensors was a matter of defining and inserting listeners in locations where all or most network traffic transited. These assumptions are no longer valid for wireless networks, where both approved and rogue APs can be located anywhere on a network.
The IEEE 802.11 standard defines several types of wireless network topologies. The Independent Basic Service Set (IBSS, or "ad hoc") topology involves two or more wireless stations communicating peer-to-peer (Figure 2). The Basic Service Set (BSS, or "infrastructure") topology (Figure 3) adds an AP attached to a "distribution system" (usually a network, like Ethernet, through which all wireless communications pass before reaching their destination).
Fig. 2. Independent Basic Service Set (ad hoc) topology

Fig. 3. Basic Service Set (infrastructure) topology
- Wireless stations are all independent nodes. Each node must be responsible for its own protection from attack and compromise. Compromising only one node, or introducing a malicious node, may affect the viability of the entire network, and an affected node could be used as a launching point for subsequent attacks.
- No central point exists from which to monitor all network traffic, as the network is distributed.
- Differences between normal and anomalous traffic patterns may be practically indistinguishable. The mobile nature of the wireless stations can make legitimate network traffic appear suspect.
Zhang and Lee propose an architecture in which all nodes act as independent IDS sensors, able to act independently and cooperatively. Events are generated from a local detection engine; if analysis of the events is inconclusive or requires more information, other networked local sensors can be utilized and consulted. Each independent sensor has six modules, three of which pertain to intrusion detection (a sketch follows this list):
1. Data collection: the types of raw data used include system and user activities, local communication activities, and "observable" communication activities.
2. Local detection: since it is difficult to maintain and distribute an anomalous-signature database, Zhang and Lee propose the definition of statistically "normal" activities specific to each node, which therefore reside locally on each node.
3. Cooperative detection: if the local detection engine does not have enough evidence to alert on a suspected problem, it can ask other nodes for assistance. Information describing the event is propagated to neighboring nodes, and evidence returned from neighboring nodes can then be used to create a new evaluation of the detected event.
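A minimal sketch of this three-way local/cooperative decision logic; the thresholds, the anomaly-score interface and the majority rule are illustrative assumptions, not Zhang and Lee's actual modules:

```python
def local_verdict(score, alert=0.9, clear=0.4):
    """Three-way local decision from an anomaly score in [0, 1]."""
    if score >= alert:
        return "intrusion"
    if score <= clear:
        return "normal"
    return "inconclusive"

def cooperative_detect(own_score, neighbor_scores, quorum=0.5):
    """If the local engine is inconclusive, consult neighboring sensors
    and alert when a quorum of them also sees anomalous activity."""
    verdict = local_verdict(own_score)
    if verdict != "inconclusive":
        return verdict
    votes = [local_verdict(s) == "intrusion" for s in neighbor_scores]
    return "intrusion" if sum(votes) / len(votes) >= quorum else "normal"

print(cooperative_detect(0.6, [0.95, 0.92, 0.3]))   # -> intrusion
```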
Station (STA)
Access point (AP)
Ad hoc and infrastructure modes are illustrated in the network blueprints. The ad hoc mode is equivalent to peer-to-peer networking: an ad hoc wireless network does not have an AP to bridge the STAs together, and every STA in an ad hoc wireless network can communicate directly with any other STA in the same network.
The infrastructure mode has at least one AP to form a BSS. If there are multiple APs, they form an extended service set (ESS). All traffic from or to an STA goes through the AP first. The AP in turn may be connected directly to another network, such as a wired intranet.
Almost every protocol set has some mechanism to protect the data, and the same is true for IEEE 802.11b. An encryption mechanism called Wired Equivalent Privacy (WEP) protects the data as it travels through the airwaves. Soon after the WEP encryption mechanism was released, however, Nikita Borisov, Ian Goldberg, and David Wagner proved in 2001 [10], [11], [12] that it is vulnerable to multiple forms of attack. WEP [10] uses the symmetric cryptography system RC4 with a user-specified key (64-bit or 128-bit) to protect the data. As a result, WEP alone is not enough to protect data.
The Trace
The passive way to identify an SSID is to sniff the network traffic [13] and look for three kinds of packets. The first is called a beacon: an AP sends out a beacon periodically, usually once every 100 milliseconds, and from this beacon the STA knows there is an AP available. The beacon can contain the SSID as part of its information. The second packet is the probe request and response, and the third is the association request and response. All of these packets contain an SSID to identify a BSS or IBSS nearby. As long as the hacker is within range, you basically cannot hide your wireless network. Some extreme methods do, of course, exist, such as surrounding the perimeter with metal or other substances that contain the wireless signals.
Passive Attack
This is the method of analyzing intercepted network traffic and extracting useful information from the collected raw data. The common tool used for this is a sniffer such as AiroPeek. Due to the physical properties of a wireless network, one can capture traffic at any location the signal reaches. Known as "parking lot attacks" or "war driving", these methods illustrate that hackers can perform traffic analysis from a car, either parked near the target building or simply driving the surrounding streets.
Clear Text Traffic
Probably the best scenario for a hacker, and the worst for the system administrator, is clear text traffic. If there is no protection on the data being transmitted over wireless, then an attacker can easily sniff the traffic and later perform protocol or data analysis to crack into an information gold mine: credit card information, passwords, and personal emails. If the data is not protected, then the odds are high that the rest of the network setup is also insecure.
5.1 Problems with WEP
If the wireless network has WEP enabled, the hacker's game is still not over. The following problems exist in the WEP algorithm [14], and can potentially be exploited.
Brute Force Attack
WEP makes use of the symmetric cryptography system RC4, with a shared secret key chosen by the user. The real key used to encrypt the data with the RC4 algorithm is generated by a pseudo-random number generator (PRNG). Flaws in the PRNG can cause the real key space to be less than 64 or 128 bits; the flaw actually reduces the key space of the 64-bit key to 22 bits. It is therefore possible for an attacker to collect enough information to try to discover the key offline.
Duplicate IV
The initialization vector (IV) is a 3-byte random number generated by the computer. It is combined with a key chosen by the user to generate the final key for WEP encryption/decryption. The IV is transmitted with the encrypted data in the packet, without any protection, so that the receiving end knows how to decrypt the traffic.
When a wireless network uses the same user-chosen key and a duplicate IV on multiple packets, it might be in trouble. The hacker knows that all packets with the same IV are encrypted with the same key, and can build a dictionary based on the packets collected. Because the RC4 cryptography system uses the XOR operation to encrypt the plaintext (user data) with the keystream, the hacker can find the possible values of the packets: the XOR of two plaintexts is the same as the XOR of the two ciphertexts encrypted with the same key. If the hacker can guess one of the plaintext packets, he can then decrypt the other packet encrypted with the same key.
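The danger of a reused keystream can be demonstrated in a few lines of Python; the plaintexts are invented, and os.urandom stands in for the RC4 output that WEP derives from the IV and key:

```python
import os

# Two packets encrypted under the same key and IV share one RC4 keystream,
# so XORing the ciphertexts cancels the keystream entirely.
xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))

p1 = b"PAY ALICE 100 DOLLARS"
p2 = b"PAY MALLORY 9 DOLLARS"
keystream = os.urandom(len(p1))          # stand-in for the RC4 keystream

c1, c2 = xor(p1, keystream), xor(p2, keystream)

# Without ever seeing the keystream, the attacker learns p1 XOR p2:
assert xor(c1, c2) == xor(p1, p2)
# Guessing (or injecting) p1 then decrypts p2 directly:
assert xor(xor(c1, c2), p1) == p2
```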
Chosen/Known Plaintext
On the other hand, if a hacker knows the plaintext and ciphertext of a WEP-protected packet, he can determine the encryption keystream for that packet. There are several methods of injecting known plaintext, including sending an email or generating ICMP (Internet Control Message Protocol) echo request traffic (ping). A hacker who knows the corporate intranet well could send an email into the network; knowing the contents of that email, the hacker could then capture the traffic on the wireless side, identify the packets related to the email, and recover the keystream, eventually building a dictionary for a real-time decryption attack.
Weakness in Key Generation
A weakness in the key scheduling of the RC4 algorithm [15] used in WEP permits a hacker who collects enough packets with IVs matching certain patterns to recover the user-chosen key from the IVs. Generally, a hacker needs to sniff millions of packets to get enough "interesting" IVs to recover the key, so it could take days, if not weeks, to crack a moderately used wireless network. This is a very basic attack, and several tools are available to do the sniffing and decoding for the hacker; AirSnort is the famous one. It runs on Linux and tries to break the key once enough useful packets are collected.
Bit-Manipulation Attack
WEP doesn't protect the integrity of the encrypted data. The RC4 cryptography system performs the XOR operation bit by bit, making WEP-protected packets vulnerable to bit-manipulation attacks: an attacker can modify any single bit of the traffic to disrupt the communication or cause other problems.
Authentication and Authorization
Once the hacker knows information such as the SSID of the network, the MAC addresses on the network, and maybe even the WEP key, he can try to establish an association with the AP. There are currently three ways to authenticate users before they can establish an association with the wireless network.
Open Authentication
Open authentication usually means the user only needs to provide the SSID or use the correct WEP key for the AP. It can be used together with other authentication methods, for example MAC address authentication. The problem with open authentication is
that if a user doesn't have other protection or authentication mechanisms in place, the wireless network is totally open, as the name indicates.
Shared Secret Authentication
The shared secret authentication mechanism is similar to a challenge-response authentication system. It is used when the STA shares the same WEP key with the AP: the STA sends a request to the AP, the AP sends back a challenge, and the STA replies with the encrypted challenge. The insecurity here is that the challenge is transmitted in clear text to the STA, so if a hacker captures both challenge and response, he can recover the keystream used to encrypt it.
5.2 Active Attacks and Denial of Service
Most of the security problems briefly described above relate to passive attacks; however, there are also active attacks and denial of service [16]. One of the most interesting attacks is to set up a fake access point, let valid users establish associations, and then collect information or perform a MITM (man-in-the-middle) attack. In a corporate environment, if a malicious employee installs a rogue AP on the corporate intranet, it is like creating a security hole: someone in the parking lot can easily hook on and surf the intranet. Another known attack is to steal the WEP key from a wireless network user's laptop. A tool called Lucent Orinoco Registry Encryption/Decryption [17] can break the encryption and extract the WEP key from the registry for Lucent Orinoco card users.
If an attacker has tried all the above methods and failed, the final choice might be a denial of service attack. Such attacks can bring down the wireless network and disrupt the service. We describe the attacks at two different levels: the physical level and the protocol level.
Physical Level
There are two ways to disrupt the service of a wireless network physically:
Protocol Level
A hacker can also disrupt service at the protocol level. If a hacker can build an association, then there must be a way to disassociate; if one can authenticate, then there must be a way to deauthenticate. Unfortunately, in the IEEE 802.11b standard both methods exist, and neither requires any authentication in the message. That means the hacker can send a disassociation or deauthentication message to an arbitrary wireless network user and disconnect them. This is a bad design from the protocol's perspective.
From the Wired Side
It should not be assumed that just because a hacker cannot get access to the wireless network, there are no other ways to attack it. Suppose one has a wireless network connected to the intranet and configured for only a few users. Another employee in the company discovers the AP from the internal network and accesses the management interfaces on the AP. He breaks in through a default SNMP community string and changes the configuration, so he is now part of the list of allowed users. This situation points out a problem: every AP has its own management interface(s). The most commonly used management interfaces are Telnet, FTP, web, and SNMP. If any of these management interfaces is not secured, there is always a chance that someone can take advantage of the default setup.
Change the SSID: A wireless STA uses the SSID to identify the wireless network. There is currently no way to hide it from a potential hacker. The only thing one can do is change the SSID so it doesn't make an immediate association with your company; for example, if you work for MIT, do not name the SSID "MIT_GM89." This technique is more obfuscation than protection.
Configure the AP correctly: Make sure to change the default SSID, default password, and SNMP community string, and close down all the management interfaces properly.
Do not depend on WEP: Use IPSec, VPN, SSH, or other substitutes for WEP. Do not use WEP alone to protect the data: it can be broken!
Adopt another authentication/authorization mechanism: Use 802.1x, VPN, or certificates to authenticate and authorize wireless network users. Using client certificates can make it nearly impossible for a hacker to gain access.
7 Conclusion
The paper briefly describes the foundations and vital aspects of wired networking (the Internet) in general, and wireless networking in particular, to highlight the vulnerabilities and industry concerns about the insecurity of wireless networks. The paper briefly describes wireless communication modes, the wireless network security problems currently found in different infrastructures, and wireless network insecurity features. It also suggests ways and means that can be effectively applied to reduce the chances of hacking and make wireless networks safer and more secure. However, the R&D industry experts of wireless communication and professional and academic researchers have not yet succeeded in finding a 100 percent safe and secure strategy for adoption.
This does not mean that the human effort and resources being spent to achieve the goal should be abandoned. The romance of R&D will continue, as in many other scientific and engineering fields. The end might not be in sight yet, but the industry is approaching the goal.
References
1. Karygiannis, T., Owens, L.: Wireless Network Security: 802.11, Bluetooth and Handheld
Devices, National Institute of Standards and Technology Gaithersburg, MD 20899-8930
(November 2002)
2. http://www.wirelesstutorials.info/wireless_networking.html
3. http://en.wikipedia.org/wiki/Photophone#World.27s_first_
wireless_telephone_communication_E2.80.93_April_1880
4. Stallings, W.: Wireless Communications and Networks. Prentice Hall, Englewood Cliffs
(August 2001)
5. Bidgoli, H.: The Handbook of Information Security. John Wiley & Sons, Inc., Chichester
(2005)
6. Wireless networks have had a significant impact on the world as far back as World War II.
Through the use of wireless networks,
http://wikipedia.org/wiki/Wireless_network
7. Wi-Fi Protected Access (WPA and WPA2) is a certification program developed by the
Wi-Fi Alliance to indicate compliance with the security protocol,
http://wikipedia.org/wiki/Wi-Fi_Protected_Access
8. Brenner, P.: A Technical Tutorial on the IEEE 802-11 Protocol, Director of Engineering,
BreezCom
9. Zhang, Y., Wenke, L.: Intrusion Detection in Wireless Ad-Hoc Networks. In: Proceedings
of the Sixth Annual International Conference on Mobile Computing and Networking
(2000); Arbaugh, W.A., Shankar, N., Justin Wan, Y.C.: Your 802.11 Wireless Network
has No Clothes. Department of Computer Science, University of Maryland, March 31
(2001)
10. Borisov, N., Goldberg, I., Wagner, D.: Intercepting Mobile Communications. In: Conference on Mobile Computing and Networking (2001), http://www.springerlink.com/index/cjut5dxd8r9tvrpe.pdf
11. Borisov, N., Goldberg, I., Wagner, D., Berkeley, U.C., Cox, J.: LAN Services Set to Go
Wireless, Network World, August 20 (2001); IEEE Working Group for WLAN Standards,
http://grouper.ieee.org/groups/802/11/index.html
12. Borisov, N.: Deploying Wireless LANs and Voice & Data. McGraw-Hill, New York
(2001), http://doi.ieeecomputersociety.org/10.1109/6294.977772
13. Bradley, T.: CISSP-ISSAP, Introduction to Packet Sniffing, former About.com Guide
14. Borisov, N., Goldberg, I., Wagnert, D.: Security of the WEP algorithm,
wep@isaac.cs.berkeley.edu
15. Paul, S., Preneel, B.: A new weakness in the RC4 keystream generator and an approach to improve the security of the cipher. In: Roy, B., Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp. 245–259. Springer, Heidelberg (2004)
16. Khan, S., et al.: Denial of Service Attacks and Challenges in Broadband Wireless
Networks. IJCSNS International Journal of Computer Science and Network Security 8(7)
(July 2008)
17. Ingeborn, A.: Lucent Orinoco Registry Encryption/Decryption, http://www.cqure.net/tools03.html
18. Tzu, S., Zi, S.: The Art of War. Dover Publications, New Paperback, 96 pages (2002)
ISBN 0486425576
Abstract. Sensor nodes are low-cost and hence vulnerable to attacks; adversaries need not endure much hardship to control these nodes. Malicious or compromised nodes are difficult to detect. These malicious nodes drain the available energy, give false readings, or project themselves as routers, thereby attracting all the packets and leading to denial of service attacks.
This paper tests the nodes for fidelity by building on one of the normal procedures used to initiate action among the normal nodes. The paper utilizes public key cryptographic methods and TDMA technology to accomplish the task.
Keywords: security, authentication, wireless sensor network, LEACH.
1 Introduction
Wireless sensor networking is one of the fields that has emerged in recent years in spite of many hindrances. Self-organization, rapid deployment and fault tolerance are some of its characteristics. Wireless sensor networks are utilized in many areas [1], like habitat monitoring, monitoring environmental conditions that affect crops and livestock, irrigation, macro-instruments for large-scale earth monitoring, planetary exploration, chemical/biological detection, battlefield surveillance, targeting, battle damage assessment and so on. These sensor networks can be utilized to sense movement, pressure, humidity and so on, and are deployed in harsh environments, working under high pressure at the bottom of the ocean, for example.
Security is one of the critical issues in such networks, and many types of security breaches are found in them. Broadly, two major types of security breach exist, namely internal attackers and external attackers. Internal attackers are the ones wherein the nodes have been compromised by the adversaries. This paper proposes a detective security solution, a better approach to identify the compromised nodes in the network and thereby make the network secure.
2 Related Work
LEACH [2] is one of the cluster-based algorithms. After the deployment of nodes, they send HELLO messages to recognize their neighborhood nodes. The node that sends the message with the highest signal strength is chosen as cluster head, and the nodes within a certain radius assemble to form the cluster. The nodes send the sensed data to their respective cluster head, where it is aggregated and forwarded to the base station. The nodes rotate the cluster head position depending on the remaining energy in the nodes.
HEED [3] is a cluster-based algorithm used in ad-hoc networks. The cluster head is chosen depending on many factors, like the residual energy in the nodes, their proximity to their neighbors, and node degree.
SLEACH [4] utilizes the LEACH concept with an additional feature to withstand adversaries or compromised nodes. The algorithm utilizes one-way hash chains and symmetric cryptographic operations to accomplish this purpose.
3 System Model
3.1 Deployment of Nodes and Distribution of Public Key
This paper uses public key cryptographic methods [5][6]. The private keys used to decrypt messages are embedded in the nodes before deployment; the public key used to encrypt messages is broadcast by the base station at random intervals.
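As an illustration only (real motes would use far lighter primitives, and the paper does not name a specific cipher), the following sketch uses RSA-OAEP from the Python cryptography package to mirror the embed-private-key / broadcast-public-key arrangement; all names and values are hypothetical.

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    # Generated offline: the private key is embedded in the node before
    # deployment; the matching public key is what the base station broadcasts.
    node_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    broadcast_public = node_private.public_key()

    # Anyone holding the broadcast public key can encrypt for the node ...
    ciphertext = broadcast_public.encrypt(b"cfg-time=1042", OAEP)
    # ... and only the embedded private key can decrypt.
    assert node_private.decrypt(ciphertext, OAEP) == b"cfg-time=1042"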
3.2 Cluster Formation
After the nodes are deployed, the clusters are formed and one node is chosen as cluster head, as in [2]. The cluster configuration time is estimated; a copy of it is sent to the base station and stored in the other nodes of the cluster, including the cluster head. Along with the configuration time, the cluster head also sends the unique IDs of all nodes in the cluster. The time required to travel from the cluster head to the base station is estimated and entered in the base station's database.
3.3 Setting Up TDMA for Transmission
TDMA is a channel access method that avoids collisions among transmissions. It is used in every round in which a cluster head is chosen: each cluster head is assigned a slot for its transmission, and an additional slot for broadcasting the public key to newly deployed nodes is scheduled at a random position.
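A minimal sketch of such a schedule, assuming fixed-length slots and a uniformly random position for the extra broadcast slot (the slot length and names are illustrative):

    import random

    def build_tdma_schedule(cluster_heads, slot_ms=50):
        # One slot per cluster head, plus one randomly placed slot for the
        # base station's public-key broadcast.
        slots = list(cluster_heads)
        slots.insert(random.randrange(len(slots) + 1), "PUBKEY_BROADCAST")
        return [(i * slot_ms, s) for i, s in enumerate(slots)]

    print(build_tdma_schedule(["CH-1", "CH-2", "CH-3"]))
    # e.g. [(0, 'CH-1'), (50, 'PUBKEY_BROADCAST'), (100, 'CH-2'), (150, 'CH-3')]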
3.4 Report Generation
The time at which the public key is broadcast is chosen randomly and kept secret. When the broadcast occurs, a compromised node assumes it is intended for newly deployed nodes, but the uncompromised nodes recognize it as a hidden instruction to transmit additional information. The cluster head must send the configuration time to the base station, where it is checked against the stored data. If it does not arrive within the estimated time, or if it carries incorrect information, the base station broadcasts this to the other nodes, which must then choose another cluster head. If the nodes receive no such notification, they begin their normal transmission.
The nodes send their sensed data along with the configuration time. The cluster head cross-checks the information obtained from the nodes: if the configuration time sent by a node does not match the value stored in the cluster head, it sends a report to the base station. This helps identify the compromised node within the cluster.
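The base-station side of this check might look as follows; the stored values, the slack allowance and the re-election broadcast are hypothetical placeholders, not details given in the paper:

    EXPECTED = {"CH-1": {"config_time": 1042, "travel_ms": 120}}

    def broadcast_reelection(ch_id):
        print(f"Base station: {ch_id} failed fidelity check; re-elect cluster head")

    def verify_cluster_head(ch_id, reported_config_time, elapsed_ms, slack_ms=20):
        # True if the cluster head's report matches the stored configuration
        # time and arrived within the estimated travel time.
        rec = EXPECTED[ch_id]
        on_time = elapsed_ms <= rec["travel_ms"] + slack_ms
        consistent = reported_config_time == rec["config_time"]
        if not (on_time and consistent):
            broadcast_reelection(ch_id)
            return False
        return True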
4 Simulation Results
The LEACH scheme does not include any security measures. FAS is a modified LEACH scheme that not only provides confidentiality and integrity to the network but also checks for compromised nodes. Applying this scheme increases energy consumption by 18.5% relative to the LEACH scheme.
5 Conclusion
Utilizing a detective mechanism in a network keeps a check on compromised nodes. It not only identifies compromised nodes but also limits the waste of resources and provides accurate sensed data to the base station. This paper is an attempt to transfer data securely and accurately.
References
1. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: A survey on sensor networks. IEEE Communications Magazine 40(8), 102–114 (2002)
2. Heinzelman, W.R., Chandrakasan, A., Balakrishnan, H.: An application-specific protocol architecture for wireless microsensor networks. IEEE Transactions on Wireless Communications 1(4), 660–667 (2002)
3. Younis, O., Fahmy, S.: Distributed clustering in ad-hoc sensor networks: a hybrid, energy-efficient approach. In: Proc. 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (March 2004)
4. Xiao-yun, W., Li-zhen, Y., Ke-fei, C.: SLEACH: Secure low-energy adaptive clustering hierarchy protocol for wireless sensor networks. Wuhan University Journal of Natural Sciences 10(1), 127–131 (2005)
5. Damodaran, D., Singh, R., Le, P.D.: Group Key Management in Wireless Networks Using Session Keys. In: Third International Conference on Information Technology: New Generations, pp. 402–407 (2006)
6. Myles, A., Johnson, D.B., Perkins, C.: A Mobile Host Protocol Supporting Route Optimization and Authentication. IEEE Journal on Selected Areas in Communications 13(5) (June 1995)
7. Boyle, D., Newe, T.: The Impact of Java and Public Key Cryptography in Wireless Sensor Networking. In: The Fourth International Conference on Wireless and Mobile Communications (2008)
1 Center of Excellence in Information Assurance, King Saud University, KSA
2 Information Systems Department, College of Computer and Information Sciences, King Saud University, KSA
3 International Islamic University, Islamabad, Pakistan
Abstract. Presence is a service that provides dynamic, up-to-date status information about other users. Since presence is highly dynamic, it puts a heavy load on the core network as well as on the air interface of the access network. Security is another issue that needs to be addressed carefully, because the presence service is subject to various types of security threats. In this paper we conduct a detailed survey of the security and network-load issues of presence. At the end we discuss a few open issues related to the security of the presence service.
Keywords: Presence, Flooding Attacks, Network Load.
1 Introduction
Presence allows users to share their live, dynamic status with each other. The status may contain information such as the user's current location, available devices, preferred means of communication, and currently supported applications. The presence service is changing the current communication paradigm: you have information about a particular person before contacting him or her. The components of the presence service are the Personal User Agent (publisher), the watcher, and the presence server. The publisher provides information to the presence server, which stores it and provides it to the subscribed watchers. A subscription can be made to more than one presentity at a time, known as subscription to a presentity list. A presentity can set different information for different watchers at different levels [1]. Figure 1 illustrates subscription to the presence service, while figure 2 illustrates publication.
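In SIP/IMS, subscription is realized with the SUBSCRIBE/NOTIFY mechanism of the presence event package [23]. A schematic SUBSCRIBE request from a watcher is shown below; all hosts and identifiers are hypothetical:

    SUBSCRIBE sip:alice@example.com SIP/2.0
    Via: SIP/2.0/UDP watcher.example.org;branch=z9hG4bK74bf9
    From: <sip:bob@example.org>;tag=xfg9
    To: <sip:alice@example.com>
    Call-ID: 31415@watcher.example.org
    CSeq: 1 SUBSCRIBE
    Contact: <sip:bob@watcher.example.org>
    Event: presence
    Expires: 3600
    Accept: application/pidf+xml
    Content-Length: 0

The presence server confirms with a 200 OK and then delivers the presentity's state in subsequent NOTIFY requests carrying a PIDF XML body.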
Miladinovic [3] proposed that instead of subscribing to presence information for short periods, a watcher should subscribe with an intermediate server for a very long period; the intermediate server is then responsible for the short-period subscriptions at the presence server. Since this intermediate server is placed inside the core network, it reduces the load on the air interface [3].
Rishi et al. in 2005 analyzed the effect of the presence service on the network. PULL- and PUSH-based communication mechanisms for the presence service are discussed and compared. The authors pointed out that since the presence service is not a point-to-point service, it adds a significant traffic load to the network; every change in the presence information of a presentity is communicated to all of its subscribers. Privacy issues are also discussed by the authors [4]. Alam et al. in 2005 analyzed the cost of the presence service. According to them, it is essential to reduce the traffic load in order to make the presence service more attractive, and they provide an analytical framework to measure the performance of the IMS presence service [5].
Yang in 2006 presented a distributed presence-service middleware architecture to cope with problems such as service provisioning, Quality of Service (QoS) and bandwidth provisioning [6]. Wegscheider in 2006 described how, instead of subscribing to the presence service individually, subscription should be allowed to a Resource List Server (RLS); the RLS collects the information and sends it in bundles, reducing the number of messages and thus utilizing resources more efficiently. He also proposed that instead of providing information to the subscriber after every change, the RLS should collect the information and provide it to the watcher only on demand [7]. Pailer et al. in 2006 proposed an architecture for location service enablers in IMS: they proposed that trigger information and notifications be processed in the terminal, and argued that their solution is more efficient than other solutions [8].
Rajagopal and Devetsikiotis in 2006 analyzed the IMS network with SIP delay as the main parameter; the authors formulated a queuing model for IMS and studied the workload on the SIP server [9]. Alam et al. in 2007 proposed Weighted Class Based Queuing (WCBQ) to reduce the load at the presence server. According to the authors, a watcher subscribed to a list of 100 presentities receives a notification message after every change in any of the 100 presentities, which consumes resources on client devices with low processing power. WCBQ drops low-priority pre-existing messages in order to reduce the load; the results showed that the mechanism works well, but only in scenarios of heavy traffic load [10].
Salinas in 2007 described the advantages and disadvantages of using the presence service. On one side, the presence service facilitates many other services, makes communication easy, and reduces unnecessary traffic; on the other side, it raises privacy concerns, since an intelligent user can guess another user's routine by observing his presence history. According to the author, the presence service also involves the end user, so the end user must be aware of how to use it [11]. Sedlar et al. in 2007 proposed the use of presence information in an enterprise environment: if presence information is collected from different sources and provided to subscribers after aggregation, it can help employees organize themselves more efficiently [12]. Loreto et al. in 2008 used the idea of a presence network agent to improve the performance of the presence service by removing load from the radio access network, and discussed a few open issues that must be resolved to improve performance further [13].
McKeon et al. in 2008 studied the effect of the presence service on the latency and throughput of the network, finding that the presence service can put a great load on the network by generating too much traffic [14]. Beltran et al. in 2008 presented a fully distributed platform for deploying the presence service. Their middleware architecture consists of two layers: the first takes intelligent decisions to process and manage the presence information, and the second is responsible for sending and receiving messages such as SUBSCRIBE and NOTIFY. The authors' major emphasis is on managing the presence information efficiently; RA rules defined by the authors restrict which users may communicate with one another [15].
Chen et al. in 2009 argued that presence notifications put an extra load on the network as well as on the watcher. To reduce this load they introduced a new notification method, the weakly consistent scheme, in which notifications are delayed for up to a specific period of time, reducing the network load [16]. Bellavista et al. in 2009 worked on enhancing location-based services, focusing on the presence service; according to the authors, implementing presence requires two main issues to be resolved, namely load balancing and the automatic activation and de-activation of the presence service [17]. Bellavista et al. in 2009 also argued that the major issue for the success of the presence service is scalability: since presence is a dynamic, continuous service, its heavy load raises a scalability question. To solve this, the authors proposed three extensions to the presence service: first, optimizing the inter-domain distribution of NOTIFY messages; second, a framework for differentiated quality; and third, client-side buffering [18].
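As an illustration of the delayed-notification idea in [16] (not the authors' exact scheme), the sketch below coalesces presence updates so that each presentity triggers at most one NOTIFY per window, carrying only the latest state; the window length and names are assumptions:

    import time

    class CoalescingNotifier:
        def __init__(self, window_s=30):
            self.window_s = window_s
            self.pending = {}     # presentity -> latest state
            self.last_sent = {}   # presentity -> timestamp of last NOTIFY

        def publish(self, presentity, state):
            # Later publications overwrite earlier ones within the window.
            self.pending[presentity] = state

        def flush(self, send):
            # Emit at most one NOTIFY per presentity per window.
            now = time.monotonic()
            for p, state in list(self.pending.items()):
                if now - self.last_sent.get(p, 0.0) >= self.window_s:
                    send(p, state)
                    self.last_sent[p] = now
                    del self.pending[p]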
Sher et al. in 2007 developed an Intrusion Detection and Prevention (IDP) system to secure the IMS application server. The IDP system compares all incoming and outgoing requests and responses with defined rules and decides whether or not to forward them [24]. Sher et al. in 2007 also described a security model to secure the IMS application layer from time-independent attacks such as SQL injection; to this end the authors developed an intrusion detection and prevention system, and transport layer security is also provided in the paper [25]. Rebahi et al. in 2008 described how IMS is subject to various types of denial-of-service attack and, to make IMS successful, emphasized securing it against these attacks; solutions to mitigate denial-of-service attacks are presented in the paper [26].
References
[1] Poikselka, M., Mayer, G., Khartabil, H., Niemi, A.: IP multimedia concepts and services,
2nd edn. John Wiley & Sons Ltd., Chichester (2006)
[2] Pous, M., Pesch, D., Foster, G., Sesmun, A.: Performance Evaluation of a SIP Based
Presence and Instant Messaging Service for UMTS. In: 4th International Conference on
3G Mobile Communication Technologies, 3G 2003 (2003)
[3] Miladinovic, I.: Presence and Event Notification in UMTS IP Multimedia Subsystem. In:
Fifth IEE International Conference on 3G Mobile Communication Technologies (3G
2004), The Premier Technical Conference for 3G and Beyond (CP503) London, UK, October 18-20 (2004)
[4] Rishi, L., Kumar, S.: Presence and its effect on network. In: IEEE International Conference on Personal Wireless Communications (2005)
[5] Alam, M.T., da Wu, Z.: Cost Analysis of the IMS Presence Service. In: 1st Australian Conference on Wireless Broadband and Ultra Wideband Communication (2006)
[6] Yang, S.B., Choi, S.G., Ban, S.Y., Kim, Y.-J., Choi, J.K.: Presence service middleware
architecture for NGN. In: The 8th International Conference on Advanced Communication
Technology (2006)
[7] Wegscheider, F.: Minimizing unnecessary notification traffic in the IMS presence system.
In: 1st International Symposium on Wireless Pervasive Computing (2006)
[8] Pailer, R., Wegscheider, F., Bessler, S.: A Terminal-Based Location Service Enabler for
the IP Multimedia Subsystem. In: Proceedings of WCNC (2006)
[9] Rajagopal, N., Devetsikiotis, M.: Modeling and Optimization for the Design of IMS
Networks. In: Proceedings of the 39th Annual Simulation Symposium (2006)
[10] Alam, M.T., da Wu, Z.: Admission control approaches in the IMS presence service. International Journal of Computer Science 1(4) (2007)
[11] Salinas, A.: Advantages and Disadvantages of Using Presence Service,
http://www.tml.tkk.fi/Publications/C/21/salinas_ready.pdf
[12] Sedlar, U., Bodnaruk, D., Zebec, L., Kos, A.: Using aggregated presence information in
an enterprise environment,
https://www.icin.biz/files/programmes/Poster-7.pdf
[13] Loreto, S., Eriksson, G.A.: Presence Network Agent: A Simple Way to Improve the Presence Service. IEEE Communications Magazine (2008)
[14] McKeon, F.: A study of SIP based instant messaging focusing on the effects of network
traffic generated due to presence. In: IEEE International Symposium on Consumer Electronics (2008)
[15] Beltran, V., Paradells, J.: Middleware-Based Solution to Offer Mobile Presence Services.
In: Mobilware 2008 (2008)
[16] Chen, W.-E., Lin, Y.-B., Liou, R.-H.: A weakly consistent scheme for IMS presence service. IEEE Transactions on Wireless Communications 8(7) (2009)
[17] Bellavista, P., Corradi, A., Foschini, L.: IMS-based presence service with enhanced scalability and guaranteed QoS for interdomain enterprise mobility. IEEE Wireless Communications 16(3) (2009)
[18] Bellavista, P., Corradi, A., Foschini, L.: Enhancing the Scalability of IMS-Based Presence Service for LBS Applications. In: Proceedings of the 33rd Annual IEEE International Computer Software and Applications Conference (2009)
[19] Magedanz, T., Witaszek, D., Knuettel, K.: The IMS Playground @ FOKUS: An Open Testbed for Next Generation Network Multimedia Services. In: Proceedings of the First International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities (TRIDENTCOM 2005) (2005)
[20] Singh, V.K., Schulzrinne, H.: A Survey of Security Issues and Solutions in Presence,
http://www1.cs.columbia.edu/~vs2140/presence/
presencesecurity.pdf
[21] Sher, M., Wu, S., Magedanz, T.: Security Threats and Solutions for Application Server of
IP Multimedia Subsystem (IMS-AS). In: IEEE/IST Workshop on Monitoring, Attack Detection and Mitigation (2006)
[22] Sher, M., Magedanz, T., Penzhorn, W.T.: Inter-domains security management (IDSM)
model for IP multimedia subsystem (IMS). In: The First International Conference on
Availability, Reliability and Security (2006)
[23] Rosenberg, J.: A Presence Event Package for the Session Initiation Protocol (SIP). Request for Comments: 3856 (2004)
[24] Sher, M., Magedanz, T.: Developing Intrusion Detection and Prevention (IDP) System
for IP Multimedia Subsystem (IMS) Application Servers (AS). Journal of Information
Assurance and Security (2007)
[25] Sher, M., Magedanz, T.: Protecting IP Multimedia Subsystem (IMS) Service Delivery
Platform from Time Independent Attacks. In: Third International Symposium on Information Assurance and Security (2007)
[26] Rebahi, Y., Sher, M., Magedanz, T.: Detecting flooding attacks against IP Multimedia
Subsystem (IMS) networks. In: IEEE/ACS International Conference on Computer Systems and Applications (2008)
1 Introduction
A Botnet [1][2] is a collection of compromised computers called bots. Bots simply obey the instructions issued by the controller [3]. Botnets provide a distributed platform for malicious activities (denial-of-service attacks [4], spam emails [5], phishing, information theft, etc.) [6]. There are three main areas of Botnet research: understanding the concept of Botnets, detection, and countermeasures [7]. This paper deals with the detection of Botnets.
There are mainly three ways to detect a Botnet: signature-based, anomaly-based and DNS-based detection. Signature-based detection can detect only known Botnets, whereas anomaly-based detection can detect known as well as unknown Botnets. DNS-based detection is also helpful: since bots use DNS to find the address of the bot master, DNS queries can be used to find malicious nodes [8].
In this paper we analyze different Botnet detection techniques, i.e., SLINGbot, BotGAD, BotMiner, SBotMiner, BotSniffer, AutoRE, etc. We also describe the characteristics and functionality of these techniques and compare them.
This paper is organized as follows: in section 2 we discuss existing Botnet detection mechanisms and their functionality; in section 3 we present a comparative analysis of the existing techniques on the basis of DNS queries, history data and group activity; section 4 concludes.
2 Detection Mechanisms
Botnet detection is not an easy task; Botnets can be detected only when they communicate at large scale. Two main methods exist for Botnet detection: active detection and passive detection [9].
2.1 SLINGbot
SLINGbot (System for Live Investigation of Next Generation bots) [10] is a proactive approach for studying current and future Botnets. Existing techniques focus on current Botnets and ignore future threats, so, given the evolving nature of Botnets, they are not well suited to Botnet detection.
SLINGbot is implemented in Python, which supports platform independence. It is composed of two fundamental parts: the Botnet Scenario Driver, which helps manage Botnet experiments, and the Composable Bot Framework, which establishes and maintains connectivity to the Botnet and routing information.
2.2 BotGAD
BotGAD (Botnet Group Activity Detector) [11] detects known and unknown Botnets on the basis of group activity in large-scale networks. DNS queries are useful for capturing malicious nodes, as bots in a Botnet normally use DNS to find their master. BotGAD is implemented on DNS traffic, and its performance is measured using real network traces.
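A toy version of the group-activity intuition behind BotGAD (not the authors' algorithm) scores a domain by how consistently the same set of hosts queries it across consecutive time windows; the window size and names are illustrative:

    from collections import defaultdict

    def group_activity(dns_log, window_s=300):
        # dns_log: iterable of (timestamp, src_host, queried_domain).
        # Scores near 1.0 suggest bot-like lockstep behaviour.
        per_domain = defaultdict(lambda: defaultdict(set))
        for ts, host, domain in dns_log:
            per_domain[domain][int(ts // window_s)].add(host)
        scores = {}
        for domain, windows in per_domain.items():
            sets = [hosts for _, hosts in sorted(windows.items())]
            if len(sets) < 2:
                continue
            same = sum(a == b for a, b in zip(sets, sets[1:]))
            scores[domain] = same / (len(sets) - 1)
        return scores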
2.3 BotMiner
Gu et al. proposed a framework [12] for Botnet detection that is independent of the Botnet's communication protocol, structure, and history. It clusters similar communication traffic and similar malicious traffic, and performs cross-cluster correlation to determine which hosts share both communication and malicious-activity patterns. BotMiner has been implemented in real scenarios and produces a low false-positive rate.
2.4 SBotMiner
Bots that generate high-rate traffic are easily detected by existing threshold techniques, whereas no method existed to capture bots generating low-rate traffic.
Yu et al. [13] proposed a system called SBotMiner to identify bots that generate low-rate traffic. SBotMiner focuses on identifying groups of bots rather than capturing individual bots, and consists of two fundamental steps: identifying group activity that differs from history, and using a matrix-based scheme to differentiate human traffic from Botnet-generated traffic. Bots within the same Botnet perform the same malicious activities by running a script issued by the master, whereas human traffic is diverse.
2.5 Bayesian Bot Detection
Botnet detection can be performed by analyzing the communication between the controller and the bots, since bots within the same Botnet perform similar actions. Villamarín-Salomón et al. [14] proposed a Bayesian approach to Botnet detection; it analyzes current DNS traffic against known bot traffic.
2.6 BotSniffer
Gu et al. proposed a system [15] that detects Botnets through anomaly detection and requires no historical Botnet data. It uses statistical methods to capture Botnets, identifying bot-controller communication and the associated malicious activities.
2.7 AutoRE
Xie et al. [16] proposed the AutoRE framework for the detection of spam emails and Botnet membership. AutoRE does not require history data and automatically generates URL signatures that detect Botnet spam with a low error rate; current and future Botnet spam emails are detected with the help of these signatures.
2.8 Automatic Discovery
Lu et al. proposed a hierarchical framework for the automatic detection of Botnets at large scale, categorizing network flows with the help of a clustering algorithm and payload signatures [17]. The core idea of the framework is to categorize network traffic by application and to differentiate malicious traffic from normal flows.
2.9 P2P Botnet Detection
Detecting a centralized Botnet is somewhat easier than detecting a P2P Botnet. Chang and Daniels proposed a P2P Botnet detection scheme [18] based on behavior clustering and statistical tests; the proposed scheme produces a low false rate in simple and realistic scenarios.
2.10 Data Mining Approach
Muhammad et al. proposed a technique to detect P2P Botnets [19] using a data mining approach. Stream data classification can be used for Botnet detection, and the proposed technique is more accurate than existing stream data classification schemes.
2.11 BotTracer
There are three main characteristics of a Botnet: automatic startup (with no reliance on user action), establishing a connection with the bot controller, and launching an attack soon afterwards. Liu et al. proposed BotTracer [20] to detect these characteristics.
3 Comparative Analysis

Technique          Active    Proactive
SLINGbot             ✗          ✓
BotGAD               ✗          ✓
BotMiner             ✗          ✓
SBotMiner            ✓          ✗
Bayesian             ✗          ✓
BotSniffer           ✗          ✓
AutoRE               ✓          ✗
Auto Discovery       ✓          ✗
P2P Detection        ✗          ✓
Data Mining          ✗          ✓
BotTracer            ✓          ✗
4 Conclusion
Botnets provide a distributed platform for malicious activities (denial-of-service attacks, spam emails, phishing, information theft, etc.). In this paper we discussed different approaches to Botnet detection. These approaches mainly use DNS, history data and group activity for detection, and we presented a comparative analysis of the Botnet detection techniques on the basis of these factors.
Acknowledgments
This research was supported by the Prince Muqrin Chair (PMC) for IT Security at
King Saud University, Riyadh, Saudi Arabia.
References
1. Bailey, M., Cooke, B., Jahanian, F., Xu, Y.: A Survey of Botnet Technology and Defenses.
In: Cybersecurity Applications & Technology Conference for Homeland Security. IEEE,
Los Alamitos (2009)
2. Stone-Gross, B., Cova, M., Cavallaro, L., Gilbert, B., Szydlowski, M.: Your Botnet is My Botnet: Analysis of a Botnet Takeover. ACM, New York (2009)
3. Leonard, J., Xu, S., Sandhu, R.: A Framework for Understanding Botnets. In: International
Conference on Availability, Reliability and Security. IEEE, Los Alamitos (2009)
4. Collins, M.P., Shimeall, T.J., Kadane, J.B.: Using Uncleanliness to Predict Future Botnet
Addresses. In: IMC 2007. ACM, New York (2007)
5. Pathak, A., Qian, F., Hu, C., Mao, M., Ranjan, S.: Botnet Spam Campaigns Can Be Long Lasting: Evidence, Implications, and Analysis. ACM, New York (2009)
6. Zhu, Z., Lu, G.: Botnet Research Survey. In: Annual IEEE International Computer Software and Applications Conference. IEEE, Los Alamitos (2008)
7. Feily, M., Shahrestani, A.: A Survey of Botnet and Botnet Detection. In: Third International Conference on Emerging Security Information, Systems and Technologies. IEEE,
Los Alamitos (2009)
8. Li, C., Jiang, W., Zou, X.: Botnet: Survey and Case Study. In: Fourth International Conference on Innovative Computing, Information and Control. IEEE, Los Alamitos (2009)
9. Govil, J., Govil, J.: Criminology of BotNets and their Detection and Defense Methods.
IEEE, Los Alamitos (2007)
10. Jackson, A.W., Lapsley, D., Jones, C., Zatko, M., Golubitsky, C., Strayer, W.T.: SLINGbot: A System for Live Investigation of Next Generation Botnets. In: Cybersecurity Applications & Technology Conference for Homeland Security. IEEE, Los Alamitos (2009)
11. Choi, H., Lee, H., Kim, H.: BotGAD: Detecting Botnets by Capturing Group Activities in
Network Traffic. In: COMSWARE, Dublin, Ireland (2009)
12. Gu, G., Perdisci, R., Zhang, J., Lee, W.: BotMiner: Clustering Analysis of Network Traffic
for Protocol- and Structure-Independent Botnet Detection. In: 17th USENIX Security
Symposium (2008)
13. Yu, F., Xie, Y., Ke, Q.: SBotMiner: Large Scale Search Bot Detection. In: WSDM, February 4-6. ACM, USA (2010)
14. Villamarín-Salomón, R., Brustoloni, J.: Bayesian Bot Detection Based on DNS Traffic Similarity. In: SAC 2009, March 8-12 (2009)
15. Gu, G., Zhang, J., Lee, W.: BotSniffer: Detecting Botnet Command and Control Channels
in Network Traffic. In: Proceedings of the 15th Annual Network and Distributed System
Security Symposium (NDSS 2008) (2008)
16. Xie, Y., Yu, F., Achan, K., Panigrahy, R., Hulten, G., Osipkov, I.: Spamming Botnets:
Signatures and Characteristics. In: SIGCOMM 2008, August 17-22 (2008)
17. Lu, W., Tavallaee, M., Ghorbani, A.: Automatic Discovery of Botnet Communities on
Large-Scale Communication Networks. In: ASIACCS 2009, March 10-12 (2009)
18. Chang, S., Daniels, T.: P2P Botnet Detection using Behavior Clustering & Statistical
Tests. In: AISec 2009, November 9. ACM, New York (2009)
19. Muhammad, M., Gao, J., Khan, L.: Peer to Peer Botnet Detection for Cyber-Security: A
Data Mining Approach. In: CSIIRW 2008, Oak Ridge, Tennessee, USA (2008)
20. Liu, L., Chen, S., Yan, G., Zhang, Z.: BotTracer: Execution-based Bot-like Malware Detection. LNCS. Springer, Heidelberg (2008)
Abstract. Security can be enhanced through wireless sensor networks using contactless biometrics, but this remains a challenging and demanding task due to several limitations of wireless sensor networks. Network lifetime is very short when image processing is involved, owing to the heavy energy cost of image processing and image communication. Contactless biometrics such as face recognition are the most suitable and applicable for wireless sensor networks. Distributed face recognition in a WSN not only reduces the communication load but also increases node lifetime by distributing the workload over the nodes. This paper presents the state of the art of biometrics in wireless sensor networks.
Keywords: Biometrics, Face Recognition, Wireless Sensor Network, Contactless Biometrics.
1 Introduction
Contactless biometrics can enhance security through wireless sensor networks, but this remains a challenging task due to the limitations of wireless sensor networks compared with traditional systems. Wireless sensor networks have become a most important technology, used in a wide range of security applications, especially espionage, target detection, habitat monitoring and military applications [1-2]. Wireless sensor networks have attracted growing interest, in both theoretical and practical problems, since 9/11 due to the high demand for security applications. The flexibility of wireless sensor networks for unmanned surveillance applications makes them well suited to such data transmission; the situation becomes more interesting when an unmanned security application finds a suspect and sends the recognized identity.
A wireless sensor network consists of low-power, battery-operated nodes used for remote monitoring. Generally, a wireless sensor node comprises a low-power digital signal processor, a radio frequency circuit, a micro-electro-mechanical system, and a small battery. Wireless sensors are characterized by several constraints, such as poor processing power, low reliability, short transmission range, low transmission data rates, and very limited battery power [3]. A wireless sensor network consists of multiple sensor nodes able to communicate with each other in order to perform computation collaboratively, efficiently dividing the workload and avoiding heavy communication among themselves. Thus, to overcome the limitations of
the sensor network, nodes gather information from one another and perform heavy tasks, e.g., face recognition and object tracking, while reducing energy consumption [4]. A sensor node does not have enough capability to process heavy data, such as images, locally, while image transmission is one of the most expensive tasks and consumes a great deal of energy because of communication overheads [5]. The network lifetime ends when the first node dies. Efficient distributed processing is therefore required to overcome these issues in sensor networks.
Over the last few decades, computer scientists, neuroscientists and psychologists have worked on face recognition algorithms. Psychologists and neuroscientists model the visual perception underlying face recognition, whereas computer scientists try to develop methods based on models of the human brain [6-8]. Face recognition includes face identification and face verification. Face identification, shown in figure 1, is a one-to-many match in which a huge database is matched against a probe image; it is more challenging than face verification. Face verification is a one-to-one match and has been implemented in mobile phones and personal login systems by FaceCode, OMRON, etc. [9]. Face identification is contactless, whereas face verification is partially contact-based. Face identification is the most popular biometric for security purposes because of its ease of use for the end user and its ability to identify an individual from a distance; its contactless property makes it particularly suitable for espionage applications using wireless sensor networks.
PCA- and LDA-based methods are the most powerful methods for dimensionality reduction and have been successfully applied to many complex classification problems such as face recognition and speech recognition [10]. LDA-based methods perform better than PCA, but they face problems with the small sample size (SSS) case. The aim of LDA is to find the best representation of the feature vector space. The conventional solution for the small-sample-size problem with large data is to apply PCA before LDA: PCA is used for dimensionality reduction, and LDA is performed in the lower-dimensional space obtained by PCA [11]. However, applying PCA before LDA can lose significant discriminatory information; direct linear discriminant analysis (D-LDA) is used to overcome this issue [12-13]. Fractional-step linear discriminant analysis (F-LDA) uses a weighting function, assigning more weight to the relevant distances during dimensionality reduction, to avoid misclassification [14]. Razzak et al. used layered discriminant analysis to overcome the problem of a small sample set with a large dataset [15]: a small dataset is extracted from the large dataset using LDA instead of a single face, further feature templates are computed from the small dataset, and finally the probe image is projected to find the best separability criterion. Razzak et al. also presented a bio-inspired face recognition system in which the face dataset is reduced in a layered process using structural and appearance-based features [16]; the dataset is reduced layer by layer to find the best separability.
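The conventional PCA-into-LDA pipeline [11] can be sketched with scikit-learn as follows; the dataset, component count and split are illustrative choices, not those of the cited works (fetch_lfw_people downloads the LFW faces on first use):

    from sklearn.datasets import fetch_lfw_people
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    faces = fetch_lfw_people(min_faces_per_person=50)
    Xtr, Xte, ytr, yte = train_test_split(faces.data, faces.target,
                                          random_state=0)

    # PCA shrinks the image space so that LDA's scatter matrices are
    # well-conditioned despite the small-sample-size problem; LDA then
    # finds the discriminant axes and classifies.
    model = make_pipeline(PCA(n_components=100, whiten=True),
                          LinearDiscriminantAnalysis())
    model.fit(Xtr, ytr)
    print("identification accuracy:", model.score(Xte, yte))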
It is better to process the image either locally or in a distributed environment, allocating resources to neighboring nodes efficiently, rather than transmitting it to the destination for recognition.
Muraleedharan et al. presented swarm-intelligence-based wireless face recognition in which an ant system routes the wavelet coefficients. The swarm intelligence optimizes the routing of the ant system in a distributed, time-varying network while maintaining the required bit error rate under various channel conditions [17]. The contourlet or wavelet coefficients are transmitted for central processing with higher priority to ensure more accurate transmission, achieving 94% accuracy. Muraleedharan et al. also presented face recognition over single- or multi-hop ant-based wireless sensor networks using swarm intelligence [19]: the ant system routes the wavelet or contourlet coefficients of faces to the sink node for processing with minimum energy consumption and reliable transmission. The swarm travels the route with the least load and transmission error, and the selected route proves to be more efficient and shorter. They evaluated the performance of three schemes: raw format, wavelet-compressed format, and contourlet-compressed format. The transformed coefficients have different priority levels during image reconstruction.
Yan and Osadciw presented a module-based (face, eyes, nose, lips, forehead) distributed wireless face recognition system [9], using five modules (the face and four sub-modules). The wireless sensor network is divided into two groups, feature nodes and database nodes. The feature nodes compute the features of the probe image and transfer them to the cluster node, which combines the features from all feature nodes and forwards them to the database nodes. A database node compares them with the stored templates, and finally the score is transferred to the sink node. Although the workload is divided among the sensor nodes, the communication load between the feature nodes themselves and the database nodes remains an issue, and the pressure on feature and database nodes is high. Yan et al. presented contourlet-based image compression for wireless communication in face recognition systems [18]. Yan et al. also presented a multistep static procedure to determine the confidence interval of features based on the eigen-decomposition of the features, with an MIZM zone to represent the interval [20].
Kim et al. presented a face recognition system based on the ZigBee transmission protocol in which the eigenfaces are computed with low-power computation [24]. Yan et al. presented a discrete particle swarm optimization (DPSO) algorithm with a multiplicative likeliness-enhancement rule for unordered feature selection, applying DPSO to the FERET face database [25]. The face recognition feature pool is derived from DF-LDA and each particle is associated with a feature subset; features are selected based on their assigned likeliness, and recognition performance is improved under both L1- and L2-norm distance metrics.
Razzak et al. presented a face recognition system for wireless sensor networks that efficiently reduces the dataset using a layered process, and presented three cases [22]. The image is divided into four sub-modules: forehead, eyes, nose and lips. Layered linear discriminant analysis is re-engineered to implement face recognition in a wireless sensor network; instead of one cluster head for the feature nodes and one for the database nodes, four cluster heads are considered, one per module, in both the feature and database nets. The local cluster of each module is responsible for internal processing. Moreover, instead of projecting the feature space onto the whole dataset, the result of one module is used to narrow the matching criteria for the next module. This increases the computational load on the feature/database nodes, but the communication load between the feature nodes and the source node, and between the database nodes and the sink node, is reduced.
Razzak et al. also presented a distributed face recognition system that divides the load across the nodes and provides two different methods for training and recognition [23]. Previous face recognition systems for wireless sensor networks handled only recognition; training was performed separately and the feature matrix was stored on the feature nodes. They presented an efficient distributed wireless face recognition scheme covering both the training and the recognition scenarios, using two different algorithms rather than a single one. Enrolment and identification of each sub-module are performed by a separate cluster head; each cluster head processes its sub-module in the distributed environment and communicates with the sink cluster, which performs the score-level fusion. The net features of the two methods are the same; they differ in computational complexity, and each module (forehead, eyes, lips and nose) has its own cluster head. Only the local cluster is responsible for a module's internal processing, for both training and recognition, and the result of one module is used to reduce the matching dataset for the other modules, finding the best match and saving energy instead of projecting the feature space onto the whole dataset. Enrolment of faces is performed using linear discriminant analysis of principal component analysis; the templates are stored in the database nodes, with the features and image templates of each sub-module stored on separate feature nodes and database nodes. For recognition, the probe modules of the probe image are projected onto the feature space to compute feature templates, which are compared with the templates stored in the database net to find the most similar identity. Figures 3 and 4 show the distributed wireless sensor face recognition and enrolment systems, respectively.
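The candidate-pruning idea, letting each module's match shrink the gallery searched by the next, can be sketched as below; the Euclidean distance and the keep fraction are placeholders rather than the cited systems' parameters:

    import numpy as np

    def modulewise_identify(probe, gallery,
                            modules=("forehead", "eyes", "nose", "lips"),
                            keep=0.25):
        # probe: {module: feature vector}
        # gallery: {identity: {module: feature vector}}
        # Each module ranks the surviving candidates and keeps only the best
        # fraction, so later modules search an ever smaller dataset.
        candidates = list(gallery)
        for m in modules:
            dists = {c: np.linalg.norm(probe[m] - gallery[c][m])
                     for c in candidates}
            candidates = sorted(candidates, key=dists.get)
            candidates = candidates[:max(1, int(len(candidates) * keep))]
        return candidates[0]   # best remaining identity after all modules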
1. Distribute the face recognition algorithm and perform the task in a distributed environment; a scheme for distributing face recognition according to the sensor network's capacity is therefore required.
2. Instead of utilizing image-based eigenfaces, face descriptors are a better choice because of their lower processing power and communication overhead. Although face descriptors give less accuracy than traditional systems, they can be a better choice for optimizing energy.
3. Categorize face images into sub-classes based on external cues. This can help reduce the processing cost by reducing a large dataset to a very small one.
References
[1] Estrin, D., Culler, D., Pister, K., Sukhatme, G.: Connecting the physical world with pervasive networks. IEEE Pervasive Computing 1(1), 59–69 (2002)
[2] Pottie, G.J., Kaiser, W.J.: Wireless integrated network sensors. Communications of the ACM 43(5), 51–58 (2000)
[3] Zhang, M., Lu, Y., Gonh, C., Feng, Y.: Energy-Efficient Maximum Lifetime Algorithm
in Wireless Sensor Networks. In: 2008 International Conference on Intelligent Computation Technology and Automation (ICICTA) (October 2008)
[4] Razzak, M.I., Hussain, S.A., Minhas, A.A., Sher, M.: Collaborative Image Compression
in Wireless Sensor Networks. International Journal of Computational Cognition 8(1)
(March 2010)
[5] Hussain, S.A., Razzak, M.I., Minhas, A.A., Sher, M., Tahir, G.R.: Energy Efficient Image
Compression in Wireless Sensor Networks. International Journal of Recent Trends in Engineering 2(1) (November 2009)
[6] Jain, A.K., Ross, A., Pankanti, S.: Biometrics: a tool for information security. IEEE Transactions on Information Forensics and Security 1(2), 125–143 (2006)
[7] Prasad, S.M., Govindan, V.K., Sathidevi, P.S.: Bimodal personal recognition using hand images. In: Proceedings of the International Conference on Advances in Computing, Communication and Control, pp. 403–409. ACM, New York (2009)
[8] Ross, A., Jain, A.K.: Multimodal Biometrics: An Overview. In: 12th European Signal Processing Conference (EUSIPCO), Vienna, Austria, pp. 1221–1224 (2004)
[9] Yan, Y., Osadciw, L.A.: Distributed Wireless Face Recognition System. In: Proc. IS&T and SPIE Electronic Imaging 2008, San Jose, CA (January 2008)
[10] Ross, A., Jain, A.: Information Fusion in Biometrics. Pattern Recognition Letters 24, 2115–2125 (2003)
[11] Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (1997)
[12] Yang, J., Yang, J.Y.: Why Can LDA Be Performed in PCA Transformed Space? Pattern Recognition 36 (2003)
[13] Yu, H., Yang, J.: A Direct LDA Algorithm for High-Dimensional Data with Application to Face Recognition. Pattern Recognition 34, 2067–2070 (2001)
[14] Lotlikar, R., Kothari, R.: Fractional-Step Dimensionality Reduction. IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000)
[15] Razzak, M.I., Khan, M.K., Alghathbar, K., Yousaf, R.: Face Recognition Using Layered Linear Discriminant Analysis and Small Subspace. In: International Conference on Computer and Information Technology, UK (2010)
[16] Razzak, M.I., Khan, M.K., Alghathbar, K.: Bio-Inspired Hybrid Face Recognition System for Small Sample Space and Large Data Set. In: 6th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, Germany (2010)
[17] Muraleedharan, R., Yan, Y., Osadciw, L.A.: Constructing an Efficient Wireless Face
Recognition by Swarm Intelligence. In: 2007 AGEP Academic Excellence Symposium,
Syracuse, NY (June 2007)
[18] Yan, Y., Muraleedharan, R., Ye, X., Osadciw, L.A.: Contourlet Based Image Compression for Wireless Communication in Face Recognition System. In: Proc. IEEE-ICC 2008,
IEEE International Conference on Communications (ICC 2008), Beijing, China (2008)
[19] Muraleedharan, R., Yan, Y., Osadciw, L.A.: Increased Efficiency of Face Recognition System using Wireless Sensor Network. Systemics, Cybernetics and Informatics 4(1), 38–46
[20] Yan, Y., Osadciw, L.A., Chen, P.: Confidence Interval of Feature Number Selection for
Face Recognition. Journal of Electronic Imaging 17(1) (January 2008)
[21] Wu, H., Abouzeid, A.A.: Energy efficient distributed image compression in resource-constrained multihop wireless networks. Computer Communications 28 (2005)
[22] Razzak, M.I., Khan, M.K., Alghathbar, K.: Distributed Face Recognition in Wireless Sensor Network. In: The FTRA 2010 International Symposium on Advances in Cryptography, Security and Applications for Future Computing, Korea (2010)
[23] Razzak, M.I., Almogy, B.E., Khan, M.K., Alghathbar, K.: Energy Efficient Distributed Face Recognition in Wireless Sensor Network. Telecommunication Systems (accepted)
[24] Kim, I., Shim, J., Schlessman, J., Wolf, W.: Remote wireless face recognition employing
zigbee. In: Workshop on Distributed Smart Cameras, ACM SenSys 2006, USA (2006)
[25] Yan, Y., Kamath, G., Osadciw, L.A.: Feature selection optimized by discrete particle
swarm optimization for face recognition. In: Optics and Photonics in Global Homeland
Security V and Biometric Technology for Human Identification SPIE, vol. 7306 (2009)
Abstract. The purpose of this paper is to suggest a security-assured data communications architecture for net-centric defense systems based on DoDAF 2.0. The architecture provides a well-defined security provision for network communication within a defense network such as a C4I system. In the proposed network communication architecture, where security is prioritized, we propose three security mechanism levels (the authentication level, the Business Rules Repository level and the Security Rules Repository level) and describe available techniques that facilitate the functionality of each level. Security can be enforced at every stage of the data transit: by utilizing the various data security measures available, each level substantiates the security of the data along the communication chain from end to end.
Keywords: Military Communications, C4I, Defense Communication Network, DoDAF, Secured Communications.
1 Introduction
Networks and communications networking are among the most crucial components of military architecture systems such as Command & Control (C2); Command, Control, Computers, Communication & Intelligence (C4I); C4I Surveillance, Target Acquisition and Reconnaissance (C4ISTAR); the Department of Defense Architecture Framework (DoDAF); The Open Group Architecture Framework (TOGAF); and others [1]. The first line of operations design for a given military architecture is the doctrine of the communication heads and communicants (DoDAF OV-4) [2]; this phase serves as the layout for comprehending the methodology, type and technology requirements (DoDAF SV-2) [2] of the network architecture. Hence a properly designed, well-defined and secured net-centric communication network architecture, equipped with foresight of the approximate changes required in the future and with change-compliant, change-tolerant technological measures, can contribute invariably to a stable military system, or system of systems, such as C4I, based on DoDAF [3]. The primary focus of this paper is to suggest a generic net-centric communication network model that can capture the ubiquitous participants of the military
2 Research Background
C4I systems are used in various domains, such as defense, police, investigation, road, rail, airports, and oil and gas, wherever command and control scenarios exist; the main focus of these systems is defense applications. C4I systems consist of people, procedures, technology, doctrine and authority, and play a growing role in information management, data fusion and dissemination. The purpose of a C4I system is to help the commander accomplish his objective in any crucial situation. Its name comprises command, control, communications, computers and intelligence [11], [12].
Military data communications systems are, by their nature, shared systems. Host computers are shared by local and remote users; access lines are multiplexed; terminals are concentrated and interfaced to the network by terminal controllers; packet switches interface host computers to the network and handle data traveling to and from all users; and gateways handle data traveling among different networks. When users cleared to various security levels generate, manipulate, send, and receive data at assorted levels of classification over a shared communications system, the potential exists for security violations and dangerous compromise [4].
(Figure: proposed secure communication architecture, showing the central command's authentication server and Business Rules Repository behind a firewall on an IP network that connects communication controllers CCR 1-3 and nodes 1-3.)
5.1 Level I
At this level, the security focus is on role-based authentication clearance and on protecting the data in encrypted form using cryptographic functionality. The cryptography mechanism [7] is considered the most suitable for communicating classified, high-security information among the roles of a military system. The two main cryptographic methodologies available are 1) secret key cryptography and 2) public key cryptography.
Symmetric cryptography can be represented mathematically as:

Encryption: C = Encrypt(K, M)
Decryption: M = Decrypt(K, C)                       (1)
Message:    M = Decrypt(K, Encrypt(K, M))           (2)

When the data is encrypted using a specific role's (R1) public key, it can be decrypted using R1's private key only (restricted access):

Encryption: C = E(KR1-pub, M)
Decryption: M = D(KR1-priv, C)                      (3)

Another way of securing the data before it is transmitted is to protect it with a digital signature. First, the data is encrypted with the sender's private key, C = E(KS-priv, M); this encrypted data is then re-encrypted using the public key of the recipient, C2 = E(KR-pub, C). The recipient decrypts the encrypted data using its private key, C = D(KR-priv, C2), and then the sender's public key, M = D(KS-pub, C).
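Since RSA signing and encryption use distinct paddings, "encrypting with the sender's private key" is realized in practice as a digital signature; below is a sign-then-encrypt sketch with the Python cryptography package, rather than the literal double encryption described above (key sizes and the message are illustrative):

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    PSS = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                      salt_length=padding.PSS.MAX_LENGTH)
    OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    s_priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    r_priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    message = b"OPORD 7: move at 0400"

    # Sender signs with its private key (the C = E(KS-priv, M) step) ...
    signature = s_priv.sign(message, PSS, hashes.SHA256())
    # ... then encrypts for the recipient (the C2 = E(KR-pub, C) step).
    ciphertext = r_priv.public_key().encrypt(message, OAEP)

    # Recipient decrypts with its private key, then verifies the signature
    # with the sender's public key (verify raises if tampered with).
    plaintext = r_priv.decrypt(ciphertext, OAEP)
    s_priv.public_key().verify(signature, plaintext, PSS, hashes.SHA256())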
5.2 Level II
Upon receipt of the role-authenticated, double-encrypted data packets, the authentication server forwards them to the Communication Controllers (CCRs), as depicted in figure 3. Generally, in shared-system communications the security policy implemented is end-to-end encryption; in this proposed network architecture, however, clear-text headers are mandatory, since their information is used to ascertain the business rules that further secure the data at level 3. At this level, once a CCR reads the headers, it communicates with the Business Rules Repository to request the communication security rules appropriate to the roles of the communicants. The business rules for a communication comprise the designation of the sender (Ds), the designation of the receiver (Dr), and the severity of the message being transmitted (Xn). Role designations are graded 1 = low, 2 = average, 3 = high; data severity is graded 1 = normal, 2 = essential, 3 = extremely critical. For example,

Sr3 = Dr3(X3, Ds3)                                  (4)

Once the severity of the data is identified, the BRR stamps the data packets with Sr3, which denotes security rule 3. Communication of data between one sender and multiple receivers can likewise be denoted

Sr2 = Dr1(X2, Ds[2, 3, 1])                          (5)

The BRR functionality emphasizes: 1) identification of the participants; 2) verification that the participants are authenticated users; 3) confirmation that the sender's role is privileged enough to raise the given priority. Various models have been suggested to comply with authentic communication under the banner of RBAC (Role-Based Access Control) [8]. Three primary rules are defined for RBAC: (i) role assignment; (ii) role authorization: a subject's active role must be authorized for the subject; (iii) transaction authorization: a subject can execute a transaction only if the transaction is authorized for the subject's active role. RBAC can be customized into an application model [8].
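The paper does not define the rule function itself; the hypothetical sketch below assumes a strictest-factor policy and shows how a CCR could map the clear-text header fields to a security rule in the spirit of equations (4) and (5):

    # Hypothetical encoding of the level-II lookup: header fields (receiver
    # designation Dr, message severity X, sender designation Ds) map to the
    # security rule stamped onto the packets.
    DESIGNATION = {"low": 1, "average": 2, "high": 3}
    SEVERITY = {"normal": 1, "essential": 2, "extremely_critical": 3}

    def security_rule(dr, x, ds):
        # Assumed policy: the sender must be privileged enough to raise the
        # given priority; the rule level is the strictest applicable factor.
        if ds < x:
            raise PermissionError("sender's role may not assign this priority")
        return max(dr, x)

    print(security_rule(DESIGNATION["high"], SEVERITY["extremely_critical"],
                        DESIGNATION["high"]))   # -> 3, i.e. security rule Sr3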
5.3 Level III
Security Rules Repository: the Business Rules Repository application can be customized to add the Required Security Details (RSD) for the data. On receipt of the data packets, the Security Rules Repository (SRR) ensures 1) the integrity of the data and 2) the confidentiality of the data, providing an integrity service [9]. Message hashing protects the data with hashes, and digital encryption encrypts the hashes using the public and private keys (figure 3). This can be represented as:

Hash (H) = (PS, S-10, C1)                           (6)

where PS = packet size, S = sender and Cn = checksum. Once the data packet bundle has been hashed by the message hashing, the hash is encrypted as a digital signature, using the public key of the recipient:

Encryption: C(H) = E(KR-pub, H)                     (7)
References
1. Bayne, J.S.: A Theory of Enterprise Command and Control. In: Proc. IEEE MILCOM, pp. 1–8 (2006)
2. DoD Architecture Framework, Version 2.0 DoDAF Meta Model, vol. 3, Department of
Defense, USA (2009)
3. Hassan, M., Jain, R.: High Performance TCP/IP Networking: Concepts, Issues and Solutions. Prentice Hall of India (2004)
4. Anurag, D., Brian, H., John, N.: Capacity Planning Strategies for Net-Centric Applications. In: Proc. IEEE MILCOM, vol. 3, pp. 1686–1692 (October 2005)
5. Rona, S., Caser, D.: Computer Security and Networking Protocols: Technical Issues in Military Data Communications Networks. IEEE Transactions on Communications 9, 1472–1477 (1980)
6. Abdullah, A.S., Tazar, H., Gulfaraz, K.: Enhancing C4I Security using Threat Modeling.
In: Proc. IEEE, UKSIM, paper 10.1109 (2010)
7. Stallings, W.: Cryptography and Network Security, 4th edn. Prentice-Hall of India
8. Wei, Z., Meinel, C.: Team and Task Based RBAC Access Control Model. In: Network Operations and Management Symposium, LANOMS 2007, Latin America, pp. 84–94 (2007)
9. Veselin, T., Dragomir, P.: Information Assurance in C4I Systems. ISN Information and Security 4, 43–59 (2000)
10. Alghamdi, A.S.: Evaluating Defense Architecture Frameworks for C4I System Using Analytic Hierarchy Process. J. Computer Sci. 5(12), 1078–1084 (2009)
11. Alghamdi, A., Shamim Hossain, M., Al Qurishi, M.: Selecting the Best CASE Tools for DoDAF-Based C4I Applications. Inderscience Int. J. Advanced Media and Communication (2010) (in press)
12. Ahmad, I., Abdullah, A.B., Alghamdi, A.S.: Evaluating Intrusion Detection Approaches Using Multi-criteria Decision Making Technique. IJISCE 1(2), 60–67 (2010)
13. Ahmad, I., Abdullah, A.B., Alghamdi, A.S.: Towards the Designing of Robust IDS through an Optimized Advancement of Neural Networks. In: Kim, T.-h., Adeli, H. (eds.) AST/UCMA/ISA/ACN 2010. LNCS, vol. 6059, pp. 597–602. Springer, Heidelberg (2010)
14. Ahmad, I., Abdullah, A.B., Alghamdi, A.S.: Application of Artificial Neural Network in Detection of DOS Attacks. In: ACM SIN, pp. 229–234 (2009)
1 Introduction
The January 12, 2010 earthquake completely compromised Haiti's already unstable infrastructure, rendering local communications useless for coordinating relief and aid. The ensuing chaos highlighted the lack of redundant information systems that were versatile and accessible to the Haitian people and the relief community. Thermopylae
efforts were hindered by the absence of an accessible and central environment for data and cooperation.
The SOUTHCOM 3D UDOP began as an extension of SOUTHCOM's preexisting, non-geospatial collaboration portal, the All Partner Access Network, which provided the image, document and comment sharing capabilities crucial to information management within SOUTHCOM's disaster relief network and its non-military participants. Although that network was widely used, users gravitated to the UDOP once it was accessible on the Web outside military networks, and it eventually became a permanent fixture in the SOUTHCOM Crisis Action Center. Its first deployment came just days after the earthquake, leveraging Thermopylae's past experience as a Google Earth enterprise partner tracking vehicles and security assets, and rapid spiral updates installed frequent improvements. The information management requirements that the UDOP supported were defined by a characteristic atypical of military operations but necessary to address the crisis: virtually all information available and pertinent to Haiti was unclassified, according to SOUTHCOM, so an inclusive environment that engaged the international community was possible. While this opened the door to more information exchange, complications of language arose, Haiti being a multilingual nation itself; but geospatial display is intuitive and transcends some of these problems. Location is a common language for everyone.
Considering this, the UDOP was built for the large, disparate user community that came with the profound international response. A frequent limitation of GIS is that users depend on a cadre of specialized geospatial technicians to collect, analyze, and disseminate data, often only as printed maps. This practice is easily overwhelmed, leaving the greater user community out of the process and passively involved at best. The UDOP allowed everyone to add information and visualize it without a geospatial office as facilitator, making it impressively scalable. High user volume is further enabled by the browser-based software, familiar from Google Earth's commercial use, and by accessibility on multiple platforms, mobile and stationary. Crowdsourcing and social networking complement the content added from other sources, and layering disparate information in user-defined combinations opens up new ways of thinking through improved situational awareness. This wide range of content can be analyzed with the UDOP's collaborative tools, so data is now communicable in a geospatial sense without the need to print, and can be used for strategic communications. No longer are users limited to just the information one managing organization thinks they want to see.
3 User-Created Content
The UDOP Haiti project addressed the challenge of creating a common picture of spatially relevant data applicable to Haiti after the earthquake. The initial operating capability had three core features to support data collaboration. The first allowed a user to import any geospatial file and load it into the UDOP view. The second allowed users to link to a URL so that dynamic updates of their existing data stores could be sent to other spatial files and rendered in the UDOP. The third feature, imperative to promoting an environment of sharing, was a spatial content export tool that let users treat the UDOP as a "marketplace" of spatial data, even if they didn't ultimately use the tool for fusing the information into a single picture. This ensured that the UDOP served a dual purpose, as both a repository within which content could be created or viewed and as an index of available content, which was critical to its widespread use as an environment for data sharing.
The system design team benefited tremendously from communicating with relief units in Miami, FL, and Haiti, and the final system was heavily influenced by the individuals directly involved in coordinating logistics amongst all participating organizations.
The ability to link spatial data layers from sources outside the UDOP is a key feature, giving users the ability to define a custom view through their browser. An early concern was that the KML folder would easily become overloaded with disorganized layers of content, undermining the UDOP's purpose. As the volume of data increased, two additional tools, one of which allowed for the creation of lines, polygons, and points with a common labeling and icon scheme, rendered the content much more easily visualized on the UDOP globe, which was customized with almost daily updates of imagery whose quality, origin, and time were defined by users.
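This style of dynamic layer linking is what Google Earth-based viewers typically implement with KML NetworkLinks, which point the client at a remotely hosted layer and re-fetch it on an interval. The minimal Python sketch below generates such a NetworkLink; the layer name, URL, and refresh interval are hypothetical, and the sketch illustrates the general KML mechanism rather than the UDOP's actual implementation.

from xml.sax.saxutils import escape

def network_link_kml(name: str, href: str, refresh_seconds: int = 300) -> str:
    """Return a KML document containing one auto-refreshing NetworkLink,
    so a Google Earth-based viewer pulls the layer from its source and
    re-fetches it periodically."""
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <NetworkLink>
    <name>{escape(name)}</name>
    <Link>
      <href>{escape(href)}</href>
      <refreshMode>onInterval</refreshMode>
      <refreshInterval>{refresh_seconds}</refreshInterval>
    </Link>
  </NetworkLink>
</kml>
"""

# Hypothetical layer name and URL:
print(network_link_kml("Field hospital status",
                       "http://example.org/haiti/hospitals.kml"))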
Users could build video fly-through presentations or even construct a series of live views of current imagery, intelligence layers, and operations data that dynamically updated even after the presentation was compiled. The data was live, up to date, and interactive, so when it was presented a user could stop at any point and drill down into greater detail on facilities, mobile units, and landmarks, as well as basic reporting and imagery. This supported an extremely adaptive environment for the intelligence personnel supporting the relief effort.
4 Mobile Applications
One of the most effective methods for building knowledge within the system was to collect data directly from individuals in the field during the crisis. Leveraging mobile applications is important because of their range of use and devices, their importance to crowd sourcing, their quick deployment, and their ability to become immediately omnipresent. To this end, SOUTHCOM officials took advantage of smartphone (i.e., iPhone or Android) technology and integrated customized software on various mobile platforms that complemented the UDOP, allowing relief workers to share and post data about their immediate surroundings in real time. The mobile integration leveraged the comprehensive 2D version of Google Maps for mobile phones and will soon include mobile Google Earth. Contributions included uploading geolocated text and descriptions, as well as photographs taken using the in-house developed mobile application Disaster Relief and GeoCam, jointly developed by NASA and Google. The UDOP also leveraged existing crowd-sourcing tools, including Ushahidi-based services that generated layers from geolocated SMS messages sponsored by local telecoms immediately after the earthquake.
These applications provided quick and easy access to critical knowledge on an as-needed basis, such as locating the nearest medical facility or relief camp. An important additional feature was an offline mode that allowed users to store data locally on their mobile device until finding a connectivity point, such as a Wi-Fi hotspot or a working cell phone tower. Once connectivity was available, the mobile application prompted the user to submit the locally stored data. The "Haiti Relief" application was created initially using the iPhone software development kit, was made freely available in Apple's "App Store," and allowed ad hoc distribution. The application was also ported to Android and other popular mobile platforms. By combining these resources, SOUTHCOM leaders were able to build a full-featured solution supporting the sharing of information among relief supporters and coordinators, as well as enabling direct interaction with those on the ground in Haiti.
this small cross-section, they were able to capture the group's prioritized needs and then expand upon the capabilities the entire user community needed. Furthermore, once brief training videos were viewable directly from the UDOP splash page, requests for support dropped by 70 percent. The training material allowed users from a host of varied backgrounds to become familiar with the UDOP's capabilities quickly and then pass that knowledge on to other users. The UDOP training program was created to track the standard operating procedures created by SOUTHCOM users. The program was structured to follow the expected evolution of content management, and the training schedule was coordinated with software update releases. The urgency and sudden commencement of relief efforts in Haiti brought new incoming personnel daily, which necessitated continual training. Users could also operate with enhanced knowledge of Haiti before setting foot there, as Thermopylae's iHarvest profiled their UDOP use in the background and matched them with similar users and content as it was created, immersing users in the content base and improving collaboration. As SOUTHCOM's relief efforts geared up, the UDOP user base expanded with the addition of individuals from other Combatant Commands, primarily Joint Forces Command and Northern Command. Integrating disparate groups is important to both organizations because of the frequent need to include allied, federal, state, municipal, and NGO groups, a situation similar to the immediate nature of relief operations that teams groups with similar functions but unintegrated organization. The cumulative diversity and density of data that could be layered was testament to the demand for agility and innovation in content management. Through the UDOP a new GIS capability evolved: the software no longer just processed data, but fused an ad hoc community of like-minded people and organizations in one place. Through the simplicity of linking dynamic content in, data custodians retained control of their own content and could reveal it on the UDOP as desired, without having to conform their information systems to each other. Smokestacks of information no longer
emptied into the sky, but into one contained space equipped with tools to manage the diversity and variation. The relief community is heading toward a new expectation of continuity, as the same groups that were gathered together on the UDOP can immediately reconnect as they respond to future disasters in other locations.
6 Conclusion
The UDOP's ability to fill the former absence of collaborative coordination of geospatial data for disaster relief efforts is one of the most notable and valuable aspects of the software. Thousands of users from partner nations, non-government organizations, and local and state first responders had a central location to access and share relevant intelligence. The UDOP allowed users to represent the human element of what was occurring on the ground in Haiti. From geo-tagged snapshots of infrastructure and floodplains on smartphones to the creation of adaptive spot-reporting layers, the UDOP flexed to meet a variety of needs for relief workers, and users can anticipate exciting developments in addressing these and future challenges.
During this project, much of the innovation was driven by the relief coordinators' immediate need for these tools. Given the urgency of a timely response, SOUTHCOM officials broke from the implicit military norm and embraced an inclusive, collaborative, and open environment. The unanticipated synergy amongst users fostered profound innovation that quickly led to the first, second, and third generations of improvement. The UDOP for Haiti demonstrated that when users are intellectually involved in improving a technical capability, they have a vested and therefore greater interest in the application. Also, having a committed, responsive design team capable of integrating features in days rather than weeks is imperative to retaining user buy-in. The humanitarian community at large benefits from this rapid development of tools as they are tailored to the most serious challenges. As the technical capabilities for sharing data increase over time, users will form new ideas as they reset their understanding of the high-water mark of inclusive sharing of spatial data.
1 Introduction
Habitat is the space and environment in which organisms survive and reproduce, and it sustains the food chains and energy flows that keep the natural environment healthy. Habitats in good ecological condition can nurture good ecological quality. Large-scale human development of biological habitat seriously degrades local ecological quality and can ultimately bring ecological disaster [1-3]. The key to protecting habitat is to identify the factors that affect it and to build a model that can simulate and evaluate the habitat.
Since the 1970s, a number of habitat evaluation studies have been undertaken. Bovee was the first to apply the Instream Flow Incremental Methodology (IFIM) to the habitat evaluation procedure [4, 5]. Zengchao Hao used a multi-objective assessment method based on physical habitat simulation to calculate the ecological river flow demand in the mid-reach of the Erqisi River in Xinjiang Autonomous Region [6]. Hayes used logistic regression based on habitat preference models to predict suitable habitat in three New Zealand rivers [7]. Binns developed a Habitat Quality Index model to predict trout standing crop in Wyoming streams [8].
All the methods mentioned above obtained fruitful results, but they also have some limitations [9]: they are not designed to be combined with a water quality model; they place high requirements on parameter selection; and they lack wide adaptability.
To overcome these limitations, the HABITAT model was developed by Deltares Delft Hydraulics. HABITAT is able to translate the results of hydrological, hydraulic, and water quality models into effects on the natural environment and human society. It is also easy to set up and can be applied at different spatial scales for different predictive purposes. In this paper, a habitat assessment of Schizothorax Yunnansis was built based on HABITAT.
Z = exp(a − b1x1 − b2x2 − b3x3 + b4x4 + b5x2x3 + b6x3x4 + b7x1x3 + b8x2x3x4)    (1)

where Z is the habitat suitability of the evaluation species; x1 is water depth, ranging from 0.2 m to 3.5 m; x2 is river/lake temperature, ranging from 0 °C to 40 °C; x3 is flow velocity, ranging from 0 m/s to 2 m/s; x4 is the concentration of TP; a is a constant; and b1, …, b8 are regression coefficients.
Equation (1) is only valid for TP concentrations between 0 and 25 µg/l. When the TP concentration exceeds 25 µg/l, the relationship between the suitability Z and these parameters is transformed to the following equation:

Z = exp(c − d1x1 − d2x2 − d3x3 − d4x4 + d5x2x4 + d6x3x4 − d7x2x3x4)    (2)

where x4 is the concentration of TP, now greater than 25 µg/l; c is a constant; and d1, …, d7 are regression coefficients.
All the regression coefficients can be obtained from HSI values using multiple regression. The probability of occurrence of the evaluation species is given by

P(Z) = Z / (1 + Z)    (3)

where P represents the probability of occurrence. The probability of occurrence can be computed for each location, each area, or each computational grid cell, and can be converted to a habitat suitability index value.
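To make the model structure concrete, the following Python sketch evaluates equations (1)-(3). It assumes the sign pattern reconstructed above, and the coefficient values in the example call are purely illustrative placeholders: in the actual model they must be fitted from HSI values by multiple regression.

import math

def suitability_Z(x1, x2, x3, x4, a, b, c, d):
    """Habitat suitability Z per equations (1) and (2): equation (1)
    applies while the TP concentration x4 is at most 25 ug/l, and
    equation (2) once it exceeds 25 ug/l. b and d are 1-indexed
    coefficient sequences (index 0 unused)."""
    if x4 <= 25.0:  # equation (1)
        return math.exp(a - b[1]*x1 - b[2]*x2 - b[3]*x3 + b[4]*x4
                        + b[5]*x2*x3 + b[6]*x3*x4 + b[7]*x1*x3
                        + b[8]*x2*x3*x4)
    # equation (2)
    return math.exp(c - d[1]*x1 - d[2]*x2 - d[3]*x3 - d[4]*x4
                    + d[5]*x2*x4 + d[6]*x3*x4 - d[7]*x2*x3*x4)

def occurrence_probability(z):
    """Equation (3): P(Z) = Z / (1 + Z)."""
    return z / (1.0 + z)

# Illustrative (not fitted) coefficients; index 0 is padding.
b = [0, 0.10, 0.02, 0.50, 0.03, 0.010, 0.004, 0.020, 0.001]
d = [0, 0.10, 0.02, 0.50, 0.05, 0.002, 0.003, 0.0005]
z = suitability_Z(x1=1.5, x2=20.0, x3=0.4, x4=12.0, a=0.5, b=b, c=0.5, d=d)
print(occurrence_probability(z))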
2.2 Habitat Suitability Index
Habitat quality for a selected evaluation species is documented with a Habitat Suitability Index (HSI) value. This value is derived from an evaluation of the ability of key habitat components to supply the life requisites of the selected evaluation species. An HSI value of 1 indicates that the suitability of the location is optimal for a certain habitat type. An HSI value lower than 1 indicates that one or more of the parameters included in the model is limiting. When the HSI value is 0, the location is not suitable [11].
The HABITAT model allows users to analyze the ecological functioning of study areas in an integrated and flexible manner. The steps are as follows: investigate the species information in the study area through site visits; build the HABITAT model; prepare the relevant GIS maps; estimate the effect of water quality parameters on the habitats in the study area; and run the model to obtain the results.
The schematic diagram of the HABITAT model is shown in Figure 1.
Fig. 1. Schematic diagram of HABITAT modeling: input maps and knowledge rules derived from hydrodynamic and water quality modelling feed the abiotic modeling step, which, combined with species thresholds, produces result maps of the habitat suitability index and habitat units for each species
3 Case Study
3.1 Study Area
The criteria applied in this paper for the selection of evaluation species are: sensitivity to changes in environmental factors; ecological importance in the local community and in the food chain; perceived value to human society; and availability of habitat suitability functions, or of information that could be translated into suitability functions.
Based on the site visit and the above criteria, the evaluation species of this impact analysis is a species particular to Yunnan Province: Schizothorax Yunnansis, a fish species unique to Lugu Lake that is now endangered. Sand, mud, and rock habitats were not considered in this impact assessment. All the habitat suitability functions for Schizothorax Yunnansis were found in the literature.
The HSI values of temperature for Schizothorax Yunnansis used in this study were obtained from a literature review. The HSI graphs for Schizothorax Yunnansis are shown in Figure 4. From Figure 4 it can be seen that when the temperature at a location is above 18 °C but below 26 °C, the suitability value for Schizothorax Yunnansis is 1. When temperatures are above 39 °C or below 3 °C, no Schizothorax Yunnansis will survive at that location. The tolerance limits of Schizothorax Yunnansis for TP are in the range from 10 µg/l to 100 µg/l. When the TP concentration is in the range from 12 µg/l to 25 µg/l, the HSI value equals 1; as the TP concentration keeps increasing, the HSI value decreases to 0. A TP concentration above 100 µg/l is considered lethal [13-15].
Fig. 4. HSI curves for Schizothorax Yunnansis as functions of temperature (°C), TP concentration (µg/l), flow velocity (m/s), and water depth (m)
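The threshold values just quoted pin down these curves at their breakpoints, and the Python sketch below encodes them as simple piecewise functions. The linear ramps between breakpoints are an assumption (Figure 4 defines the exact shape), though a linear fall from 25 to 100 µg/l does reproduce the HSI of roughly 0.2 reported later in Section 3.6 for 85.3 µg/l.

def hsi_temperature(t_c: float) -> float:
    """Temperature HSI for Schizothorax Yunnansis from the thresholds
    above: optimal (HSI = 1) between 18 and 26 C; no survival (HSI = 0)
    below 3 C or above 39 C. The linear ramps in between are an
    assumed shape."""
    if t_c <= 3.0 or t_c >= 39.0:
        return 0.0
    if 18.0 <= t_c <= 26.0:
        return 1.0
    if t_c < 18.0:
        return (t_c - 3.0) / (18.0 - 3.0)
    return (39.0 - t_c) / (39.0 - 26.0)

def hsi_tp(tp_ugl: float) -> float:
    """TP HSI: optimal (HSI = 1) between 12 and 25 ug/l, lethal at and
    above 100 ug/l. The rising limb below 12 ug/l is an assumption."""
    if tp_ugl >= 100.0:
        return 0.0
    if 12.0 <= tp_ugl <= 25.0:
        return 1.0
    if tp_ugl > 25.0:
        return (100.0 - tp_ugl) / (100.0 - 25.0)
    return max(0.0, tp_ugl / 12.0)

print(round(hsi_tp(85.3), 2))  # ~0.2, the value reported in Section 3.6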
The study area was meshed, and maps of its water depth and temperature were created according to the field investigation. Water pollution mainly stems from tourism on Lugu Lake and from the domestic discharges of local residents; since tourism is concentrated in Dazu village and Luoshui village, the two relatively developed areas, these two villages are regarded as the pollution sources. A Delft3D water quality model was built, and its forecasts for 2010 and 2060 were used as input data for the HABITAT model.
3.5 Different Scenarios
At the beginning of the simulation, the original distribution of the evaluation species' habitat in Lugu Lake should be considered. Because Lugu Lake spans two provinces, four assumptions are considered here: (a) no pollution enters Lugu Lake; (b) pollution enters only the Yunnan Province part; (c) pollution enters only the Sichuan Province part; (d) pollution enters both provinces. The habitat changes in the years 2010 and 2060 are presented for the four assumptions, respectively. The concentration of total phosphate (TP) is assumed to increase by 4% per year from 2010 to 2060, and the water quality result maps are provided by the Delft3D-WAQ model. Based on the four assumptions, the different scenarios in Lugu Lake are shown in Table 1.
Table 1. Description of the different scenarios in the Lugu Lake case

Scenario     Evaluation species        Decision rules                   Pollution source   Years
Scenario 1   Schizothorax Yunnansis    Temperature/Depth/Velocity       NA                 2010, 2060
Scenario 2   Schizothorax Yunnansis    Temperature/Depth/Velocity/TP    Luoshui Village    2010, 2060
Scenario 3   Schizothorax Yunnansis    Temperature/Depth/Velocity/TP    Dazu Village       2010, 2060
Scenario 4   Schizothorax Yunnansis    Temperature/Depth/Velocity/TP    Both               2010, 2060
The model results are presented in terms of changes in the water quality parameters for the four scenarios. The water temperature is taken to be 18 °C, and the results are discussed below.
The pollution only falls into Lugu Lake from Luoshui village
The habitat distributions for Schizothorax Yunnansis in Scenario 2 are shown in Figure 6. Compared with Figure 5, the habitat suitability area for Schizothorax Yunnansis near Luoshui village increased, indicating conditions more suitable for its survival. The reason is that phosphate emissions within a certain range act as nutrients for Schizothorax Yunnansis and thus facilitate its growth. By 2060, however, as the TP concentration keeps increasing, toxins build up in the bodies of the fish and cause their death. The water near Luoshui village therefore becomes completely unsuitable for the survival of Schizothorax Yunnansis.
The pollution only falls into Lugu Lake from Dazu village
The habitat distributions for Schizothorax Yunnansis in Scenario 3 are illustrated in Figure 7. The habitat suitability area near Dazu village increased for the same reason as in Scenario 2. However, the water quality near Dazu village becomes unsuitable for the survival of Schizothorax Yunnansis by 2060.
The pollution falls into Lugu Lake from Luoshui village and Dazu village
The habitat distributions for Schizothorax Yunnansis in Scenario 4 are shown in Figure 8. When the lake receives TP pollution from the two villages, reaching 12 µg/l (Dazu village) and 15 µg/l (Luoshui village), respectively, the result maps show that the area suitable for the evaluation species increased. This is because, as the TP concentration increased from 2 µg/l to 12 µg/l and 15 µg/l, the HSI value increased to 0.8 and 1.0, respectively.
Therefore, more Schizothorax Yunnansis can live in those locations, and the habitat suitability area increases. Fifty years later, however, the TP concentration grows to 85.3 µg/l and 106.6 µg/l, and the HSI decreases to 0.2 and 0, respectively. The habitat suitability area for Schizothorax Yunnansis therefore decreases significantly over those 50 years.
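These 2060 concentrations follow directly from the 4% annual growth assumed in Section 3.5, as the following short Python check confirms:

tp_2010 = {"Dazu": 12.0, "Luoshui": 15.0}       # ug/l, Scenario 4 starting values
for village, tp0 in tp_2010.items():
    # Compound 4% growth per year over the 50 years from 2010 to 2060.
    print(village, round(tp0 * 1.04 ** 50, 1))  # Dazu 85.3, Luoshui 106.6 ug/l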
From all the figures, it can be seen clearly that when the water pollutant concentration does not exceed the survival threshold of Schizothorax Yunnansis, the suitable habitat area increases; otherwise, the unsuitable habitat area increases.
4 Conclusions
This paper built an assessment model of water quality and the habitat of Schizothorax Yunnansis using a habitat evaluation method based on the HABITAT model. The results show that when the water pollutant concentration in Lugu Lake does not exceed the survival threshold of Schizothorax Yunnansis, the species obtains more nutrients and the suitable habitat area increases. Conversely, when the pollutant concentration exceeds the survival threshold, it can even cause the death of Schizothorax Yunnansis, and the unsuitable habitat area increases.
The results presented in this paper show that the HABITAT model can assist in ecological impact assessment studies by translating the results of hydrological and water quality models into effects on the natural environment and human society. It can also provide a basis for comprehensive watershed research and scientific management in the southwestern region of China.
Owing to the scarcity of monitoring data, the results of this work are biased relative to the actual situation. In addition, only four factors are considered: temperature, water depth, flow velocity, and TP concentration. These four factors, which provide the living environment for the species, are treated as independent, but in reality they are intertwined. Future work should therefore reinforce the bias analysis of the computed results and address the combined effect of the four factors on the species' survival environment.
References
1. Mynett, A.: Hydroinformatics in ecosystem restoration and management. In: Science-Technology and Management Panel, 3rd World Water Forum, Japan (2003)
2. Mynett, A.: Hydroinformatics Tools for Ecohydraulics Modeling. In: International Conference on Hydroinformatics, Singapore (2004)
3. Knudby, A., Brenning, A., LeDrew, E.: New approaches to modelling fish-habitat relationships. J. Ecological Modelling 221, 503–511 (2009)
4. Bovee, K.: Guide to stream habitat analysis using the instream flow incremental methodology. Western Energy and Land Use Team, Office of Biological Services, Fish and Wildlife Service, vol. 248, U.S. Dept. of the Interior (1982)
5. Bovee, K., Lamb, B., Bartholow, J., Stalnaker, C., Taylor, J., Henriksen, J.: Stream habitat analysis using the instream flow incremental methodology. Biological Resources Division Information and Technology Report, USGS (1998)
6. Hao, Z., Shang, S.: Multi-objective assessment method based on physical habitat simulation for calculating ecological river flow demand. J. Journal of Hydraulic Engineering 39, 557–561 (2008)
7. Hayes, J., Jowett, I.: Microhabitat models of large drift-feeding brown trout in three New Zealand rivers. J. North American Journal of Fisheries Management 14, 710–725 (1994)
8. Binns, N., Eiserman, F.: Quantification of fluvial trout habitat in Wyoming. J. Transactions of the American Fisheries Society 108, 215–228 (1979)
9. Haasnoot, M., Verkade, J., Bruijn, K.: HABITAT, a Spatial Analysis Tool for Environmental Impact and Damage Assessment. In: 8th Hydroinformatics Conference, Chile (2009)
10. Haasnoot, M., Kranenbarg, J., Van Buren, J.: Seizoensgebonden peilen in het IJsselmeergebied: Verkenning naar optimalisatie van het peil voor natuur binnen de randvoorwaarden van veiligheid, scheepvaart en watervoorziening (2005) (in Dutch)
11. User's Manual and Exercise of Physical Habitat Simulation Software. USGS
12. Jin, J., Yang, F.: Preliminary analysis on the hydrological characteristics of Lake Lugu. J. Chinese Academy of Sciences 3, 214–225 (1983)
13. Chen, Y., Zhang, W., Huang, S.: Speciation in Schizothoracid fishes of Lugu Lake (in Chinese). J. Acta Zoologica Sinica 28, 217–225 (1982)
14. Chen, Y.: The Fishes of the Hengduan Mountains Region (in Chinese). Science Press, Beijing (1998)
15. Jørgensen, S., Costanza, R., Xu, F.: Handbook of Ecological Indicators for Assessment of Ecosystem Health. CRC Press, Florida (2005)
Abstract. An application programming interface, or API, is an interface implemented by a software program that enables it to interact with other software. Many companies provide free API services that can be utilized in control systems. SCADA is an example of a control system: it collects data from various sensors at a factory, plant, or other remote locations and then sends this data to a central computer that manages and controls the data. In this paper, we design a scheme for weather conditions in an Internet SCADA environment that utilizes data from external API services. The scheme is designed to double-check the weather information in SCADA.
Keywords: SCADA, Control Systems, API.
1 Introduction
SCADA stands for Supervisory Control and Data Acquisition. SCADA refers to a system that collects data from various sensors at a factory, plant, or other remote locations and then sends this data to a central computer that manages and controls the data. Data acquisition is the process of retrieving control information from equipment that is out of order, may lead to some problem, or requires decisions to be taken according to its situation. This acquisition is done by continuously monitoring the equipment to which it is applied. The data acquired are then forwarded to a telemetry system ready for transfer to the different sites. To improve the accuracy of data and the performance of SCADA systems, we design a double-checking scheme for weather. This scheme uses data from weather API providers. Many API providers, such as Google and Yahoo, have weather APIs, which can give the weather conditions and forecast for a specific place.
hardware and are usually sold together. The main problem with these systems is the overwhelming reliance on the supplier of the system. Open software systems are designed to communicate with and control different types of hardware. They are popular because of the interoperability they bring to the system [1]. WonderWare and Citect are just two of the open software packages available on the market for SCADA systems. Some packages now include asset management integrated within the SCADA system.
Typically, SCADA systems include the following components [2]:
1. Operating equipment, such as pumps, valves, conveyors, and substation breakers, that can be controlled by energizing actuators or relays.
2. Local processors that communicate with the site's instruments and operating equipment.
3. Instruments in the field or in a facility that sense conditions such as pH, temperature,
pressure, power level and flow rate.
4. Short-range communications between the local processors and the instruments and operating equipment.
5. Long-range communications between the local processors and host computers.
6. Host computers that act as the central point of monitoring and control.
The measurement and control system of SCADA has one master terminal unit (MTU), which could be called the brain of the system, and one or more remote terminal units (RTUs). The RTUs gather data locally and send it to the MTU, which then issues suitable commands to be executed on site. A system of either standard or customized software is used to collate, interpret, and manage the data.
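As a rough illustration of this MTU/RTU division of labor, the Python sketch below simulates an MTU polling two RTUs and issuing a command when a reading crosses a limit. All names, readings, and the pressure threshold are invented for illustration and do not correspond to any particular SCADA product.

import random

random.seed(0)  # deterministic demo output

class RTU:
    """Remote terminal unit: gathers data locally (simulated here)."""
    def __init__(self, name):
        self.name = name
    def read(self):
        # Stand-in for a real field measurement.
        return {"site": self.name, "pressure_kpa": random.uniform(90.0, 110.0)}

class MTU:
    """Master terminal unit: polls the RTUs and issues commands."""
    def __init__(self, rtus, limit_kpa=105.0):
        self.rtus = rtus
        self.limit_kpa = limit_kpa
    def poll(self):
        for rtu in self.rtus:
            sample = rtu.read()
            print(f"{sample['site']}: {sample['pressure_kpa']:.1f} kPa")
            if sample["pressure_kpa"] > self.limit_kpa:
                # The MTU issues a suitable command to be executed on site.
                print(f"MTU -> {sample['site']}: open relief valve")

MTU([RTU("pump-station-1"), RTU("pump-station-2")]).poll()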
Supervisory Control and Data Acquisition (SCADA) is conventionally set up in a private network not connected to the Internet. This is done to isolate the confidential information as well as control of the system itself. Because of distance, the processing of reports, and emerging technologies, SCADA can now be connected to the Internet. This brings a lot of advantages and disadvantages, which are discussed in the following sections.
Conventionally, relay logic was used to control production and plant systems. With the advent of the CPU and other electronic devices, manufacturers incorporated digital electronics into relay logic equipment. Programmable logic controllers, or PLCs, are still the most widely used control systems in industry. As the need to monitor and control more devices in the plant grew, PLCs became distributed, and the systems grew more intelligent and smaller in size. PLCs (programmable logic controllers) and DCSs (distributed control systems) are used as shown in Figure 1.
2.1.1 Hardware and Software
SCADA systems are an extremely advantageous way to run and monitor processes. They are great for small applications, such as climate control, and can be used effectively in large applications, such as monitoring and controlling a nuclear power plant or a mass transit system.
SCADA can come with open, non-proprietary protocols. Smaller systems are extremely affordable and can either be purchased as a complete system or be mixed and matched with specific components. Large systems can also be created with off-the-shelf components. SCADA system software can also be easily configured for almost any application, removing the need for custom-made or intensive software development.
2.1.2 Human Machine Interface
A SCADA system includes a user interface, usually called the Human Machine Interface (HMI). The HMI of a SCADA system is where data is processed and
Most operating environments, such as MS-Windows, provide an API so that programmers can write applications consistent with the operating environment. Although APIs are designed for programmers, they are ultimately good for users because they guarantee that all programs using a common API will have similar interfaces. This makes it easier for users to learn new programs [5].
Many API providers, such as Google and Yahoo, have weather APIs, which can give the weather conditions and forecast for a specific place.
3 Proposed Solution
Weather APIs can be integrated into SCADA systems to double-check the weather conditions. The weather sensors of SCADA systems may not always gather correct data; this is crucial, and integrating APIs can improve the gathered data.
Fig. 5. SCADA Service Provider getting information from API service server
The SCADA controller, or SCADA master station, can get both the data from the sensor (x) and the data from the weather API (y). Usually, the controller bases its commands on the sensor data alone. Since we integrate the weather API into the system, we can also gather its data, and we propose taking the average of the sensor data and the API data to obtain the base data (z) on which the commands will be based.
z = (x + y) / 2    (1)
Formula (1) is the basis on which the SCADA controller executes commands to the remote terminals. In Figure 6, we can see the comparison between the gathered sensor data, the API data, and the average data. Note that there is sometimes a difference between the sensor data and the API data.
Fig. 6. Comparisons of Gathered Sensor Data, API Data and the Average
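A minimal Python sketch of this averaging step, with invented temperature readings for illustration (the symbols x, y, and z follow the reconstruction of formula (1) above):

def base_weather_value(sensor_value: float, api_value: float) -> float:
    """Formula (1): average the local sensor reading x with the value y
    reported by the external weather API to obtain the base value z on
    which the SCADA controller's commands are based."""
    return (sensor_value + api_value) / 2.0

# Illustrative readings: the local temperature sensor reports 21.4 C
# while the weather API reports 20.8 C for the same location.
print(base_weather_value(21.4, 20.8))  # 21.1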
4 Conclusion
SCADA systems are used to monitor and control a plant or equipment in industries such as telecommunications, water and waste control, energy, oil and gas refining, and transportation. A SCADA system gathers information, such as where a leak on a pipeline has occurred, transfers the information back to a central site, alerts the home station that the leak has occurred, carries out necessary analysis and control, such as determining whether the leak is critical, and displays the information in a logical and organized fashion. SCADA systems can be relatively simple, such as one that monitors the environmental conditions of a small office building, or incredibly complex, such as a system that monitors all the activity in a nuclear power plant or the activity of a municipal water system. The data gathered by the system is very important because the system reacts to the data it receives; if the data is inaccurate, the system's reactions can cause real damage. To improve the accuracy of data and the performance of SCADA systems, we design a double-checking scheme for weather conditions in an Internet SCADA environment. This scheme uses data from weather API providers; many, such as Google and Yahoo, offer weather API services.
Acknowledgement. This work was supported by the Security Engineering Research
Center, granted by the Korean Ministry of Knowledge Economy.
References
1. Bailey, D., Wright, E.: Practical SCADA for Industry (2003)
2. Hildick-Smith, A.: Security for Critical Infrastructure SCADA Systems (2005)
3. Wallace, D.: Control Engineering. How to put SCADA on the Internet (2003), http://www.controleng.com/article/CA321065.html (accessed January 2009)
4. Internet and Web-based SCADA,
http://www.scadalink.com/technotesIP.htm (accessed January 2009)
5. What is API? A word definition from the Webopedia Computer Dictionary,
http://www.webopedia.com/TERM/A/API.html (accessed April 2010)
1 Introduction
Intrusion detection methodologies can be roughly classified as based on statistical profiles, on known patterns of attacks (called signatures), or on a third approach, anomaly detection. An anomaly-based system detects computer intrusion and misuse by monitoring system activity and classifying it as either normal or anomalous. Intrusion detection is a type of security management system for computers and networks. An intrusion detection system gathers and analyzes information from various areas within a computer or a network to identify possible security breaches, which include both intrusion (attack from outside the organization) and misuse (attack from within the organization) [1]. The latter case includes seemingly authorized users, such as masqueraders operating under another user's identification (ID) and password, or outside attackers who successfully gained system access but eluded detection of their method of entry. In this paper we concentrate solely on the statistical profile-based system. In the following sections we further define the statistical profile-based system and intrusion confinement through isolation and its importance. We also present an isolation protocol for the file system.
Importance of intrusion confinement. Statistical profile-based systems compare relevant data, by statistical or other methods, with representative profiles of normal, expected activity on the system or network [2]. Deviations indicate suspicious behavior. In these systems, there are stringent requirements not only on reporting an intrusion accurately (necessary because abnormal behavior is not always an intrusion) but also on detecting as many intrusions as possible (usually, not all intrusions can be detected). Based on the assumption that the more significant the deviation, the larger the possibility that the behavior of a user is an intrusion, a significant anomaly is required to raise a warning in order to ensure a high degree of accuracy in intrusion reporting. Moreover, when the anomaly of an intrusion accumulates gradually, detecting it can still incur a long latency even if it is ultimately characterized by significant anomaly. As a result, substantial damage can be caused by an intruder within that latency.
2 Suspicious Behaviors
Suspicious behavior is behavior that may have already caused some damage, or may cause some damage later on, but was not reported as an intrusion when it happened. Suspicious behavior emerges in several situations:
(1) In statistical profile-based detection:
(a) In order to achieve a high degree of soundness of intrusion reporting, some intrusions characterized by gradual deviations may stay undetected. The corresponding behaviors can be reported as suspicious.
(b) For a detection with a long latency, the corresponding behavior can be reported as suspicious in the middle of the latency.
(c) Legitimate behavior can be reported as suspicious if it is sufficiently unlike the corresponding profile.
(2) In signature-based detection, partial matching of a signature can trigger a report of suspicious behavior.
3 Profile-Based Detection
In a statistical profile-based detection system, a user Ui accesses the system through sessions. A session of Ui begins when Ui logs in and ends when Ui logs out. A behavior of Ui is a sequence of actions that can last across the boundaries of sessions. A short-term behavior of Ui is a behavior composed of a sequence of Ui's most recent actions. In contrast, a long-term behavior of Ui is also a sequence of Ui's most recent actions, but much longer than the short-term behavior. We assume the intrusion detector is triggered every m actions (or m audit records); that is, after m new actions are executed, both the current short-term behavior and the long-term behavior of Ui are updated, and the deviation of the new short-term behavior from the new long-term behavior is computed. When a short-term behavior is updated, its oldest m actions are discarded and the newest m actions are added.
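The update rule just described is essentially a pair of sliding windows over the user's action stream. A toy Python sketch follows; the window lengths, the value of m, and the frequency-based deviation measure are all illustrative choices, since the paper leaves the exact metric open.

from collections import deque

M = 5            # the detector fires every m = 5 actions (illustrative)
SHORT_LEN = 5    # short-term window length (illustrative)
LONG_LEN = 200   # long-term window length (illustrative)

short_term = deque(maxlen=SHORT_LEN)   # Ui's most recent actions
long_term = deque(maxlen=LONG_LEN)     # a much longer recent history

def deviation(short, long):
    """Toy deviation measure: L1 distance between the action-frequency
    profiles of the two windows."""
    def freq(seq):
        seq = list(seq)
        total = max(len(seq), 1)
        return {a: seq.count(a) / total for a in set(seq)}
    fs, fl = freq(short), freq(long)
    return sum(abs(fs.get(a, 0.0) - fl.get(a, 0.0)) for a in set(fs) | set(fl))

def record_actions(new_actions):
    """Append m new actions; the bounded deques drop the oldest m
    automatically, mirroring the update rule described above."""
    assert len(new_actions) == M
    short_term.extend(new_actions)
    long_term.extend(new_actions)
    return deviation(short_term, long_term)

print(record_actions(["ls", "cat", "ls", "cd", "ls"]))      # 0.0: windows agree
print(record_actions(["scp", "scp", "ssh", "scp", "scp"]))  # 1.0: recent actions deviate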
3.1 Signature-Based Detection
A sequence of signature-based events leading from an initial state to a final compromised state is specified [3]. Each event causes a state transition from one state to another. We denote a signature of length n by sig(n) = s0 E1 s1 E2 s2 … En sn, where Ei is an event, si is a state, and Ei causes the state transition from si-1 to si. For simplicity, intra-event conditions are not explicitly shown in sig(n), although they are usually part of a signature.
A partial matching of a signature sig(n) is a sequence of events that matches a prefix of sig(n). A partial matching is not an intrusion; however, it can predict that an intrusion specified by sig(n) may occur. The accuracy of the prediction of a partial matching, denoted s0 E1 s1 … Em sm, can be measured by the following parameter: Pm, the probability that the partial matching leads to an intrusion later. Assume the number of behaviors that match the prefix is Np and the number of intrusions that match the prefix is Ni; then Pm = Ni / Np.
In signature-based detection, the set of actions that should be isolated is defined as follows: a behavior is suspicious if it matches the prefix of a signature but not the whole signature, and the Pm of the prefix is greater than or equal to a threshold determined by the SSO. Isolating suspicious behavior can surely confine damage in signature-based detection, because behavior that is actually an intrusion will, with high probability, be prevented from doing harm to the system.
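A small Python sketch of this suspicion test follows. The example signature, event names, and per-prefix Pm estimates are hypothetical, and matching is done on event sequences only, leaving out the interleaved states of the formal sig(n) notation.

def is_suspicious(events, signature, pm, threshold):
    """A behavior is suspicious if its event sequence matches a proper
    prefix of the signature (not the whole signature) and the prefix's
    Pm is at or above the SSO-chosen threshold. pm[k] is the estimated
    probability that a match of the length-k prefix leads to an
    intrusion (Pm = Ni / Np)."""
    n = len(events)
    if n == 0 or n >= len(signature):
        return False                      # empty, or a complete match
    if events != signature[:n]:
        return False                      # not a prefix match
    return pm.get(n, 0.0) >= threshold

# Hypothetical signature E1..E4 and per-prefix Pm estimates:
sig = ["login", "chmod", "copy_passwd", "crack"]
pm = {1: 0.01, 2: 0.15, 3: 0.70}
print(is_suspicious(["login", "chmod", "copy_passwd"], sig, pm, 0.5))  # True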
3.2 Application of Intrusion Confinement Support
In signature-based detection, the decision of whether to enforce intrusion confinement on a known attack specified by a signature depends on the seriousness of the damage that would be caused by the attack and on the value of Pm for each prefix of the signature.
In statistical profile-based detection, however, the decision can be tricky. Since relaxing the requirement on Re usually improves Rd, the SSO may want to find a tradeoff between Rd and Re so that the cost of isolation can be avoided. However, a satisfactory tradeoff may not be achievable in some systems, since the relationship between Rd and Re can differ dramatically from one system to another.
3.2.1 Architecture Support
The Policy Enforcement Manager enforces the access controls, in accordance with the system security policy, on every access request [4]. We assume no data access can bypass it. We further assume that users' accesses are audited in the audit trail.
The Intrusion Detection and Confinement Manager applies statistical profile-based detection techniques, signature-based detection techniques, or both to identify suspicious behavior as well as intrusions. The detection is typically processed based on the information provided by the audit trail.
The architecture of an intrusion confinement system from an information warfare perspective is shown in Figure 1.
When a suspicious behavior is detected, the corresponding user is marked suspicious. At this point, we first need to deal with the effects the user has already had on the Main Data Version, because these effects may have already caused some damage. In signature-based detection systems, we can accept these effects because a partial matching is not an intrusion. In statistical profile-based detection systems, if the SSO does not think these effects can cause any serious damage, we can accept them; if the SSO thinks they can cause intolerable damage, we can isolate and move these effects from the main data version to a separate Suspicious Data Version, which is created to isolate the user. The process of isolation may need to roll back some trustworthy actions that are dependent on the suspicious actions. At this point, we can apply another strategy that moves the effects of these suspicious actions, as well as the affected trustworthy actions, to the suspicious data version. Second, the Intrusion Detection and Confinement Manager notifies the Policy Enforcement Manager to direct the subsequent suspicious actions of the user to the separate data version. Since we focus on the isolation itself, we can simply assume that when a suspicious behavior starts to be isolated, no damage has yet been caused by the behavior. Note that several different suspicious users, e.g., S1, …, Sn, can be isolated at the same time; therefore, multiple suspicious data versions can exist at the same time.
When a suspicious user turns out to be malicious, that is, when his or her behavior has led to an intrusion, the corresponding suspicious data version can be discarded to protect the main data version from harm. On the other hand, when the user turns out to be innocent, the corresponding suspicious data version is merged into the main data version.
A suspicious behavior can turn out to be malicious in several ways:
(1) In signature-based detection, a complete matching can change a suspicious behavior to malicious.
(2) Some statistics of gradual anomaly, such as frequency and total number, can make the SSO believe that a suspicious behavior is malicious.
(3) The SSO can find that a suspicious behavior is malicious based on nontechnical evidence.
A suspicious behavior can turn out to be innocent in several ways:
(1) In signature-based detection, when no signature can be matched, the behavior proves innocent.
(2) The SSO can prove it innocent by nontechnical evidence; for example, the SSO can investigate the user directly.
(3) Some statistics of gradual anomaly can also make the SSO believe that a behavior is innocent.
After the damage is assessed, the Reconfiguration Manager reconfigures the system to allow access to continue in a degraded mode while repair is carried out by the Damage Recovery Manager. In many situations, damage assessment and recovery are closely coupled; for example, recovery from damage can occur during the process of identifying and assessing damage. The system can also be continuously reconfigured to reject accesses to newly identified damaged data objects and to allow access to newly recovered data objects. Interested readers can refer to [5] for more details on damage confinement, damage assessment, system reconfiguration, and damage recovery mechanisms in the database context.
5 Conclusion
In this paper we have shown that a second level of protection in addition to access control, namely intrusion confinement, can dramatically enhance the security, especially the integrity and availability, of a system in many situations. We showed that intrusion confinement can effectively resolve the conflicting design goals of an intrusion detection system by achieving both a high rate of detection and a low rate of errors. Developing more concrete isolation protocols will be studied further in our future research.
Acknowledgment
This work was supported by the Security Engineering Research Center, granted by the
Korean Ministry of Knowledge Economy.
References
[1] Graubart, R., Schlipper, L., McCollum, C.: Defending database management systems against information warfare attacks. Technical report, The MITRE Corporation (1996)
[2] Ammann, P., Jajodia, S., Liu, P.: Recovery from malicious transactions. Technical report, George Mason University, Fairfax, VA, http://www.isse.gmu.edu/~pliu/papers/dynamic.ps
[3] Jajodia, S., Liu, P., McCollum, C.: Application-level isolation to cope with malicious database users. In: Proceedings of the 14th Annual Computer Security Applications Conference, Phoenix, AZ, pp. 73–82 (1998)
[4] Northcutt, S.: Network Intrusion Detection. New Riders, Indianapolis (1999)
[5] Ilgun, K., Kemmerer, R., Porras, P.: State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering 21(3), 181–199 (1995)
Abstract. As an example of a control system, Supervisory Control and Data Acquisition systems can be relatively simple, such as one that monitors the environmental conditions of a small office building, or incredibly complex, such as a system that monitors all the activity in a nuclear power plant or the activity of a municipal water system. SCADA systems are basically process control systems, designed to automate systems such as traffic control, power grid management, and waste processing. Connecting SCADA to the Internet can provide a lot of advantages in terms of control, data viewing, and generation. SCADA infrastructures, like electricity, can also be part of a Smart Grid. However, connecting SCADA to a public network brings a lot of security issues. To answer these security issues, a SCADA communication security solution is proposed.
Keywords: SCADA, Security Issues, Encryption, Crossed Crypto-scheme, Smart Grid.
1 Introduction
Smart Grid refers to an improved electricity supply chain that runs from a major power plant all the way into your home. In short, there are thousands of power plants throughout the United States that generate electricity using wind energy, nuclear energy, coal, hydro, natural gas, and a variety of other resources, producing electricity at a certain voltage. The existing grid was built when energy was relatively inexpensive, and while minor upgrades have been made to meet increasing demand, the grid still operates the way it did almost 100 years ago: energy flows over the grid from central power plants to consumers, and reliability is ensured by maintaining excess capacity. Infrastructures like electricity, which are controlled by SCADA, can play a big role in Smart Grids.
SCADA systems are used to monitor and control a plant or equipment in industries
such as telecommunications, water and waste control, energy, oil and gas refining and
transportation. A SCADA system gathers information, such as where a leak on a pipeline has occurred, transfers the information back to a central site, alerting the home
station that the leak has occurred, carrying out necessary analysis and control, such as
determining if the leak is critical, and displaying the information in a logical and organized fashion. SCADA systems can be relatively simple, such as one that monitors
environmental conditions of a small office building, or incredibly complex, such as a
system that monitors all the activity in a nuclear power plant or the activity of a municipal water system.
In the next parts of this paper, we discuss SCADA, its conventional setup, and the Smart Grid, as well as the advantages that can be attained using the Smart Grid. Security issues are pointed out, and we suggest a security solution for Web-based SCADA using symmetric key encryption.
information on a number of operator screens. The systems automatically control the actions and the process of automation.
Conventionally, relay logic was used to control production and plant systems. With the advent of the CPU and other electronic devices, manufacturers incorporated digital electronics into relay logic equipment. Programmable logic controllers, or PLCs, are still the most widely used control systems in industry. As the need to monitor and control more devices in the plant grew, PLCs became distributed, and the systems grew more intelligent and smaller in size. PLCs (programmable logic controllers) and DCSs (distributed control systems) are used as shown in Figure 1.
Fig. 1. Common SCADA Installation utilizing Remote Terminals (PLC/DCS, Sensors) and
Master Station connected using a fieldbus
4 Smart Grid
A smart grid includes an intelligent monitoring system that keeps track of all electricity flowing in the system. It also incorporates superconductive transmission lines for less power loss, as well as the capability to integrate alternative sources of electricity such as solar and wind. When power is least expensive, a smart grid could turn on selected home appliances, such as washing machines, or factory processes that can run at arbitrary hours. At peak times it could turn off selected appliances to reduce demand. Similar proposals include the smart electric grid, smart power grid, intelligent grid (or intelligrid), FutureGrid, and the more modern intergrid and intragrid. In principle, the smart grid is a simple upgrade of 20th century power grids, which generally "broadcast" power from a few central power generators to a large number of users; it would instead be capable of routing power in more optimal ways to respond to a very wide range of conditions and of charging a premium to those who use energy at peak hours.
The conditions to which a smart grid, broadly stated, could respond occur anywhere in the power generation, distribution, and demand chain. Events may occur in the environment generally (clouds blocking the sun and reducing the amount of solar power, a very hot day), commercially in the power supply market (prices to meet a high peak demand), locally on the distribution grid (an MV transformer failure requiring a temporary shutdown of one distribution line), or in the home (someone leaving for work, putting various devices into hibernation, data ceasing to flow to an IPTV), all of which motivate a change to power flow.
Latency of the data flow is a major concern, with some early smart meter architectures allowing as much as a 24-hour delay in receiving the data, preventing any possible reaction by either supplying or demanding devices [3].
The Smart Grid is the application of modern information, communication, and electronics technology to the electricity delivery infrastructure, as shown in Figure 2.
The earliest, and still largest, example of a smart grid is the Italian system installed by Enel S.p.A. of Italy. Completed in 2005, the Telegestore project was highly unusual in the utility world because the company designed and manufactured its own meters, acted as its own system integrator, and developed its own system software. The Telegestore project is widely regarded as the first commercial-scale use of smart grid technology in the home, and delivers annual savings of 500 million euro at a project cost of 2.1 billion euro [5].
A smart grid:
- Accommodates a wide variety of generation options, central and distributed, intermittent and dispatchable.
- Empowers the consumer: it interconnects with energy management systems in smart buildings to enable customers to manage their energy use and reduce their energy costs.
- Is self-healing: it anticipates and instantly responds to system problems in order to avoid or mitigate power outages and power quality problems.
- Is tolerant of attack: it mitigates and stands resilient to physical and cyber attacks.
- Provides the power quality needed by 21st century users.
- Fully enables competitive energy markets: real-time information, lower transaction costs, available to everyone.
- Optimizes assets: it uses IT and monitoring to continually optimize its capital assets while minimizing operations and maintenance costs, yielding more throughput per dollar invested.
The ciphertext of the message digest is decrypted using the ECC technique to obtain the message digest sent by the SCADA master. This value is compared with the computed message digest: if the two are equal, the message is accepted; otherwise it is rejected. This scenario is shown in Figure 5.
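The digest comparison at the heart of this check is easy to sketch. In the Python snippet below, the ECC decryption step of the crossed crypto-scheme is abstracted as a caller-supplied function, and SHA-256 stands in for whatever digest algorithm the deployment actually uses; both are assumptions for illustration.

import hashlib
from hmac import compare_digest

def verify_message(message: bytes, digest_ciphertext: bytes, ecc_decrypt) -> bool:
    """Receiver-side check: recover the digest sent by the SCADA master
    (via a caller-supplied ecc_decrypt function standing in for the ECC
    step of the crossed crypto-scheme), hash the received message, and
    accept only if the two digests match."""
    received_digest = ecc_decrypt(digest_ciphertext)
    computed_digest = hashlib.sha256(message).digest()  # illustrative hash choice
    return compare_digest(received_digest, computed_digest)

# Demo with a stand-in "decryption" that returns its input unchanged,
# as if the ECC step had already been performed:
msg = b"OPEN VALVE 7"
print(verify_message(msg, hashlib.sha256(msg).digest(), lambda c: c))  # True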
7 Conclusion
Smart Grid builds on many of the technologies already used by electric utilities, but adds communication and control capabilities that will optimize the operation of the entire electrical grid. Smart Grid is also positioned to take advantage of new technologies such as plug-in hybrid electric vehicles, various forms of distributed generation, solar energy, smart metering, lighting management systems, distribution automation, and many more. It is easy to observe that SCADA technology holds a lot of promise for the future.
The economic and performance advantages of this type of system are definitely attractive. The security of any future Smart Grid depends on successfully addressing the cyber security issues associated with the nation's current power grid. The implementation of Smart Grid will include the deployment of many new technologies and multiple communication infrastructures. In this paper, we proposed the integration of the crossed crypto-scheme into the SCADA system in the Smart Grid.
Acknowledgement. This work was supported by the Security Engineering Research
Center, granted by the Korean Ministry of Knowledge Economy.
References
1. Hildick-Smith, A.: Security for Critical Infrastructure SCADA Systems (2005)
2. Bailey, D., Wright, E.: Practical SCADA for Industry (2003)
3. http://earth2tech.com/2008/05/01/silver-springs-the-cisco-of-smart-grid/ (accessed May 2010)
4. http://earth2tech.com/2009/05/20/utility-perspective-why-partner-with-google-powermeter/ (accessed May 2010)
5. National Energy Technology Laboratory: NETL Modern Grid Initiative: Powering Our 21st-Century Economy. United States Department of Energy, Office of Electricity Delivery and Energy Reliability, p. 17 (2007), http://www.netl.doe.gov/smartgrid/referenceshelf/whitepapers/Modern%20Grid%20Benefits_Final_v1_0.pdf (accessed May 2010)
6. Balitanas, M., Robles, R.J., Kim, N., Kim, T.: Crossed Crypto-scheme in WPA PSK Mode.
In: Proceedings of BLISS 2009, Edinburgh, GB, IEEE CS, Los Alamitos (August 2009)
7. Federal Information Processing Standards Publication 197, Announcing the Advanced Encryption Standard (AES) (2001),
http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
(accessed January 2009)
Abstract. Secure storage systems should consider the integrity and authentication of long-term stored information. When information is transferred through
communication channels, different types of digital information can be represented, such as documents, images, and database tables. The authenticity of such
information must be verified, especially when it is transferred through communication channels. Authentication verification techniques are used to verify that the
information in an archive is authentic and has not been intentionally or maliciously altered. In addition to detecting malicious attacks, verifying the integrity
also identifies data corruption. The purpose of Message Authentication Code
(MAC) is to authenticate messages, where MAC algorithms are keyed hash functions. In most cases, MAC techniques use iterated hash functions, and these
techniques are called iterated MACs. Such techniques usually use a MAC key as
an input to the compression function, and this key is involved in the compression
function, f, at every stage. Modification detection codes (MDCs) are un-keyed
hash functions, and are widely used by authentication techniques such as MD4,
MD5, SHA-1, and RIPEMD-160. There have been new attacks on hash functions such as MD5 and SHA-1, which requires the introduction of more secure
hash functions. In this paper, we introduce a new MAC methodology that uses
an input MAC key in the compression function, to change the order of the message words and shifting operation in the compression function. The new methodology can be used in conjunction with a wide range of modification detection
code techniques. Using the SHA-1 algorithm as a model, a new NMACA-(SHA-1) algorithm is presented. The NMACA-(SHA-1) algorithm uses the MAC key to build
the hash functions by defining the order for accessing source words and defining
the number of bit positions for circular left shifts.
Keywords: Secure Hash Algorithm, Hash-Based Message Authentication
Codes.
1 Introduction
The fast growth of open systems and the adoption of the Internet in daily business
life have opened up many opportunities. A new set of challenges has also arisen and
continues to tax this new way of computing. These include the ability of the open
systems to provide security and data integrity [1]. Computer systems typically require
authentication. At the same time, attackers try to penetrate information systems, and
their goal is often to compromise the authentication mechanism protecting the system
from unauthorized access [2]. Usually, integrity comes in conjunction with the authentication algorithm. Integrity refers to assuring that the receiver receives what the
sender has transmitted, with no accidental or intentional unauthorized modification
having affected the transmitted data [3], [5], [6]. Many modification detection code
algorithms (MDC) [3], [7], [8] are block-cipher based. Those with relatively short
MAC bit lengths (e.g., 32-bits for MAA [8]) or short keys (e.g., 56 bits for MACs
based on DES-CBC [3]) may still offer adequate security. Many iterated MACs can
be described as iterated hash functions. In this case, the MAC key is generally part of
the output transformation; it may also be an input to the compression function in the
first iteration, and be fed to the compression function at every stage. An upper bound on the security of such MACs should be kept in mind: for an iterated compression function with an n-bit internal chaining variable whose m-bit result is fully determined by the message (i.e., the function is deterministic), an internal collision, and hence a forgery, can be found after roughly 2^(n/2) known text-MAC pairs [13]. Most of the commonly used
MAC algorithms are based on block ciphers that make use of cipher-block-chaining
(CBC) [1], [4], [7], [6]. A common suggestion is to construct a MAC algorithm from
a modification detection code (MDC) algorithm, by simply including a secret key, k,
as part of the MDC input [3], [9], [10], [6]. A more conservative approach for building a MAC from an MDC is to make the MAC compression function depend on k,
implying that the secret key is involved in all of the intervening iterations. This provides additional protection in case a weakness in the underlying hash function becomes known. Such a technique has been employed with MD5, with performance only slightly slower than that of MD5 itself. Alternatively, a MAC technique can be based on cyclic redundancy codes [8], [11]. There have been new attacks on hash functions such as MD5 and SHA-1, which require the introduction of more secure hash functions. In [18], an attack on MD5 was introduced that finds collisions of MD5 in about 15 minutes to an hour of computation time. The attack is a differential attack in which XORs are not used as the measure of difference; instead, it uses modular integer subtraction. The same attack could find collisions of MD4 in less than a second. In [10], the attack was applied to SHA-0 and all variants of SHA-1. The attack could find real collisions of SHA-0 in less than 2^39 hash operations. Applied to a reduced-round variant of SHA-1, collisions could be found in less than 2^33 hash operations. In
this paper, we adopt the NMACA technique that was introduced in [4]. NMACA uses
an input MAC key in the compression function, to change the order of the message
words and shifting operation in the compression function. The new methodology can
be used in conjunction with a wide range of modification detection code techniques.
Using the SHA-1 algorithm as a model, a new NMACA (SHA-1) algorithm is presented. The (SHA-1) NMACA algorithm uses the MAC key in building the hash
functions by defining the order for accessing source words and defining the number of
bit positions for circular left shifts. The rest of this paper is organized as follows: Section 2 reviews the literature on message authentication codes, Section 3 describes the new methodology, Section 4 discusses the security of NMACA-(SHA-1), Section 5 presents the experimental results, and Section 6 concludes the paper.
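To make the constructions above concrete, the following minimal Python sketch shows the naive secret-prefix approach, in which a MAC is obtained from an un-keyed MDC simply by hashing the secret key together with the message. The sketch is illustrative only and is not part of the proposed algorithm; the function name and key are arbitrary.

import hashlib

def secret_prefix_mac(key: bytes, message: bytes) -> bytes:
    # Naive MAC from an un-keyed MDC (here SHA-1): the secret key k is
    # simply included as part of the MDC input. The key enters only the
    # first block, which is why more conservative designs such as NMACA
    # involve the key at every iteration of the compression function.
    return hashlib.sha1(key + message).digest()

tag = secret_prefix_mac(b"sixteen-byte-key", b"archived document")
print(tag.hex())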
2 Literature Review
There are two major approaches to implementing authentication/integrity mechanisms,
the use of a digital signature and the use of a Message Authentication Code [13], [14],
[6]. The digital signature approach uses two keys, a public and private key. A sender
signs a message digitally by computing a hash function (or checksum) over the data,
and then encrypts the hash function value using the private key. The encrypted hashing
value is sent to the receiver accompanied by the data. The receiver verifies the authenticity of the received data by recalculating the hash value and decrypting the transmitted hashing value using the public key. The two hash values are then compared. If they
match, the message is authentic and came from the claimed sender. In a Message Authentication Code (MAC), a shared secret key is used instead of the private key. There
are several ways to provide authentication/integrity by using the secret key [15, 16].
The two main ones are Hash-Based Message Authentication Codes (HMAC) and Encryption-Based Message Authentication Codes. In HMAC, a strong hash algorithm such as MD5 or SHA-1 is used to compute a hash value over the data and the embedded secret key. The sender concatenates a symmetric key with the message using one of several embedding strategies. A hashing algorithm generates a MAC
value, and the MAC value is appended to the message. The sender sends the message
with the attached MAC value to the receiver. At the receiver side, the same hash function is applied to the concatenated data and key to again generate the MAC value. The
receiver compares the two MAC values. If they are the same, the message has not been
modified [17]. The structure of the HMAC algorithm is illustrated in Fig. 1.
Fig. 1. Hash-Based Message Authentication Code structure: the sender (Tx) and the receiver (Rx) each apply the hashing algorithm to the message, and the receiver compares the resulting MAC values (MAC Value 1 and MAC Value 2)
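As a concrete illustration of the flow in Fig. 1, the short Python sketch below uses the standard hmac module, which implements the HMAC of FIPS 198 [6], with SHA-1 as the underlying hash; the key, message, and function names are illustrative only.

import hashlib
import hmac

def tx_make_tag(key: bytes, message: bytes) -> bytes:
    # Sender (Tx): hash the message together with the embedded secret key.
    return hmac.new(key, message, hashlib.sha1).digest()

def rx_verify(key: bytes, message: bytes, tag: bytes) -> bool:
    # Receiver (Rx): recompute the MAC value and compare the two values.
    expected = hmac.new(key, message, hashlib.sha1).digest()
    return hmac.compare_digest(expected, tag)

key = b"shared-secret-key"
msg = b"archive record 42"
tag = tx_make_tag(key, msg)
assert rx_verify(key, msg, tag)              # unmodified: MAC values match
assert not rx_verify(key, msg + b"x", tag)   # modified: MAC values differ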
3 The New Methodology
The new NMACA-(SHA-1) involves the key, K, in the different steps of the SHA-1 algorithm: K is used to determine the access order for message words and the shift amounts in the distinct rounds. In the proposed algorithm, only two changes are made to the SHA-1 algorithm. Both algorithms use the same initial chaining values (IVs), apply the same hash functions, and use the same padding process to
adjust the length of the transmitted message. The two differences between the SHA-1
and the NMACA-(SHA-1) techniques are the order of the message words, and the
order of the values that are used in circular left shifts on the four auxiliary functions.
The 160-bit secret key, K, is divided into two parts, each of 80 bits. The first 80 bits
are used to rearrange the order of the message words. These 80 bits of K are divided
into four divisions, each with 20 bits. Each of these 20-bit divisions of K is used to
rearrange a block of 20 words (32-bit words) of the input message. Starting from a
vector of a predefined order of accessing each of the 20 words, the order of such a vector is changed according to the related part of K.
[Figure: one step of the NMACA-(SHA-1) compression function, operating on the chaining variables At, Bt, Ct, Dt, Et with the auxiliary function ft, key-defined rotation s[j], fixed rotation by 30, message word Xz[j], and additive constant Kt]
The other 80 bits of K are used to define
the numbers of circular left shifts used in the four auxiliary functions. Using the same key-related permutation method, the order of the circular shifts in each of the four auxiliary functions is defined by one of the 20-bit divisions of K.
The proposed algorithm is as fast and as robust as SHA-1. The algorithm is depicted below.
INPUT: bit string x of arbitrary bit length b ≥ 0 and a 160-bit key K.
OUTPUT: 160-bit hash-code of x.
Define five 32-bit initial chaining values (IVs):
h1 = 0x67452301
h2 = 0xefcdab89
h3 = 0x98badcfe
h4 = 0x10325476
h5 = 0xc3d2e1f0
Define four auxiliary functions, f, g, h, and i. Each takes as input three 32-bit words and produces, as an output, a 32-bit word. They apply the logical operators and (∧), or (∨), not (¬), and xor (⊕) to the input bits:
f(B, C, D) = (B ∧ C) ∨ (¬B ∧ D)
g(B, C, D) = B ⊕ C ⊕ D
h(B, C, D) = (B ∧ C) ∨ (B ∧ D) ∨ (C ∧ D)
i(B, C, D) = B ⊕ C ⊕ D
Define four additive constants, one for each round:
y1 = 0x5a827999, used for steps 0 ≤ i ≤ 19 (round 1)
y2 = 0x6ed9eba1, used for steps 20 ≤ i ≤ 39 (round 2)
y3 = 0x8f1bbcdc, used for steps 40 ≤ i ≤ 59 (round 3)
y4 = 0xca62c1d6, used for steps 60 ≤ i ≤ 79 (round 4)
Define the number of bit positions for left shifts (rotates) by using the secret key K:
z[0..79] = permutation Ps of the last 80 bits of K, where Ps : {0, 1, ..., 79} → {0, 1, ..., 79}.
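The exact rule by which a 20-bit division of K rearranges its block of 20 positions is not spelled out above, so the following Python sketch is one plausible reading rather than a reference construction: each key bit decides whether a pair of adjacent positions in the predefined order is swapped.

def derive_permutation(subkey20: int, base: list[int]) -> list[int]:
    # Rearrange a predefined 20-element access order using one 20-bit
    # division of K (illustrative rule: bit b swaps positions b and b+1).
    order = list(base)
    for bit in range(19):
        if (subkey20 >> bit) & 1:
            order[bit], order[bit + 1] = order[bit + 1], order[bit]
    return order

def nmaca_orders(key: bytes) -> tuple[list[int], list[int]]:
    # Split a 160-bit key K into two 80-bit halves: the first defines the
    # word access order, the second the rotation-amount order
    # (hypothetical layout consistent with the description above).
    assert len(key) == 20  # 160 bits
    k = int.from_bytes(key, "big")
    tables = []
    for half in (k >> 80, k & ((1 << 80) - 1)):
        table = []
        for d in range(4):  # four 20-bit divisions per 80-bit half
            sub = (half >> (60 - 20 * d)) & 0xFFFFF
            table.extend(derive_permutation(sub, list(range(20 * d, 20 * (d + 1)))))
        tables.append(table)
    word_order, shift_order = tables
    return word_order, shift_order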
3.2 Preprocessing
The message padding makes the final padded message a multiple of 512 bits, as follows. First, append a 1 bit followed by as many 0 bits as necessary to make the padded message 64 bits short of a multiple of 512 bits. Finally, a 64-bit integer I, representing the length of the original message, is appended to the end of the zero-padded message to produce a final padded message of length n × 512 bits. The padded message is then processed by NMACA-(SHA-1) as n 512-bit blocks. The formatted input consists of 16m 32-bit words: x_0, x_1, ..., x_{16m-1}.
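A minimal Python sketch of this padding step (standard SHA-1 padding, which the text above states is unchanged in NMACA-(SHA-1)):

def pad_message(message: bytes) -> bytes:
    # Append a 1 bit (0x80 = 1 bit plus seven 0 bits), then 0 bits until
    # the length is 64 bits short of a multiple of 512 bits, then the
    # original bit length I as a 64-bit big-endian integer.
    bit_len = len(message) * 8
    padded = message + b"\x80"
    while len(padded) % 64 != 56:  # 56 bytes = 448 bits = 512 - 64
        padded += b"\x00"
    return padded + bit_len.to_bytes(8, "big")

assert len(pad_message(b"abc")) % 64 == 0  # a whole number of 512-bit blocks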
Initialize: (h1, h2, h3, h4, h5) → (H1, H2, H3, H4, H5).
Processing
For each i, 0 ≤ i ≤ m − 1, copy the ith block of sixteen 32-bit words into temporary storage:
X[j] = x_{16i+j}, 0 ≤ j ≤ 15,
and expand the sixteen words to eighty words:
X[j] = (X[j−3] ⊕ X[j−8] ⊕ X[j−14] ⊕ X[j−16]) <<< 1, 16 ≤ j ≤ 79.
Eighty compression steps are then applied as in SHA-1, except that step t uses the message word selected by the key-defined access order and a key-defined rotation amount, as described above. After the last step, update the chaining values:
(H1, H2, H3, H4, H5) ← (H1 + A, H2 + B, H3 + C, H4 + D, H5 + E).
Completion
The final MAC value is the concatenation H1 ∥ H2 ∥ H3 ∥ H4 ∥ H5 (with the first and last bytes being the high-order byte of H1 and the low-order byte of H5, respectively).
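Putting the pieces together, the Python sketch below processes one 512-bit block in the NMACA-(SHA-1) style. It follows SHA-1 exactly except for the two key-defined changes described above; since no reference implementation is given, the way the key-defined amount replaces SHA-1's fixed rotation by 5 is our assumption (we reduce it modulo 32), and word_order and shift_order are the key-derived tables from the earlier sketch.

def rotl(x: int, n: int) -> int:
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def process_block(block: bytes, H: list[int],
                  word_order: list[int], shift_order: list[int]) -> list[int]:
    # Copy the block into sixteen 32-bit words, then expand to eighty.
    X = [int.from_bytes(block[4 * j:4 * j + 4], "big") for j in range(16)]
    for j in range(16, 80):
        X.append(rotl(X[j - 3] ^ X[j - 8] ^ X[j - 14] ^ X[j - 16], 1))
    A, B, C, D, E = H
    for t in range(80):
        if t < 20:
            f, y = (B & C) | (~B & D), 0x5A827999
        elif t < 40:
            f, y = B ^ C ^ D, 0x6ED9EBA1
        elif t < 60:
            f, y = (B & C) | (B & D) | (C & D), 0x8F1BBCDC
        else:
            f, y = B ^ C ^ D, 0xCA62C1D6
        # Key-defined word order and rotation amount (assumed reduced mod 32).
        tmp = (rotl(A, shift_order[t] % 32) + f + E + X[word_order[t]] + y) & 0xFFFFFFFF
        A, B, C, D, E = tmp, A, rotl(B, 30), C, D
    return [(h + v) & 0xFFFFFFFF for h, v in zip(H, (A, B, C, D, E))]

With pad_message and nmaca_orders from the earlier sketches, the full MAC would be obtained by iterating process_block over the 64-byte blocks, starting from the five IVs, and concatenating the five final 32-bit words.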
4 Security of NMACA-(SHA-1)
The key, K, strengthens the algorithm in two ways: the four rounds are executed with different, key-defined orders of rotation values, and a different, key-defined message word order is used in each round.
5 Experimental Results
In this section, experimental results are used to show the performance of SHA-1 and
NMACA-(SHA-1) in terms of their speed and avalanche effect. We performed our
experiments on a 2.6-GHz machine with 4 GB of RAM, running Microsoft Windows XP Professional, and used messages of up to 1 MB. Comparing the speed of the NMACA-(SHA-1) algorithm with that of SHA-1, we found that SHA-1 is faster, but the NMACA-(SHA-1) algorithm provides greater security, making it harder to crack. This outcome was predictable, since the only difference between SHA-1 and NMACA-(SHA-1) is the initial phase of NMACA-(SHA-1), in which the word order and the circular left shifts are defined. In addition, the experimental results showed that, over different bit changes in the MAC key K, the confusion effect of the proposed approach was almost 50%, which is considered an indicator of the quality of the hash function used. These results match those of SHA-1, which means that using NMACA with SHA-1 did not degrade the quality of the SHA-1 approach. The same applies to the diffusion effect.
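The confusion measurement can be reproduced in spirit with a few lines of Python. Since the authors' implementation is not available, the sketch below uses HMAC-SHA1 as a stand-in keyed hash; flipping a single random key bit in a good keyed hash should change close to half of the 160 output bits.

import hashlib
import hmac
import os
import random

def bit_diff(a: bytes, b: bytes) -> int:
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def confusion_effect(trials: int = 1000) -> float:
    # Flip one random bit of a 160-bit key and measure the average
    # fraction of output bits that change (the confusion effect).
    msg = os.urandom(64)
    changed = 0
    for _ in range(trials):
        key = bytearray(os.urandom(20))
        t1 = hmac.new(bytes(key), msg, hashlib.sha1).digest()
        i = random.randrange(160)
        key[i // 8] ^= 1 << (i % 8)
        t2 = hmac.new(bytes(key), msg, hashlib.sha1).digest()
        changed += bit_diff(t1, t2)
    return changed / (trials * 160)

print(f"confusion effect ~ {confusion_effect():.1%}")  # expect close to 50%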
6 Conclusions
We have proposed a new technique that is based on the Message Authentication Code
approach NMACA. In NMACA, a 160-bit secret key, K, is used to determine the access order for message words and the shift amounts in distinct steps in each round. In
this paper, we adopted the SHA-1 technique to demonstrate the proposed NMACA
approach. The new technique is called the NMACA-(SHA-1) algorithm. In the experiments performed, the proposed algorithm, NMACA-(SHA-1), was compared to
SHA-1. The results showed that the speed performances were comparable. This was
because the only extra overhead needed for NMACA-(SHA-1) is the processing time
for defining the reordering of the message words and the circular left shifts. The diffusion and confusion effects were evaluated by measuring the effect of changes in the message bits (diffusion) and the MAC key bits (confusion). These two measures illustrate the avalanche effect of the underlying hash function. The experimental results showed that,
by using different bit changes in the input message and MAC key K, the confusion
and diffusion effects of the proposed approach were almost 50%, which is considered
an indicator of the quality of the hash function used. This proposed technique could
be useful in applications presented in [19].
References
1. Little, D., Skip, B., Elhilali, O.: Digital Data Integrity: The Evolution from Passive Protection to Active Management. John Wiley & Sons Ltd., England (2007)
2. Todorov, D.: Mechanics of User Identification and Authentication: Fundamentals of Identity Management. Auerbach Publications (2007)
3. Coskun, B., Memon, N.: Confusion/Diffusion Capabilities of Some Robust Hash Functions. In: Conference on Information Sciences and Systems, pp. 22–24 (2006)
4. Alghathbar, K., Hafez, A., Muhaya, F.B., Abdalla, H.M.: NMACA: A Novel Methodology
for Message Authentication Code Algorithms. In: 8th WSEAS Int. Conf. on Telecommunications and Informatics (TELE-INFO 2009), Turkey (2009)
5. Wright, C., Spillane, R., Sivathanu, G., Zadok, E.: Extending ACID Semantics to the File
System. ACM Transactions on Storage (2007)
6. The Keyed-Hash Message Authentication Code (HMAC). Federal Information Processing Standards Publication 198 (2002)
7. Haber, S., Kamat, P.: A Content Integrity Service for Long-Term Digital Archives. In: The IS&T Archiving 2006 Conference, Canada (2006)
8. Malone, D., Sullivan, W.: Guesswork is not a substitute for entropy. In: Proceedings of the
Information Technology and Telecommunications Conference (2005)
9. Fridrich, J.: Robust bit extraction from images. In: ICMCS 1999, Italy (1999)
10. Wang, X., Yin, Y.L., Yu, H.: Finding collisions in the full SHA-1. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 17–36. Springer, Heidelberg (2005)
11. Zahur, Y., Yang, T.A.: Wireless LAN Security and Laboratory Designs. Journal of Computing Sciences in Colleges 19, 44–60 (2004)
12. Por, L.Y., Lim, X.T., Su, M.T., Kianoush, F.: The Design and Implementation of Background Pass-Go Scheme Towards Security Threats. WSEAS Transactions on Information Science and Applications 5(6), 943–952 (2008)
13. Preneel, B., van Oorschot, P.C.: MDx-MAC and building fast MACs from hash functions. In: Coppersmith, D. (ed.) CRYPTO 1995. LNCS, vol. 963, pp. 1–14. Springer, Heidelberg (1995)
14. Venkatesan, R., Koon, S., Jakubowski, M., Moulin, P.: Robust image hashing. In: Proc.
IEEE Int. Conf. Image Processing (2000)
15. Shannon, C.E.: Communication theory of secrecy systems. Bell System Technical Journal 28, 656–715 (1949)
16. Stevens, M., Lenstra, A., de Weger, B.: Chosen-prefix collisions for MD5 and colliding X.509 certificates for different identities. In: Naor, M. (ed.) EUROCRYPT 2007. LNCS, vol. 4515, pp. 1–22. Springer, Heidelberg (2007)
17. Certified Information Systems Security Professional (CISSP) Book
18. Wang, X., Yu, H.: How to break MD5 and other hash functions. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 19–35. Springer, Heidelberg (2005)
19. Eldefrawy, M.H., Khan, M.K., Alghathbar, K., Cho, E.-S.: Broadcast Authentication for Wireless Sensor Networks Using Nested Hashing and the Chinese Remainder Theorem. Sensors 10(9), 8683–8695 (2010)
Author Index
Gafurov, Davrondzhon
Goi, Bok-Min 149
Gupta, Phalguni 187
94
Kang, Ju-Sung 39
Khan, Bilal 8
Khan, Mohammad Ibrahim 104
Khan, Muhammad Khurram 8, 161, 236
Khan, Zeeshan Sha 224
Kim, Cheol-Hong 104, 114
Kim, Haeng-kon 269
Kim, Inhyuk 134
Kim, Intae 74
Kim, Jong-Myon 104, 114
Kim, Jung-nye 68
Kim, Taehyoung 134
Kim, Tai-hoon 142, 269, 276, 282
Kisku, Dakshina Ranjan 187
Ko, Hoon 171
Jung, Jae-il
Lee, Junghyun
Wang, Ting-Hsuan 47
Yamamoto, Kenji 57
Yamauchi, Toshihiro 57, 84
Yang, Bian 1
Yau, Wei-Chuen 149
Yi, Ok-Yeon 39
Yim, Hong-bin 94