Advanced Computing, Networking and Informatics - Volume 1
Smart Innovation, Systems and Technologies
Volume 27
Series editors
Robert J. Howlett, KES International, Shoreham-by-Sea, UK
e-mail: [email protected]
Lakhmi C. Jain, University of Canberra, Canberra, Australia
e-mail: [email protected]
The Smart Innovation, Systems and Technologies book series encompasses the topics
of knowledge, intelligence, innovation and sustainability. The aim of the series is to
make available a platform for the publication of books on all aspects of single and
multi-disciplinary research on these themes in order to make the latest results available
in a readily-accessible form. Volumes on interdisciplinary research combining two or
more of these areas are particularly sought.
The series covers systems and paradigms that employ knowledge and intelligence in
a broad sense. Its scope is systems having embedded knowledge and intelligence, which
may be applied to the solution of world problems in industry, the environment and the
community. It also focusses on the knowledge-transfer methodologies and innovation
strategies employed to make this happen effectively. The combination of intelligent
systems tools and a broad range of applications introduces a need for a synergy of dis-
ciplines from science, technology, business and the humanities. The series will include
conference proceedings, edited collections, monographs, handbooks, reference books,
and other relevant types of book in areas of science and technology where smart systems
and technologies can offer innovative solutions.
High quality content is an essential feature for all book proposals accepted for the
series. It is expected that editors of all accepted volumes will ensure that contributions
are subjected to an appropriate level of reviewing process and adhere to KES quality
principles.
Malay Kumar Kundu · Durga Prasad Mohapatra
Amit Konar · Aruna Chakraborty
Editors
Advanced Computing,
Networking and
Informatics – Volume 1
Advanced Computing and Informatics
Proceedings of the Second International
Conference on Advanced Computing,
Networking and Informatics
(ICACNI-2014)
Editors

Malay Kumar Kundu
Machine Intelligence Unit
Indian Statistical Institute
Kolkata, India

Durga Prasad Mohapatra
Dept. of Computer Science and Engineering
National Institute of Technology Rourkela
Rourkela, India

Amit Konar
Dept. of Electronics and Tele-Communication Engineering
Artificial Intelligence Laboratory
Jadavpur University
Kolkata, India

Aruna Chakraborty
Dept. of Computer Science and Engineering
St. Thomas’ College of Engineering & Technology
Kidderpore, India
The conference committee, the editors, and the publisher deserve congratulations for organizing this very timely event (ICACNI-2014) and for nicely bringing out the archival volumes as its output.
Thanks are also due to the various universities and institutes for their active support towards this endeavor, and lastly to Springer-Verlag for publishing the proceedings under their prestigious Smart Innovation, Systems and Technologies (SIST) series.
Best wishes to the participants for an enjoyable and productive stay in Kolkata.
The twenty-first century has witnessed a paradigm shift in three major disciplines of
knowledge: (i) advanced/innovative computing, (ii) networking and wireless communications,
and (iii) informatics. While the first two are complete in themselves by their
titles, the last one covers several sub-disciplines involving geo-, bio-, medical and cognitive
informatics, among many others. Although the above three disciplines of knowledge
appear mutually exclusive, they are complementary, and their convergence is observed in
many real-world applications, encompassing cyber-security, internet banking, healthcare,
sensor networks, cognitive radio, pervasive computing and many others.
The International Conference on Advanced Computing, Networking and Infor-
matics (ICACNI) is aimed at examining the convergence of the above three modern
disciplines through interactions among three groups of people. The first group com-
prises leading international researchers, who have established themselves in one of
the above three thrust areas. The plenary, the keynote lecture and the invited talks are
organized to disseminate the knowledge of these academic experts among young re-
searchers/practitioners of the respective domain. The invited talks are also expected to
inspire young researchers to initiate/orient their research in respective fields. The second
group of people comprises Ph.D./research students working in the cross-disciplinary
areas, who might benefit from the first group and at the same time may help create
interest in the cross-disciplinary research areas among the academic community,
including young teachers and practitioners. Lastly, the group comprising undergraduate
and master's students would be able to test the feasibility of their research through
feedback on their oral presentations.
ICACNI is just passing its second birthday. Since its inception, it has attracted
a wide audience. This year, for example, the program committee of ICACNI received
as many as 646 papers. The acceptance rate is intentionally kept very low to ensure a
quality publication by Springer. This year, the program committee accepted only 148
papers from these 646 submitted papers. An accepted paper has essentially received
very good recommendations from at least two experts in the respective field.
To maintain a high standard of ICACNI, researchers from top international
research laboratories/universities have been included in both the advisory committee
and the program committee. The presence of these great personalities has helped the
conference to develop its publicity during its infancy and to promote its quality through
academic exchange among top researchers and scientific communities.
The conference includes one plenary session, one keynote address and four
invited speech sessions. It also includes 3 special sessions and 21 general sessions (al-
together 24 sessions) with a structure of 4 parallel sessions over 3 days. To maintain
good question-answering sessions and highlight new research results arriving from the
sessions, we selected subject experts from specialized domains as session chairs for
the conference. ICACNI also involved several persons to organize registration and to
take care of finance, hospitality of the authors/audience, and other support. To have a
closer interaction among the people of the organizing committee, all members of the
organizing committee have been selected from St. Thomas’ College of Engineering and
Technology.
The papers that passed the screening process by at least two reviewers, and that
were well formatted and nicely organized, have been considered for publication in the
Smart Innovation, Systems and Technologies (SIST) series of Springer. The hard-copy
proceedings include two volumes, where the first volume is named Advanced Computing
and Informatics and the second volume is named Wireless Networks and Security. The two
volumes together contain 148 papers of around eight pages each (in Springer LNCS
format), and thus the proceedings are expected to have an approximate length of 1184
pages.
The editors gratefully acknowledge the contribution of the authors and the en-
tire program committee without whose active support the proceedings could hardly
attain the present standards. They would like to thank the keynote speaker, the plenary
speaker, the invited speakers and also the invited session chairs, the organizing chair
along with the organizing committee and other delegates for extending their support in
various forms to ICACNI-2014. The editors express their deep gratitude to the Honorary
General Chair, the General Chair, the Advisory Chair and the Advisory board members
for their help and support to ICACNI-2014. The editors are obliged to Prof. Lakhmi C.
Jain, the academic series editor of the SIST series, Springer and Dr. Thomas Ditzinger,
Senior Editor, Springer, Heidelberg for extending their co-operation in publishing the
proceedings in the prestigious SIST series of Springer. They would also like to mention
the hard work of Mr. Indranil Dutta of the Machine Intelligence Unit of ISI Kolkata
for the editorial support. The editors also acknowledge the technical support they received
from the students of ISI Kolkata and Jadavpur University, and also from the faculty of
NIT Rourkela and St. Thomas’ College of Engineering and Technology, without which
the work could not have been completed in time. Lastly, the editors thank Dr. Sailesh
Mukhopadhyay, Prof. Gautam Banerjee and Dr. Subir Chowdhury of St. Thomas’ Col-
lege of Engineering and Technology for their support all the way long to make this
conference a success.
Kolkata Malay Kumar Kundu
April 14, 2014 Durga Prasad Mohapatra
Amit Konar
Aruna Chakraborty
Organization
Advisory Chair
Sankar K. Pal Distinguished Scientist, Indian Statistical Institute,
India
Chief Patron
Sailesh Mukhopadhyay St. Thomas’ College of Engineering and
Technology, Kolkata, India
Patron
Gautam Banerjea St. Thomas’ College of Engineering and
Technology, Kolkata, India
Honorary General Chair
Dwijesh Dutta Majumder Professor Emeritus, Indian Statistical Institute,
Kolkata, India
Institute of Cybernetics Systems and Information
Technology, India
Director, Governing Board, World Organization of
Systems and Cybernetics (WOSC), Paris
General Chairs
Rajib Sarkar Central Institute of Technology Raipur, India
Mrithunjoy Bhattacharyya St. Thomas’ College of Engineering and
Technology, Kolkata, India
Programme Chairs
Malay Kumar Kundu Indian Statistical Institute, Kolkata, India
Amit Konar Jadavpur University, Kolkata, India
Aruna Chakraborty St. Thomas’ College of Engineering and
Technology, Kolkata, India
Programme Co-chairs
Asit Kumar Das Bengal Engineering and Science University,
Kolkata, India
Ramjeevan Singh Thakur Maulana Azad National Institute of Technology,
India
Umesh A. Deshpande Sardar Vallabhbhai National Institute of
Technology, India
Organizing Chairs
Ashok K. Turuk National Institute of Technology Rourkela,
Rourkela, India
Rabindranath Ghosh St. Thomas’ College of Engineering and
Technology, Kolkata, India
Technical Track Chairs
Joydeb Roychowdhury Central Mechanical Engineering Research Institute,
India
Korra Sathyababu National Institute of Technology Rourkela, India
Manmath Narayan Sahoo National Institute of Technology Rourkela, India
Publication Chair
Sambit Bakshi National Institute of Technology Rourkela, India
Publicity Chair
Mohammad Ayoub Khan Center for Development of Advanced Computing,
India
Organizing Committee
Amit Kr. Siromoni St. Thomas’ College of Engineering and
Technology, Kolkata, India
Anindita Ganguly St. Thomas’ College of Engineering and
Technology, Kolkata, India
Arindam Chakravorty St. Thomas’ College of Engineering and
Technology, Kolkata, India
Dipak Kumar Kole St. Thomas’ College of Engineering and
Technology, Kolkata, India
Prasanta Kumar Sen St. Thomas’ College of Engineering and
Technology, Kolkata, India
Ramanath Datta St. Thomas’ College of Engineering and
Technology, Kolkata, India
Subarna Bhattacharya St. Thomas’ College of Engineering and
Technology, Kolkata, India
Supriya Sengupta St. Thomas’ College of Engineering and
Technology, Kolkata, India
Program Committee
Vinay. A Peoples Education Society Institute of Technology,
Bangalore, India
Chunyu Ai University of South Carolina Upstate, Spartanburg,
USA
Rashid Ali Aligarh Muslim University, Aligarh, India
C.M. Ananda National Aerospace Laboratories, Bangalore, India
Soumen Bag International Institution of Information Technology,
Bhubaneswar, India
Sanghamitra Bandyopadhyay Indian Statistical Institute, Kolkata, India
Punam Bedi University of Delhi, Delhi, India
Dinabandhu Bhandari Heritage Institute of Technology, Kolkata, India
Paritosh Bhattacharya National Institute of Technology, Agartala, India
Malay Bhattacharyya University of Kalyani, Kolkata, India
Sambhunath Biswas Indian Statistical Institute, Kolkata, India
Darko Brodic University of Belgrade, Bor, Serbia
Sasthi C. Ghosh Indian Statistical Institute, Kolkata, India
R.C. Hansdah Indian Institute of Science, Bangalore, India
Nabendu Chaki University of Calcutta, Kolkata, India
Goutam Chakraborty Iwate Prefectural University, Takizawa, Japan
Additional Reviewers
A.M., Chandrashekhar · Acharya, Anal · Agarwal, Shalabh · B.S., Mahanand · Bandyopadhyay, Oishila · Barpanda, Soubhagya Sankar · Basu, Srinka · Battula, Ramesh Babu · Bhattacharjee, Sourodeep · Bhattacharjee, Subarna · Bhattacharya, Indrajit · Bhattacharya, Nilanjana · Bhattacharyya, Saugat · Bhowmik, Deepayan · Biswal, Pradyut · Biswas, Rajib · Bose, Subrata · Chakrabarti, Prasun ·
Chakraborty, Debashis · Chakraborty, Jayasree · Chandra, Helen · Chandra, Mahesh · Chatterjee, Aditi · Chatterjee, Sujoy · Chowdhury, Archana · Chowdhury, Manish · Dalai, Asish · Darbari, Manuj · Das, Asit Kumar · Das, Debaprasad · Das, Nachiketa · Das, Sudeb · Datta, Biswajita · Datta, Shreyasi · De, Debashis · Dhabal, Supriya · Dhara, Bibhas Chandra · Duvvuru, Rajesh ·
Gaidhane, Vilas · Ganguly, Anindita · Garg, Akhil · Ghosh Dastidar, Jayati · Ghosh, Arka · Ghosh, Lidia · Ghosh, Madhumala · Ghosh, Partha · Ghosh, Rabindranath · Ghosh, Soumyadeep · Ghoshal, Ranjit · Goyal, Lalit · Gupta, Partha Sarathi · Gupta, Savita · Halder, Amiya · Halder, Santanu · Herrera Lara, Roberto · Jaganathan, Ramkumar · Kakarla, Jagadeesh · Kar, Mahesh · Kar, Reshma ·
Khasnobish, Anwesha · Kole, Dipak Kumar · Kule, Malay · Kumar, Raghvendra · Lanka, Swathi · Maruthi, Padmaja · Mishra, Dheerendra · Mishra, Manu · Misra, Anuranjan · Mohanty, Ram · Maitra, Subhamoy · Mondal, Jaydeb · Mondal, Tapabrata · Mukherjee, Nabanita · Mukhopadhyay, Debajyoti · Mukhopadhyay, Debapriyay · Munir, Kashif · Nasim Hazarika, Saharriyar Zia · Nasipuri, Mita ·
Neogy, Sarmistha · Pal, Monalisa · Pal, Tamaltaru · Palodhi, Kanik · Panigrahi, Ranjit · Pati, Soumen Kumar · Patil, Hemprasad · Patra, Braja Gopal · Pattanayak, Sandhya · Paul, Amit · Paul, Partha Sarathi · Phadikar, Amit · Phadikar, Santanu · Poddar, Soumyajit · Prakash, Neeraj · Rakshit, Pratyusha · Raman, Rahul · Ray, Sumanta · Roy, Pranab · Roy, Souvik · Roy, Swapnoneel
Machine Learning
Machine Learning Based Shape Classification Using Tactile Sensor Array ... 47
Dennis Babu, Sourodeep Bhattacharjee, Irin Bandyopadhyaya, Joydeb Roychowdhury
Image Analysis
Modified Majority Voting Algorithm towards Creating Reference Image for Binarization ... 221
Ayan Dey, Soharab Hossain Shaikh, Khalid Saeed, Nabendu Chaki
Multiple People Tracking Using Moment Based Approach ... 229
Sachin Kansal
Wavelets-Based Clustering Techniques for Efficient Color Image Segmentation ... 237
Paritosh Bhattacharya, Ankur Biswas, Santi Prasad Maity
An Approach of Optimizing Singular Value of YCbCr Color Space with q-Gaussian Function in Image Processing ... 245
Abhisek Paul, Paritosh Bhattacharya, Santi Prasad Maity
Motion Tracking of Humans under Occlusion Using Blobs ... 251
M. Sivarathinabala, S. Abirami
Efficient Lifting Scheme Based Super Resolution Image Reconstruction Using Low Resolution Images ... 259
Sanwta Ram Dogiwal, Y.S. Shishodia, Abhay Upadhyaya
Improved Chan-Vese Image Segmentation Model Using Delta-Bar-Delta Algorithm ... 267
Devraj Mandal, Amitava Chatterjee, Madhubanti Maitra
Document Analysis
A Novel Semantic Similarity Based Technique for Computer Assisted Automatic Evaluation of Textual Answers ... 393
Udit Kr. Chakraborty, Samir Roy, Sankhayan Choudhury
Representative Based Document Clustering ... 403
Arko Banerjee, Arun K. Pujari
A New Parallel Thinning Algorithm with Stroke Correction for Odia Characters ... 413
Arun K. Pujari, Chandana Mitra, Sagarika Mishra
Evaluation of Collaborative Filtering Based on Tagging with Diffusion Similarity Using Gradual Decay Approach ... 421
Latha Banda, Kamal Kanth Bharadwaj
Rule Based Schwa Deletion Algorithm for Text to Speech Synthesis in Hindi ... 429
Shikha Kabra, Ritika Agarwal, Neha Yadav
Unsupervised Word Sense Disambiguation for Automatic Essay Scoring ... 437
Prema Nedungadi, Harsha Raj
Ontological Analysis
Automatic Resolution of Semantic Heterogeneity in GIS: An Ontology Based Approach ... 585
Shrutilipi Bhattacharjee, Soumya K. Ghosh
Human-Computer Interfacing
Effects of Robotic Blinking Behavior for Making Eye Contact with Humans ... 621
Mohammed Moshiul Hoque, Quazi Delwar Hossian, Kaushik Deb
Improvement and Estimation of Intensity of Facial Expression Recognition for Human-Computer Interaction ... 629
Kunal Chanda, Washef Ahmed, Soma Mitra, Debasis Mazumdar
Cognitive Activity Recognition Based on Electrooculogram Analysis ... 637
Anwesha Banerjee, Shreyasi Datta, Amit Konar, D.N. Tibarewala, Janarthanan Ramadoss
Detection of Fast and Slow Hand Movements from Motor Imagery EEG Signals ... 645
Saugat Bhattacharyya, Munshi Asif Hossain, Amit Konar, D.N. Tibarewala, Janarthanan Ramadoss
Application of Bilinear Recursive Least Square Algorithm for Initial Alignment of SINS
B. Malakar and B.K. Roy

1 Introduction
The Strapdown Inertial Navigation System (SINS) [1] has special advantages and is being
widely adopted for the accurate positioning and navigation of missiles, aeroplanes, ships,
railway vehicles, etc. An INS is advantageous compared with the Global Positioning
System (GPS), as it is unaffected by external sources. However, the INS output is subject
to errors in the data supplied to the system over long durations, and also to errors arising
from inaccurate design and construction of the system components. Three types of errors
are mainly responsible for the rate at which the navigation error grows over long periods
of time: the initial alignment errors, sensor errors and computational errors. The initial
alignment [2-5] is the key technology in SINS, and it must provide an accurate result for
the given sensor configuration. The main purpose of initial alignment is to obtain the
initial coordinate transformation matrix from the body frame to the computer coordinate
frame; the misalignment angle is considered to be zero during the mathematical modeling.
The performance of an inertial navigation system is affected by
the alignment accuracy directly, as well as by the initial alignment time, which is mainly
responsible for the rapid response capability. Therefore there is a requirement for a
shorter alignment time with high precision in initial alignment.
At present, Kalman filtering [2] techniques are basically used to achieve the initial
alignment of an inertial navigation system, owing to their simplicity, and they are also
considered an effective method. However, in the conventional Kalman filtering technique
one must have prior knowledge of the mathematical models [6], and the noise statistics
must also be studied and considered. In such cases the conventional Kalman filter is
unable to provide a better and more efficient result.
After the introduction in section 1, section 2 briefly describes the initial alignment of SINS.
The description of the estimation algorithms is given in section 3. Section 4 deals with
the dynamic modeling used for the simulation. Section 5 deals with the dynamic
simulation of fine alignment. Section 6 presents the results, discussions and the
comparison between the proposed algorithms. Finally, we conclude the paper in
section 7.
An INS determines the position of the body frame by integrating the measured
acceleration and rotation rate. Since the position is always relative to the starting
position, the INS must know the position, attitude and heading before navigation begins.
The position is assumed to be known, but the attitude and heading need to be determined;
this determination is the process of alignment [7], [8].
In the coarse alignment stage, the measured acceleration and rotation rate in the body
frame are compared with the gravity vector G and the Earth's rotation rate, and the
transformation matrix from the carrier coordinates to the geographical coordinates is
directly estimated. Coarse alignment estimates the attitude and heading accurately enough
to justify the small-angle approximations made in the error model, and hence allows fine
alignment to use an adaptive filter based on the error model to obtain a more precise
alignment, which helps to determine the direction cosine matrix, or attitude matrix, Cnb
relating the navigation frame (n) and the body frame (b). For determining the orientation
of the body frame, the INS makes use of accelerometer and gyroscope measurements with
respect to a reference frame, and these are required for the estimation of the measured
value of Cnb.
The basics of alignment in SINS have been discussed, and they play an important role in
improving the initial alignment. Many algorithms have recently been proposed in the
literature for the estimation and optimization of errors in SINS [9-11]. A theoretical
background of the proposed algorithms is discussed in the next section.
3 Theoretical Background
Here k is the iteration number, x(k) denotes the input signal, y(k) is the adaptive-
filter output signal, and d(k) defines the desired signal [12]. The error signal e(k) is
calculated as d(k)-y(k). In order to determine the proper updating of the filter
coefficients, the error signal is then used to form a performance (or objective)
function that is required by the adaptation algorithm. The minimization of the
objective function implies that the adaptive-filter output signal is matching the desired
signal in some sense.
Here y(k) is the adaptive-filter output; for the bilinear case, the signal-information
vector and the coefficient vector are defined by (2) and (3):

$\varphi(k) = [\,x(k)\ \ x(k-1)\ \ \ldots\ \ x(k-M)\ \ \ y(k-1)\ \ y(k-2)\ \ \ldots\ \ y(k-N)\ \ \ x(k)y(k-1)\ \ \ldots\ \ x(k-I)y(k-L+1)\ \ x(k-I)y(k-L)\,]^{T}$   (2)

$\theta(k) = [\,b_{0}(k)\ \ b_{1}(k)\ \ \ldots\ \ b_{M}(k)\ \ \ -a_{1}(k)\ \ -a_{2}(k)\ \ \ldots\ \ -a_{N}(k)\ \ \ C_{0,1}(k)\ \ \ldots\ \ C_{I,L-1}(k)\ \ C_{I,L}(k)\,]^{T}$   (3)
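As an illustration of how (2) and (3) drive the adaptation, the short Python sketch below is not taken from the paper; the forgetting factor, the inverse-correlation matrix P and the zero-padding of early samples are assumptions made here. It assembles the bilinear regressor of (2) and applies one standard recursive least-squares update to the coefficient vector of (3):

```python
import numpy as np

def bilinear_regressor(x, y, k, M, N, I, L):
    """Assemble the signal-information vector phi(k) of Eq. (2):
    delayed inputs, delayed outputs and input-output cross terms."""
    xk = [x[k - i] if k - i >= 0 else 0.0 for i in range(M + 1)]
    yk = [y[k - j] if k - j >= 0 else 0.0 for j in range(1, N + 1)]
    cross = [x[k - i] * y[k - l] if (k - i >= 0 and k - l >= 0) else 0.0
             for i in range(I + 1) for l in range(1, L + 1)]
    return np.array(xk + yk + cross)

def brls_step(theta, P, phi, d, lam=0.99):
    """One recursive least-squares update of the coefficient vector
    theta(k) of Eq. (3), with forgetting factor lam."""
    y_hat = float(phi @ theta)              # bilinear filter output y(k)
    e = d - y_hat                           # a-priori error d(k) - y(k)
    g = P @ phi / (lam + phi @ P @ phi)     # gain vector
    theta = theta + g * e                   # coefficient update
    P = (P - np.outer(g, phi @ P)) / lam    # inverse-correlation update
    return theta, P, y_hat, e
```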
4 Dynamic Modeling
The model of a local-level NED (North-East-Down) [2] frame is used in this paper as
the navigation frame. From the alignment point of view, the East and North axes are
referred to as the leveling axes and the Down axis is called the azimuth axis. The position
and velocity errors are ignored. The state equation [2] can be represented as
$\dot{X} = AX + FW = \begin{bmatrix} P_1 & P_2 \\ 0_{5\times5} & 0_{5\times5} \end{bmatrix} X + \begin{bmatrix} P_2 \\ 0_{5\times5} \end{bmatrix} W$   (4)

where $X = [\,\delta V_N\;\; \delta V_E\;\; \phi_N\;\; \phi_E\;\; \phi_D\;\; \nabla_x\;\; \nabla_y\;\; \varepsilon_x\;\; \varepsilon_y\;\; \varepsilon_z\,]^T$,
$\delta V_N, \delta V_E$ are the north and east velocity errors respectively, $\nabla_x, \nabla_y$ are the accelerometer errors,
$\phi_N, \phi_E$ are the two level misalignment angles and $\phi_U$ is the azimuth misalignment angle,
$\varepsilon_x, \varepsilon_y, \varepsilon_z$ are the gyro errors, and $\omega_{ie}$ is the Earth rotation rate.
The observation equation for the initial alignment of SINS is given by

$Y = TX + V$   (5)

where $Y = [\,\delta V_N\;\; \delta V_E\,]^T$ and $T = [\,I_{2\times2}\;\; 0_{2\times8}\,]$.
$W = [\,w_{ax}\;\; w_{ay}\;\; w_{gx}\;\; w_{gy}\;\; w_{gz}\,]^T$

and

$P_2 = \begin{bmatrix} C_{11} & C_{12} & 0 & 0 & 0 \\ C_{21} & C_{22} & 0 & 0 & 0 \\ 0 & 0 & C_{11} & C_{12} & C_{13} \\ 0 & 0 & C_{21} & C_{22} & C_{23} \\ 0 & 0 & C_{31} & C_{32} & C_{33} \end{bmatrix}$
With the help of the transformation matrix, i.e. the posture matrix

$\begin{bmatrix} C_{11} & C_{12} & C_{13} \\ C_{21} & C_{22} & C_{23} \\ C_{31} & C_{32} & C_{33} \end{bmatrix},$

the NED coordinate system can be transformed into the body coordinate system, where
$C_{11} = \cos\phi_N\cos\phi_U - \sin\phi_E\sin\phi_N\sin\phi_U$, $C_{12} = \sin\phi_E\sin\phi_N\cos\phi_U + \cos\phi_N\sin\phi_U$, $C_{13} = -\sin\phi_N\cos\phi_E$,
$C_{21} = -\cos\phi_E\sin\phi_U$, $C_{22} = \cos\phi_E\cos\phi_U$, $C_{23} = \sin\phi_E$,
$C_{31} = \sin\phi_N\cos\phi_U + \cos\phi_N\sin\phi_E\sin\phi_U$, $C_{32} = \sin\phi_N\sin\phi_U - \cos\phi_N\sin\phi_E\cos\phi_U$, $C_{33} = \cos\phi_N\cos\phi_E$.
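For reproducibility, the direction-cosine expressions above can be coded directly; the following Python snippet is only an illustration (angles in radians are an assumption):

```python
import numpy as np

def posture_matrix(phi_N, phi_E, phi_U):
    """Posture (direction cosine) matrix built from the level misalignment
    angles phi_N, phi_E and the azimuth angle phi_U, all in radians."""
    sN, cN = np.sin(phi_N), np.cos(phi_N)
    sE, cE = np.sin(phi_E), np.cos(phi_E)
    sU, cU = np.sin(phi_U), np.cos(phi_U)
    return np.array([
        [cN * cU - sE * sN * sU,  sE * sN * cU + cN * sU, -sN * cE],
        [-cE * sU,                cE * cU,                 sE     ],
        [sN * cU + cN * sE * sU,  sN * sU - cN * sE * cU,  cN * cE],
    ])
```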
In order to evaluate the performance of the proposed Volterra RLS and Bilinear RLS
algorithms, an example of a static, self-aligned, stationary-base filtering program is
considered, which uses the NED coordinate frame [2]. The initial parameters used for
the simulation are as follows.
The inertial navigation system is located at longitude 125° and north latitude 45°; the
initial value X0 of the state variable X is set to zero; P0, Q and R are taken as the
corresponding values for medium-precision gyroscopes and accelerometers; the initial
misalignment angles φN, φE, φU are taken as 1°; the gyro constant drift is taken as
0.02°/h and the random drift as 0.01°/h; the accelerometer bias is taken as 100 µg and
the velocity errors as 0.1 m/s. Then

X0 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]T;
P0 = diag[(0.1 m/s)², (0.1 m/s)², (1°)², (1°)², (1°)², (100 µg)², (100 µg)², (0.02°/h)², (0.02°/h)², (0.02°/h)²];
Q = diag[(50 µg)², (50 µg)², (0.01°/h)², (0.01°/h)², (0.01°/h)², 0, 0, 0, 0, 0];
R = diag[(0.1 m/s)², (0.1 m/s)²].
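As a sketch (not the authors' MATLAB code; the conversions to SI units are assumptions made here for concreteness), the initialization above could be written as:

```python
import numpy as np

deg = np.pi / 180.0
deg_per_h = deg / 3600.0            # deg/h expressed in rad/s
ug = 1e-6 * 9.81                    # micro-g expressed in m/s^2

X0 = np.zeros(10)                                       # initial state estimate
P0 = np.diag([0.1**2, 0.1**2] +                         # velocity errors
             [(1.0 * deg)**2] * 3 +                     # misalignment angles
             [(100 * ug)**2] * 2 +                      # accelerometer biases
             [(0.02 * deg_per_h)**2] * 3)               # gyro drifts
Q  = np.diag([(50 * ug)**2] * 2 +
             [(0.01 * deg_per_h)**2] * 3 +
             [0.0] * 5)                                 # process noise
R  = np.diag([0.1**2, 0.1**2])                          # measurement noise
```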
Fig. 2. Azimuth MSE plot vs time samples using BRLS and VRLS
Fig. 3. Azimuth error plot vs time samples using BRLS and VRLS
Fig. 4. Azimuth estimation plot vs time samples using BRLS and VRLS
Fig. 5. Azimuth MSE plot vs time samples using BRLS and VRLS
Fig. 6. Azimuth error plot vs time samples using BRLS and VRLS
Fig. 7. Azimuth estimation plot vs time samples using BRLS and VRLS
In this section, the various outputs obtained from MATLAB, run on a laptop with
1 GB RAM and a 1.50 GHz processor, are shown, and their performance characteristics
are discussed in the next section.
The two algorithms are compared with each other by considering two different values of
noise, i.e., 0.05 rand(n) and 0.03 rand(n). The performance of the two algorithms in
estimating the azimuth angle error is shown in Fig. 3 and Fig. 6, the azimuth error
estimation is shown in Fig. 4 and Fig. 7, and the MSE of the azimuth angle error is shown
in Fig. 2 and Fig. 5. The performance comparison is given in Table 1.
Table 1. Performance of the proposed algorithm for estimating the azimuth angle error
7 Conclusions
This paper reports research on adaptive filtering techniques applied to improve the
estimation accuracy of the SINS azimuth angle error. From the simulation results it is
observed that the BRLS filtering algorithm is an effective algorithm for improving the
initial alignment accuracy and convergence speed of the strapdown inertial navigation
system.
References
1. Titterton, D., Weston, J.: Strapdown Inertial Navigation Technology. Institution of
Electrical Engineers (2004)
2. Wang, X., Shen, G.: A Fast and Accurate Initial Alignment Method for Strapdown Inertial
Navigation System on Stationary Base. Journal of Control Theory and Applications, 145–
149 (2005)
3. Sun, F., Zhang, H.: Application of a New Adaptive Kalman Filtering Algorithm in Initial
Alignment of INS. In: Proceedings of the IEEE International Conference on Information
and Automation, Beijing, China, pp. 2312–2316 (2011)
4. Gong-Min, V., Wei-Sheng, Y., De-Min, X.: Application of simplified UKF in SINS initial
alignment for large misalignment angles. Journal of Chinese Inertial Technology 16(3),
253–264 (2008)
5. Anderson, B.D.O., Moore, J.B.: Optimal Filtering. Prentice-Hall Inc., Englewood Cliffs
(1979)
6. Savage, P.G.: A unified mathematical framework for strapdown algorithm design. J. Guid.
Contr. Dyn. 29, 237–249 (2006)
7. Silson, P.M.: Coarse alignment of a ship’s strapdown inertial attitude reference system
using velocity loci. IEEE Transactions on Instrumentation and Measurement (2011)
8. Li, Q., Ben, Y., Zhu, Z., Yang, J.: A Ground Fine Alignment of Strapdown INS under a
Vibrating Base. J. Navigation 1, 1–15 (2013)
9. Wu, M., Wu, Y., Hu, X., Hu, D.: Optimization-based alignment for inertial navigation
systems: Theory and algorithm. Aeros. Science & Technology, 1–17 (2011)
10. Salychev, O.S.: Applied Estimation Theory in Geodetic and Navigation Applications.
Lecture Notes for ENGO 699.52, Department of Geomatics Engineering, University of
Calgary (2000)
11. Julier, S.J., Uhlmann, J.K.: Unscented filtering and nonlinear estimation. In: Proc. of the
IEEE Aerospace and Electronic Systems, pp. 401–422 (2004)
12. Diniz, P.S.R.: Adaptive Filtering: Algorithms and Practical Implementation, 3rd edn.
Springer
Time-Domain Solution of Transmission through Multi-modeled Obstacles for UWB Signals
S. Soni, B. Bansal, and R.S. Rao
Abstract. In this work, the time-domain solution for transmission through mul-
ti-modeled obstacles has been presented. The transmission through dielectric
wedge followed by a dielectric slab has been analyzed. The analytical time-
domain transmission and reflection coefficients for transmission through the
conductor-dielectric interface, considering oblique incidence, are given for both
soft and hard polarizations. The exact frequency-domain formulation for trans-
mitted field at the receiver has been simplified under the condition of low-loss
assumption and converted to time-domain formulation using inverse Laplace
transform. The time-domain results have been validated with the inverse fast
Fourier transform (IFFT) of the corresponding exact frequency-domain results.
Further the computational efficiency of both the methods is compared.
1 Introduction
In recent years, research on ultra wideband (UWB) propagation through indoor scenarios,
where non-line-of-sight (NLOS) communication is more dominant, has received great
attention because of unique properties of UWB communication such as resilience to
multipath phenomena, good resolution and low power density. In radio propagation of
UWB signals, especially in NLOS communication in deep shadow regions, the transmitted
field component proves to be very significant [1, 2]. Considering the huge
bandwidth (3.1-10.6 GHz) of UWB signals, it is more efficient to study UWB propa-
gation directly in time-domain (TD) where all the frequencies are treated simulta-
neously [3, 4]. The TD solution of transmitted field through a dielectric slab was
presented in [5]. A simplified TD model for UWB signals transmitting through a
dielectric slab was presented in [6, 7]. The TD solutions for the reflection and trans-
mission through a dielectric slab were presented in [8].
In this paper, we present an approximate TD solution for the field transmitted
through multi-modeled obstacles under the low-loss assumption, for UWB signals. In
other words, an accurate TD solution for modeling the transmission of UWB signals
through multi-modeled obstacles is presented. Transmission model is called multi-
2 Propagation Environment
The propagation environment is shown in Fig. 1, where a single dielectric wedge is
followed by a dielectric slab. The parameters ri , i = 1, 2...,5 are the distances traversed
by the transmitted field through the structure from the transmitter (Tx) up to the re-
ceiver (Rx).
Angles θ1, θ3, θ5, θ7 are the incidence angles, with θ2, θ4, θ6, θ8 as the angles of refraction
at points ‘P’, ‘Q’, ‘R’ and ‘S’ respectively, and ai is the internal wedge angle.
The parameters ht and hr are the heights of the transmitter and the receiver, hw is the
height of the wedge and hs is the height of the slab. Tx is at a distance d1 from the wedge,
d2 is the distance between the wedge and the slab, d3 is the width of the slab and d4 is the
distance between the slab and Rx.
The TD reflection coefficient for a hard polarized wave at the conductor-dielectric interface is given as

$r_h(t) = K_h\,\delta(t) + \dfrac{4k_h K_h}{(1-k_h^{2})}\, e^{-pt}\left[\dfrac{X_h}{2} + \dfrac{1-X_h}{2K_h} - \dfrac{pt}{4}\,X_h\right]$   (3)

with $X_h = e^{-K_h pt/2}$, $K_h = \dfrac{1-k_h}{1+k_h}$, $k_h = (\cos\theta_i/\cos\psi_t)\,(1/\varepsilon_{r2})$, and $p = \tau/2$ with $\tau = \sigma/\varepsilon_2$,

where $\varepsilon_2$ and $\varepsilon_{r2}$ are the dielectric permittivity and the relative dielectric permittivity of the dielectric medium, respectively. The TD transmission coefficient for a soft polarized wave propagating from the conductor to the dielectric medium is given as

$\gamma_s(t) \approx \dfrac{\cos\theta_i}{\cos\psi_t}\left[\delta(t) - r_s(t)\right]$   (4)

where

$r_s(t) = K_s\,\delta(t) + \dfrac{4k_s K_s}{(1-k_s^{2})}\, e^{-pt}\left[\dfrac{X_s}{2} + \dfrac{1-X_s}{2K_s} - \dfrac{pt}{4}\,X_s\right]$

with $X_s = e^{-K_s pt/2}$, $K_s = \dfrac{1-k_s}{1+k_s}$, and $k_s = (\cos\psi_t/\cos\theta_i)\,(1/\varepsilon_{r2})$.
The FD transmitted field at the receiver is

$E_{RX}(\omega) = \dfrac{E_i(\omega)}{r_{total}(\omega)}\,\Big[\prod_{i=1}^{4} T_{i,s,h}(\omega)\Big]\exp(-jk_0 r_1)\prod_{j=2}^{5}\exp\!\big\{-\big(\alpha_{ej}(\omega) + j\beta_{ej}(\omega)\big)\big\}$   (5)

with $r_{total}(\omega) = \sum_{i=1}^{5} r_i$ and $T_{total,s,h}(\omega) = \prod_{i=1}^{4} T_{i,s,h}(\omega)$, where $T_{i,s,h}(\omega)$, $i = 1,2,\ldots,4$, are the FD transmission coefficients with respect to points ‘P’, ‘Q’, ‘R’ and ‘S’ (see Fig. 1), and $\alpha_{ej}(\omega)$, $\beta_{ej}(\omega)$ are the total effective attenuation constants and phase-shift constants for the different regions $j$ [12]. The actual FD path-loss expression from (5) is given by

$L_{total,s,h}(\omega) = \exp(-jk_0 r_1)\prod_{j=2}^{5}\exp\!\big\{-\big(\alpha_{ej}(\omega) + j\beta_{ej}(\omega)\big)\big\}$   (6)
Now the corresponding TD expression for the received field at Rx, based on the FD
transmission model of [2], is as follows:

$e_{RX}(t) \approx \dfrac{e_i(t)}{r_{total}} * \Gamma_{1,s,h}(t) * \Gamma_{2,s,h}(t) * \Gamma_{3,s,h}(t) * \Gamma_{4,s,h}(t) * l_{total,s,h}(t)$   (7)
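A numerical reading of (7) is simply a cascade of discrete convolutions. The sketch below is one possible illustration, under the assumption that the excitation, each TD transmission coefficient and the TD path-loss term have already been sampled with a common step dt:

```python
import numpy as np

def received_field(e_i, gammas, l_total, r_total, dt):
    """Cascade the convolutions of Eq. (7): excitation, the four TD
    transmission coefficients and the TD path-loss term."""
    out = e_i / r_total
    for g in gammas:                       # Gamma_1 ... Gamma_4
        out = np.convolve(out, g) * dt     # dt approximates the convolution integral
    out = np.convolve(out, l_total) * dt
    return out
```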
For a loss tangent much less than unity $(\sigma/\omega\varepsilon \ll 1)$, the FD path-loss expression (6)
reduces to the following approximate form, with constant values of the angles of refraction
and along a single effective path for transmission:

$L_{total,s,h}(\omega) \approx \exp\!\big(-jk_0(r_1+r_3+r_5)\big)\,\exp\!\left[-j\omega\sqrt{\mu\varepsilon}\left(1+\dfrac{\sigma}{2j\omega\varepsilon}\right)\left(r_2 + d_3\,\dfrac{\varepsilon_r}{\sqrt{\varepsilon_r-\sin^2\theta_5}}\right)\right]$   (8)

Here $r_1 + r_3 + r_5$ is the total distance travelled by the field in free space, and $\varepsilon$ and $\sigma$ are
the parameters of the dielectric media in Fig. 1 (assumed the same for the wedge and the
slab). The term $l_{total,s,h}(t)$ in (7) is then given by
$l_{total,s,h}(t) \approx \exp\!\left[-\sqrt{\dfrac{\mu}{\varepsilon}}\,\dfrac{\sigma}{2}\left(r_2 + d_3\,\dfrac{\varepsilon_r}{\sqrt{\varepsilon_r-\sin^2\theta_5}}\right)\right]\,\delta\!\left(t - \sqrt{\mu\varepsilon}\left(r_2 + d_3\,\dfrac{\varepsilon_r}{\sqrt{\varepsilon_r-\sin^2\theta_5}}\right)\right) * \,\delta\!\left(t - \dfrac{r_1+r_3+r_5}{c}\right)$   (9)
This approximated TD path-loss expression will be used in (7) to compute the TD
transmitted field and the accuracy will be proved by the comparison of TD transmit-
ted field with the IFFT of the exact FD results, as shown in next section.
Fig. 2 shows the transmitted field through the propagation environment discussed
in section 2, for both hard and soft polarizations. The transmitted field at Rx suffers
no distortion in shape in comparison to the shape of the excited UWB pulse. This is
because of the small magnitude of the loss tangent with respect to unity. However, the
amplitude of the transmitted field is attenuated because of the transmission loss
through the dielectric media.
Fig. 2. Transmitted field through ‘dielectric wedge followed by a dielectric slab’, with glass
[15, 16]
The TD results for the transmitted field for both the polarizations are in excellent
agreement with corresponding IFFT of exact FD results, thus providing validation to
the proposed TD solution.
Fig. 3 shows the effect of varying Rx position (changing distance d 4 in Fig. 1) on
transmitted field at the receiver. Transmitted field gets more attenuated as Rx moves
away from the obstacles. The results for soft and hard polarized fields come closer to
each other as the distance d 4 increases. Also the TD results match closely with the
IFFT-FD results.
Fig. 3. Transmitted field through ‘dielectric wedge followed by a dielectric slab’ for different
receiver positions, with glass [15, 16]
Fig. 4. Transmitted field through ‘dielectric wedge followed by a dielectric slab’ for different
dielectric materials, with wood [6], drywall [15] and glass [15, 16]
Fig. 4 shows transmitted field at the receiver for different dielectric materials. The
TD results are in good agreement with the IFFT-FD results. It can be seen that as the
value of loss tangent decreases, a better agreement is achieved between the TD and
IFFT-FD results.
A comparison between the computation times of the IFFT-FD method and the proposed
TD solution for the propagation profile considered in Fig. 1 is presented in Table 2.
The results presented in Table 2 establish that the proposed TD analysis is computationally
very efficient in comparison to the IFFT-FD solution.
The two main reasons for such a significant reduction in the computational time in
TD are: (i) the efficient convolution technique [9], due to which a small number of time
samples suffices to provide accurate results, and (ii) the approximation of the multiple
transmission paths in FD by a single effective path for the low-loss dielectric case.
Given the excellent agreement between proposed TD solution and IFFT of FD so-
lution, it can be concluded that the proposed method is accurate for low loss tangent
values in the UWB bandwidth. The presented work also establishes that the proposed
TD solution is computationally more efficient than the conventional IFFT-FD me-
thod.
5 Conclusion
An analytical TD solution has been presented for the transmitted field through
multi-modeled obstacles made up of low-loss dielectric materials. Analytical TD
transmission and reflection coefficients for transmission through an interface between
conductor and dielectric mediums are presented for soft and hard polarizations. The
results of the proposed TD solution are validated against the corresponding IFFT-FD
results and the computational efficiency of two methods is compared. The TD solu-
tion outperforms the IFFT-FD analysis in terms of the computational efficiency. The
TD solution is vital in the analysis of UWB communication as it can provide a fast
and accurate prediction of the total transmitted field in microcellular and indoor prop-
agation scenarios.
References
1. de Jong, Y.L.C., Koelen, M.H.J.L., Herben, M.H.A.J.: A building-transmission model for
improved propagation prediction in urban microcells. IEEE Trans. Veh. Technol. 53(2),
490–502 (2004)
2. Soni, S., Bhattacharya, A.: An analytical characterization of transmission through a build-
ing for deterministic propagation modeling. Microw. Opt. Techn. Lett. 53(8), 1875–1879
(2011)
3. Karousos, A., Tzaras, C.: Multiple time-domain diffraction for UWB signals. IEEE Trans.
Antennas Propag. 56(5), 1420–1427 (2008)
4. Qiu, R.C., Zhou, C., Liu, Q.: Physics-based pulse distortion for ultra-wideband signals.
IEEE Trans. Veh. Technol. 54(5), 1546–1555 (2005)
5. Chen, Z., Yao, R., Guo, Z.: The characteristics of UWB signal transmitting through a lossy
dielectric slab. In: Proc. IEEE 60th Veh. Technol. Conf., VTC 2004-Fall, Los Angeles,
CA, USA., vol. 1, pp. 134–138 (2004)
6. Yang, W., Qinyu, Z., Naitong, Z., Peipei, C.: Transmission characteristics of ultra-wide
band impulse signals. In: Proc. IEEE Int. Conf. Wireless Communications, Networking
and Mobile Computing, Shanghai, pp. 550–553 (2007)
7. Yang, W., Naitong, Z., Qinyu, Z., Zhongzhao, Z.: Simplified calculation of UWB signal
transmitting through a finitely conducting slab. J. Syst. Eng. Electron. 19(6), 1070–1075
(2008)
8. Karousos, A., Koutitas, G., Tzaras, C.: Transmission and reflection coefficients in time-
domain for a dielectric slab for UWB signals. In: Proc. IEEE Veh. Technol. Conf., Singa-
pore, pp. 455–458 (2008)
9. Brigham, E.O.: The Fast Fourier Transform and Its Applications. Prentice-Hall, Englewood
Cliffs (1988)
10. Sevgi, L.: Numerical Fourier transforms: DFT and FFT. IEEE Antennas Propag.
Mag. 49(3), 238–243 (2007)
11. Balanis, C.A.: Advanced engineering electromagnetic. Wiley, New York (1989)
12. Tewari, P., Soni, S.: Time-domain solution for transmitted field through low-loss dielectric
obstacles in a microcellular and indoor scenario for UWB signals. IEEE Trans. Veh. Tech-
nol. (2013) (under review)
13. Barnes, P.R., Tesche, F.M.: On the direct calculation of a transient plane wave reflected
from a finitely conducting half space. IEEE Trans. Electromagn. Compat. 33(2), 90–96
(1991)
14. Tewari, P., Soni, S.: A comparison between transmitted and diffracted field in a microcel-
lular scenario for UWB signals. In: Proc. IEEE Asia-Pacific Conf. Antennas Propag., Sin-
gapore, pp. 221–222 (2012)
15. Muqaibel, A., Safaai-Jazi, A., Bayram, A., Attiya, A.M., Riad, S.M.: Ultrawideband
through-the-wall propagation. IEE Proc.-Microw. Antennas Propag. 152(6), 581–588
(2005)
16. Jing, M., Qin-Yu, Z., Nai-Tong, Z.: Impact of IR-UWB waveform distortion on NLOS lo-
calization system. In: ICUWB 2009, pp. 123–128 (2009)
Indexing and Retrieval of Speech Documents
P.K.P. Singh et al.
1 Introduction
Due to rapid advancement in technology, there has been explosive growth in the
generation and use of multimedia data, such as video, audio, and images. A lot
of audio and video data is generated by internet, mobile devices, TV and radio
broadcast channels. Indexing systems are desirable for managing and supporting
usage of large databases of multimedia. Audio indexing finds applications in
digital libraries, entertainment industry, forensic laboratories and virtual reality.
Many kinds of indexing schemes have been developed and studied by researchers
working in this field [13]. An approach for indexing audio data is the
use of text itself. Transcripts from the audio data are generated, which are used
for indexing. This approach is effective for indexing broadcast news, video lec-
tures, spoken documents, etc., where the clean speech data is available. The
retrieval from such indexing system is performed using keywords as a query.
For effective use of multimedia data, the users should be able to make content
based queries or queries-by-example, which are unrestricted and unanticipated.
Content based retrieval systems accept data type queries i.e., hummed, sung or
original clip of a song for a song retrieval. The methods used for audio indexing
can be broadly classified as under:
– Signal parameter based systems [1,3,9,14] - In this scheme, the signal statis-
tics such as mean, variance, zero crossing rate, autocorrelation, histograms
From each frame, 13 MFCCs are extracted using 24 filter banks. This results
in 1000 MFCC vectors for 10 seconds of speech. These 1000 vectors are then
converted to a vector codebook using k-means clustering [12].
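As a rough illustration of this step (the paper does not specify an implementation; librosa and scikit-learn are used here purely as an example, and the frame and hop sizes are assumptions), a 32-entry codebook could be built as follows:

```python
import librosa
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(wav_path, n_mfcc=13, codebook_size=32):
    """Extract 13 MFCCs per frame and cluster them into a codebook."""
    y, sr = librosa.load(wav_path, sr=None)
    # 25 ms windows with a 10 ms hop give roughly 1000 frames per 10 s clip
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                n_fft=int(0.025 * sr),
                                hop_length=int(0.010 * sr)).T
    km = KMeans(n_clusters=codebook_size, n_init=10).fit(mfcc)
    return km.cluster_centers_, mfcc
```

With a 10 ms hop, a 10-second clip yields about 1000 frames, matching the figure quoted above.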
Fig. 2 shows the basic steps for building a vector quantization codebook and
extraction of code book indices. Training data consists of 1000 MFCC vectors.
The training data set is used to create an optimal set of codebook vectors for
representing the spectral variability observed in the training data set. A centroid
computation procedure is used as the basis of partitioning the training data set
[Fig. 2, block diagram: the training data set is clustered with the k-means algorithm to obtain the codebook vectors; nearest-neighbour labeling against the codebook yields the codebook indices.]
Building the Index. Phase 2 (Indexing / Lookup phase) does the task of in-
dexing the features extracted from the speech database. This phase also performs
a lookup in the index to retrieve the best matching items for a query vector. The
matching can be done in various ways, say, exact matching, nearest neighbour
matching, etc. which depends on the application and the efficiency required.
In this work, MFCC feature vectors derived from speech consists of 13 di-
mensions. Various data structures can be used for multi-dimensional indexing,
for example, quad tree, k -d tree, optimized k -d tree. In this work, the index is
created from the codebook in the form of a k -d tree. The discriminating key for
each level of the k -d tree is selected according to Bentleys definition of the k -d
tree i.e., D = L mod k + 1, where D is the discriminating key number for level
L and the root node is defined to be at level zero. However, the selection of the
partition value is done in the way suggested by Friedman [4]. The partition value
is selected as the median value in the dimension of the discriminator(D).
The codebooks created in this work have 32 codebook vectors. Each of these
vectors is assigned an index number (1-32). This numbering of the codebook
entries is utilized in Phase 3. We are required to preserve this numbering even
when the codebook is converted to a k-d tree index. This is achieved by append-
ing the index number as an additional dimension to the codebook vectors while
creating the k-d tree. This 14th dimension is never used as a discriminating key
while building the k-d tree index. Algorithm 1 is used for k-d tree creation
in this work.
The procedure median(j, subfile) returns the median of the jth key values. The
procedures make_terminal and make_non_terminal store their parameters as values
of the node in the k-d tree and return a pointer to that node. The leftsubfile
and rightsubfile procedures partition the file along the dth key with respect
to the partition value p and return the left and right subfiles, respectively. The heapsort
algorithm, having a complexity of O(N log N), is used to compute the median
as well as the left and right subfiles at each level. Thus the time complexity
of building a k-d tree from N vectors is O(N log² N). The searching algorithm
proposed by Friedman [4] is used in this work for searching the k-d tree. This
search algorithm is also a part of Phase 2 of the system.
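Since the listing of Algorithm 1 is not reproduced here, the following Python sketch (an illustration, not the authors' listing) shows a k-d tree built in the spirit described above: a median split on the cycling discriminator, with the appended 14th component carrying the codebook index and never used as a key:

```python
import numpy as np

class Node:
    def __init__(self, point=None, disc=None, split=None, left=None, right=None):
        self.point, self.disc, self.split = point, disc, split
        self.left, self.right = left, right

def build_kdtree(points, level=0, k=13):
    """points: array of shape (n, 14); the last column is the codebook
    index (1-32) and is never used as a discriminating key."""
    if len(points) == 0:
        return None
    if len(points) == 1:
        return Node(point=points[0])                 # terminal node
    d = level % k                                    # cycling discriminator
    points = points[np.argsort(points[:, d])]        # sorting stands in for heapsort
    m = len(points) // 2
    return Node(point=points[m], disc=d, split=points[m, d],   # median partition value
                left=build_kdtree(points[:m], level + 1, k),
                right=build_kdtree(points[m + 1:], level + 1, k))
```

Sorting is used here in place of the heapsort-based median selection described above; the asymptotic cost per level is the same.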
[Fig. 3, block diagram: MFCC vectors are mapped through the k-means codebook vectors and the k-d tree to index sequences, which are compared using the longest common subsequence length.]
index numbers are then concatenated to form a sequence of index numbers. The
retrieval method finds out that clip in the database, whose MFCC vectors follow
the similar codebook vector sequence, as that of the query clip. Steps in retrieval
method are as follows:
1. After building the codebook for a speech document in the database, the
MFCC vectors are scanned, in order of their occurrence, to determine the
index number (1-32) of the nearest codebook vector.
2. By concatenating these index numbers obtained in the above step, we obtain
an approximate sequence of sounds present in the clip.
3. MFCC vectors are extracted from the query speech clip provided.
4. By using the same technique of nearest neighbour search among the code-
book vectors, we can obtain another sequence of index numbers. The length
of the longest common subsequence between the sequences obtained in steps
2 and 4 is determined.
5. The length of the longest common subsequence obtained in step 4 can act
as the similarity measure for comparison with other clips in the database,
since the process of sequence determination is repeated for all the clips
in the database; a sketch of this matching step is given below.
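The following Python fragment (illustrative only; the function names are hypothetical) computes the LCS length used as the similarity score and ranks the database clips accordingly:

```python
def lcs_length(a, b):
    """Dynamic-programming length of the longest common subsequence
    between two sequences of codebook indices."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

def retrieve(query_seq, doc_seqs):
    """Return database documents sorted by decreasing LCS score."""
    scores = {doc_id: lcs_length(query_seq, seq) for doc_id, seq in doc_seqs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```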
3 Performance Evaluation
The database used for evaluation, consists of 3 hours of speech recorded by a sin-
gle male speaker. The speech consists of articles about History of India available
on Wikipedia. This recorded speech data was divided into 1080 speech docu-
ments, each of duration 10 seconds. Proposed retrieval approach was evaluated
by 200 queries recorded from 5 male and 5 female speakers. In this work, we have
asked each speaker to speak 20 speech documents from the speech database at
random. The retrieval performance of the proposed method is shown in Table
1. From the results presented in Table 1, it is observed that the accuracy of
retrieval is 88% for the male speaker queries and 3% for female speaker queries.
The reason behind the queries being missed is that the speakers are different,
and thus their utterances of the similar text may differ. These differences account
for the differences in the sequences we are matching. Therefore, if the utterance
differs at many places, the sequences get altered immensely resulting in query
misses. The gender effect on the MFCC vectors, also plays a role in poor retrieval
performance of the female speaker queries.
Table 1. Retrieval performance of the proposed method (queries matched out of 20 per speaker)

Male speakers                  Female speakers
Speaker Id   Queries matched   Speaker Id   Queries matched
1            18                6            0
2            20                7            0
3            15                8            2
4            20                9            0
5            15                10           1
The poor retrieval performance for the queries of female speakers may be due
to the influence of gender characteristics. As the database contains only a male
speaker's speech utterances, the indices generated by the female speakers' queries
are entirely different, owing to differences in the shape and size of the vocal tract
between the genders.
In this paper, a k-d tree based speech document indexing system has been pro-
posed. For retrieving the desired speech document for a given query, the sequence
of codebook indices, generated by the speech document and the query are com-
pared using LCS approach. By computing the LCS scores between a query and
all the speech documents, the retrieval system retrieves the desired document
based on the highest LCS score. For evaluating the proposed retrieval approach,
3 hours of speech database recorded by a male speaker was used. The perfor-
mance accuracy in the retrieval process is found to be better for the queries
spoken by male speakers. In the case of female speakers, the performance is
observed to be very poor.
In this work, the codebooks are generated from each 10-second speech document.
Each codebook captures the local characteristics of the speech present in that
10-second segment. Therefore, when we generate the indices from such a codebook,
the sequence of indices may differ for different speakers even if the spoken
message content is the same. This problem may be addressed by building a single
codebook from a large amount of speech data instead of smaller codebooks for
each 10-second segment. If we derive the sequences of indices for the speech
documents and the query from such a generalized codebook, the retrieval accuracy
may be improved. It may also resolve the problem arising from gender-dependent
queries. Query gender dependency may also be reduced by adopting appropriate
vocal tract length normalization (VTLN) feature transformation techniques in
the codebook building procedure. Also, in the current experimental setup the
queries given were similar to clips present in the speech database. In future, work
can be done to perform retrieval based on queries having only a few keywords and
other connecting words.
References
1. Cha, G.-H.: An Effective and Efficient Indexing Scheme for Audio Fingerprinting.
In: Proceedings of the 2011 Fifth FTRA International Conference on Multimedia
and Ubiquitous Engineering, Washington, DC, USA, pp. 48–52 (2011)
2. Chen, A.L.P., Chang, M., Chen, J., Hsu, J.-L., Hsu, C.-H., Hua, S.Y.S.: Query by
music segments: an efficient approach for song retrieval. In: 2000 IEEE Interna-
tional Conference on Multimedia and Expo (2000)
3. Foote, J.T.: Content-Based Retrieval of Music and Audio. In: Proceedings of SPIE,
Multimedia Storage and Archiving Systems II, pp. 138–147 (1997)
4. Friedman, J.H., Bentley, J.L., Finkel, R.A.: An Algorithm for Finding Best Matches
in Logarithmic Expected Time. ACM Transactions on Mathematical Software 3(3),
209–226 (1977)
5. Hirschberg, D.S.: A linear space algorithm for computing maximal common subsequences.
Communications of the ACM, 341–343 (1975)
6. Deller Jr., J.R., Hansen, J.H.L., Proakis, J.G.: Discrete-Time Processing of Speech
Signal. IEEE Press (2000)
7. Kosugi, N., Nishihara, Y., Sakata, T., Yamamuro, M., Kushima, K.: A practical
query-by-humming system for a large music database. In: Proceedings of the Eighth
ACM International Conference on Multimedia, pp. 333–342 (2000)
8. Lemström, K., Laine, P.: Musical information retrieval using musical parameters.
In: Proceedings of the 1998 International Computer Music Conference (1998)
9. Li, G., Khokhar, A.A.: Content-based indexing and retrieval of audio data using
wavelets. In: 2000 IEEE International Conference on Multimedia & Expo, pp. 885–
888 (2000)
10. Lu, L., You, H., Zhang, H.-J.: A new approach to query by humming in music
retrieval. In: ICME 2001, pp. 595–598 (2001)
11. Maier, D.: The Complexity of Some Problems on Subsequences and Superse-
quences. J. ACM, 322–336 (1978)
12. Rabiner, L., Juang, B.-H.: Fundamentals of speech recognition. Prentice-Hall, Inc.
(1993)
13. Rao, K.S., Pachpande, K., Vempada, R.R., Maity, S.: Segmentation of TV broad-
cast news using speaker specific information. In: NCC 2012, pp. 1–5 (2012)
14. Subramanya, S.R., Youssef, A.: Wavelet-based Indexing of Audio Data in Au-
dio/Multimedia Databases. In: Proceedings of MultiMedia Database Management
Systems (1998)
An Improved Filtered-x Least Mean Square Algorithm
for Acoustic Noise Suppression
Asutosh Kar¹, Ambika Prasad Chanda¹, Sarthak Mohapatra¹, and Mahesh Chandra²
¹ Department of Electronics and Telecommunication Engineering, Indian Institute of Information Technology, Bhubaneswar, India
² Department of Electronics and Communication Engineering, Birla Institute of Technology, Mesra, India
{asutosh,b210038}@iiit-bh.ac.in, [email protected], [email protected]
Abstract. Noise reduction is a major issue in the modern age, as noise creates
disturbances in day-to-day communication. Numerous methods, such as noise barriers
and noise absorbers, have been proposed over time to cancel the noise present in the
original signal. Noise can also be suppressed by continuous adaptation of the weights of
an adaptive filter. The change of the weight vector in adaptive filters is carried out with
the help of various adaptive algorithms. A few of the basic noise reduction algorithms
are the Least Mean Square (LMS) algorithm, the Recursive Least Square (RLS) algorithm,
etc. These basic algorithms have further been modified to obtain the Normalized Least
Mean Square, Fractional Least Mean Square, Differential Normalized Least Mean Square
and Filtered-x Least Mean Square algorithms, among others. In this paper we provide an
improved approach for acoustic noise cancellation in an Active Noise Control environment
using the Filtered-x LMS (FXLMS) algorithm. A detailed analysis of the algorithm has
been carried out. The FXLMS algorithm has also been implemented for noise cancellation,
and the results of the entire process are presented for comparison.
Keywords: adaptive filter, active noise control, Least Mean Square, Mean
Square Error, FXLMS.
1 Introduction
adaptively calculate the signal that cancels the noise. In this paper “feed-forward
ANC” approach is implemented to cancel the noise with the help of FxLMS
algorithm. This is because the amount of noise reduction achieved by feed-forward
ANC system is more than that of feedback ANC system. A single channel feed-
forward ANC system comprises of a reference sensor, a control system, a cancelling
loudspeaker and an error sensor. The primary noise present is measured by the
reference sensor and is cancelled out around the location of the error microphone by
generating and combining an anti-phase cancelling noise that is correlated to the
spectral content of the unwanted noise [3]. The reference sensor produces a reference
signal that is “feed-forward” to the control system to generate “control signals” in
order to drive the cancelling loudspeaker to generate the cancelling noise. The error
microphone measures the residual noise that remains after the unwanted noise and the control signal combine, and sends an error signal back to the controller to adapt the control
system in an attempt to further reduce the error. The control system adaptively
modifies the control signal to minimize the residual error. The most famous
adaptation algorithm for ANC systems is the FXLMS algorithm [4, 5] as mentioned
earlier, which is a modified and improved version of the LMS algorithm [6-10].
Although the FXLMS algorithm is widely used due to its stability, it has slower
convergence. Various techniques have been proposed to improve the convergence of the
FXLMS algorithm with some parameter trade-offs. Other versions of the FXLMS
include Filtered-x Normalized LMS (FXNLMS) [11], Leaky FXLMS [12], Modified
FXLMS (MFXLMS) [13-15] etc. However, the common problem with all of these algorithms is the slow convergence rate, especially when there is a large number of weights [16]. To overcome this problem, more complex algorithms such as the Filtered-x Recursive Least Square (FXRLS) algorithm [17] can be used. These algorithms have a faster convergence rate than the FXLMS; however, they involve matrix computations and their real-time realizations might not be cost effective.
2 Problem Formulation
Acoustic noise problems are not only based on high frequency noise; many are also dominated by low frequency noise. Various passive noise control techniques such as noise barriers and absorbers do not work efficiently in low frequency environments. ANC works very efficiently for low frequency noise and is also more cost effective than bulky, heavy barriers and absorbers.
The basic LMS algorithm fails to perform well in the ANC framework. This is due to the assumption that the output of the filter is the signal perceived at the error microphone, which is not the case in practice. The presence of the A/D and D/A converters, actuators and anti-aliasing filter in the path from the output of the filter to the signal received at the error microphone causes a significant change in the output signal. This demands incorporating the effects of this secondary path
function in the algorithm. But the convergence of the LMS algorithm depends on the
phase response of the secondary path, exhibiting ever-increasing oscillatory behavior
as the phase increases and finally going unstable at 90° [11]. The solution to this
problem was either to employ an inverse filter in the cancellation path or to introduce
a filter in the reference path, which is ideally equal to the secondary path impulse
response. The former technique is referred to as the “filtered-error” approach, while
the latter is now known as the “filtered-reference” method, more popularly known as
the FXLMS algorithm. The FXLMS solution is by far the most widely used due to its
stable as well as predictable operation [3-5].
3 Simulation Setup
Step:1
For an ANC system containing a secondary path transfer function S(n), the residual error can be expressed as
e(n) = d(n) - y'(n)   (1)
where y'(n) is the output of the secondary path S(n).
Step:2
If S(n) is an IIR filter with denominator coefficients [a_0, ..., a_N] and numerator coefficients [b_0, ..., b_{M-1}], then the filter output y'(n) can be written in terms of the filter input y(n) and the past filter outputs:
y'(n) = \sum_{i=1}^{N} a_i y'(n-i) + \sum_{j=0}^{M-1} b_j y(n-j)   (2)
Step:3
Hence the gradient estimate becomes
\hat{\nabla}\xi(n) = -2 x'(n) e(n)   (3)
where
x'(n) = \sum_{i=1}^{N} a_i x'(n-i) + \sum_{j=0}^{M-1} b_j x(n-j)   (4)
In practical applications, S(n) is not exactly known; therefore the parameters a_i and b_j are the parameters of the secondary path estimate \hat{S}(n).
Step:4
The weight update equation of the FXLMS algorithm is:
w(n+1) = w(n) + \mu x'(n) e(n)   (5)
Equation (5) is identical to the LMS update, but instead of the reference signal x(n) it uses x'(n), the filtered version of the reference signal, called the “filtered-x signal” or “filtered reference signal”.
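A minimal Python/NumPy sketch of this update loop is given below; the signal names (x for the reference, d for the primary noise at the error microphone, s_true and s_hat for the true and estimated secondary-path impulse responses) and the parameter values are illustrative assumptions, not quantities taken from the paper.

import numpy as np

def fxlms(x, d, s_true, s_hat, L=64, mu=0.005):
    """Single-channel feed-forward FXLMS ANC loop (illustrative sketch)."""
    w = np.zeros(L)                       # adaptive controller weights
    x_buf = np.zeros(L)                   # recent reference samples
    xp_buf = np.zeros(L)                  # recent filtered-reference samples
    y_buf = np.zeros(len(s_true))         # controller output feeding the true secondary path
    xs = np.convolve(x, s_hat)[:len(x)]   # filtered-x signal x'(n): reference filtered by S_hat
    e = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.r_[x[n], x_buf[:-1]]
        xp_buf = np.r_[xs[n], xp_buf[:-1]]
        y = w @ x_buf                     # anti-noise produced by the controller
        y_buf = np.r_[y, y_buf[:-1]]
        y_sec = s_true @ y_buf            # y'(n): anti-noise after the secondary path
        e[n] = d[n] - y_sec               # residual error, Eq. (1)
        w = w + mu * e[n] * xp_buf        # weight update, Eq. (5), using the filtered reference
    return w, e

The only difference from a plain LMS loop is that the weight update uses the filtered reference xp_buf rather than the raw reference x_buf.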
The FXLMS algorithm is very tolerant to modelling errors in the secondary path
estimate. The algorithm will converge when the phase error between S (n) and H (n)
is smaller than +/- 90 degrees. Computer simulations show that phase errors less than
45º do not significantly affect performance of the algorithm [8]. The gain applied to
the reference signal by filtering it with S (n) does not affect the stability of the
algorithm and is usually compensated for by modifying the convergence parameter, μ.
Convergence will be slowed down though, when the phase error increases. H (n) is
estimated through a process called system identification. Band-limited white noise is
played through the control speaker(s) and the output is measured at the error sensor.
The measured impulse response is obtained as a FIR filter S (n) in the time domain.
The coefficients of S (n) are stored and used to pre-filter the reference signal and give
the input signal to the LMS update.
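A sketch of this offline identification stage, assuming the white-noise excitation and the error-sensor recording are available as NumPy arrays (names are hypothetical):

import numpy as np

def estimate_secondary_path(speaker_sig, mic_sig, M=128, mu=0.01):
    """LMS identification of the secondary path as an FIR filter (sketch)."""
    s_hat = np.zeros(M)                    # FIR estimate S_hat(n)
    buf = np.zeros(M)
    for n in range(len(speaker_sig)):
        buf = np.r_[speaker_sig[n], buf[:-1]]
        err = mic_sig[n] - s_hat @ buf     # modelling error at the error sensor
        s_hat = s_hat + mu * err * buf     # LMS update of the FIR coefficients
    return s_hat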
The performance of the FXLMS adaptation process depends on a number of factors, including the characteristics of the physical plant to be controlled, the secondary path impulse response and the bandwidth of the acoustic noise. Hence the FXLMS will not function properly if the secondary path has a long impulse response and/or the acoustic noise has a wide bandwidth. Among the design parameters, increasing the filter length improves the steady-state performance but degrades the convergence of the FXLMS algorithm, so the filter length should be chosen carefully. Once the filter length is set, the maximum achievable performance is limited by a scalar parameter called the adaptation step size. The speed of convergence will generally be higher when choosing a higher sample frequency and a higher value of μ [14-16].
It is advantageous to choose a large value of μ because the convergence speed will increase, but too large a value of μ will cause instability. It has been determined experimentally that the maximum step size that can be used in the FXLMS algorithm is approximately
\mu_{max} = \frac{1}{P_{x'}(L + \Delta)}   (6)
where P_{x'} = E[x'^2(n)] is the mean-square value (power) of the filtered reference signal x'(n), L is the number of weights and Δ is the number of samples corresponding to the overall delay in the secondary path.
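For illustration only, the bound of Eq. (6) can be evaluated numerically as below; the filter length, delay and reference statistics are arbitrary assumptions.

import numpy as np

x_filtered = 0.1 * np.random.randn(10000)   # stand-in for the filtered reference x'(n)
P_x = np.mean(x_filtered ** 2)              # power of x'(n)
L, delay = 64, 10                           # number of weights and secondary-path delay (assumed)
mu_max = 1.0 / (P_x * (L + delay))          # Eq. (6)
print(mu_max)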
After convergence, the filter weights W ( n) vary randomly around the optimal
solution Wopt . This is caused by broadband disturbances, like measurement noise and
impulse noise on the error signal. These disturbances cause a change of the estimated gradient \hat{\nabla}\xi(n), because it is based only on the instantaneous error. This results in an average increase of the MSE, called the excess MSE, defined as [9]
\xi_{excess} = E[\xi(n)] - \xi_{min}   (7)
This excess MSE is directly proportional to the step size. It can be concluded that there is a design trade-off between the convergence performance and the steady-state performance: a larger value of μ gives faster convergence but a bigger excess MSE, and vice versa. Another factor that influences the excess MSE is the number of weights.
The MATLAB platform is chosen for the simulations. The room impulse response is measured for an enclosure of dimensions 12×12×8 ft³ using an ordinary loudspeaker
The primary and secondary path impulse response with respect to time is presented
in Fig. 2 and Fig. 3 respectively. The first task in active noise control is to estimate
the impulse response of the secondary propagation path. This is usually performed
prior to noise control using a random signal played through the output loudspeaker
while the unwanted noise is not present. Fig. 4 shows the estimation of the secondary path impulse response, plotting the coefficient values of the true response, the estimated response and the estimation error with respect to time. The performance of the FXLMS algorithm for ANC can be judged from the discussed results pertaining to an adaptive framework.
5 Conclusion
The FXLMS algorithm is widely used in acoustic noise cancellation environments due to its simple real-time implementation and low computational complexity. Although the algorithm has slow convergence, various modifications of the existing scheme can be made to improve the convergence. As shown in the paper, the FXLMS algorithm has greater stability than other algorithms used for acoustic noise cancellation. There is a wide range of possible modifications to the FXLMS algorithm which can improve its convergence, and the algorithm can also be used in very high noise environments.
References
1. Elliot, S.J.: Signal Processing for Active Control. Academic Press, London (2001)
2. Elliot, S.J., Nelson, P.A.: Active noise control. IEEE Signal Processing Magazine 10, 12–
35 (1993)
3. Lueg, P.: Process of silencing sound oscillations. U.S. Patent, 2043416 (1936)
4. Kuo, S.M., Morgan, D.R.: Active Noise Control Systems-Algorithms and DSP
Implementations. Wiley (1996)
5. Kuo, S.M., Morgan, D.R.: Active Noise Control: A tutorial review. Proceedings of
IEEE 87, 943–973 (1999)
6. Widrow, B., Stearns, S.D.: Adaptive Signal Processing. Prentice Hall, New Jersey (1985)
7. Morgan, D.R.: An analysis of multiple correlation cancellation loops with a filter in the
auxiliary path. IEEE Transactions in Acoustics, Speech, and Signal Processing 28, 454–
467 (1980)
8. Boucher, C.C., Elliot, S.J., Nelson, P.A.: Effects of errors in the plant model on the
performance of algorithms for adaptive feedforward control. IEE Proceedings F138(4),
313–319 (1991)
9. Widrow, B., McCool, J.M., Larimore, M., Johnson, C.R.: Stationary and non-stationary
learning characteristics of the LMS adaptive filter. IEEE Proceedings 64(8), 1151–1162
(1976)
10. Butterweck, H.: A wave theory of long adaptive filters. IEEE Transactions on Circuits and
Systems I:Fundamental Theory and Applications 48(6), 739–747 (2001)
11. Warnaka, G.E., Poole, L.A., Tichy, J.: Active acoustic attenuators. U.S. Patent 4473906
(1984)
12. Elliott, S.J., Stothers, I.M., Nelson, P.A.: A multiple error LMS algorithm and its
applications to active control of sound and vibration. IEEE Transactions on Acoustics,
Speech and Signal Processing 35, 1423–1434 (1987)
13. Rupp, M., Sayed, A.H.: Two variants of the FxLMS algorithm. In: IEEE ASSP Workshop
on Applications of Signal Processing to Audio and Acoustics, pp. 123–126 (1995)
14. Rupp, M., Sayed, A.H.: Robust FxLMS algorithms with improved convergence
performance. IEEE Transactions on Speech and Audio Processing 6(1), 78–85 (1998)
15. Davari, P., Hassanpour, H.: A variable step-size FxLMS algorithm for feedforward active
noise control systems based on a new online secondary path modelling technique. In:
IEEE/ACS International Conference on Computer Systems and Applications, pp. 74–81
(2008)
16. Kunchakoori, N., Routray, A., Das, D.: An energy function based fuzzy variable step size
fxlms algorithm for active noise control. In: IEEE Region 10 and the Third International
Conference on Industrial and Information Systems, pp. 1–7 (2008)
17. Eriksson, L., Allie, M., Melton, D., Popovich, S., Laak, T.: Fully adaptive generalized
recursive control system for active acoustic attenuation. In: IEEE International Conference
on Acoustics, Speech, and Signal Processing, vol. 2, pp. II/253–II/256 (1994)
A Unique Low Complexity Parameter Independent
Adaptive Design for Echo Reduction
Abstract. Acoustic echo is one of the most important issues in full duplex communication, as the original speech signal is distorted by the echo; adaptive filtering is therefore used for echo suppression. In this paper our objective is to cancel the acoustic echo in a sparse transmission channel. For this purpose many algorithms have been developed over time, such as the Least Mean Square (LMS), Normalized LMS (NLMS), Proportionate NLMS (PNLMS) and Improved PNLMS (IPNLMS) algorithms. We carry out a comparative analysis of all these algorithms based on performance parameters such as Echo Return Loss Enhancement, Mean Square Error and Normalized Projection Misalignment, and find that for the sparse transmission channel all these algorithms are inefficient. Hence we propose a new algorithm, modified μ-PNLMS, which has the fastest steady-state convergence and is the most stable among the existing algorithms, as shown by the simulation results obtained.
1 Introduction
Echo is a delayed and distorted version of the original signal. Acoustic echo is mainly present in mobile phones, hands-free phones, teleconference and hearing-aid systems, and is caused by the coupling between microphone and loudspeaker. Echo largely depends on two parameters: the amplitude and the time delay of the reflected waves. Usually we consider echoes with appreciable amplitude and delays above 1 ms, but if the generated echo delay exceeds 20 ms it becomes a major issue and needs to be cancelled. Thus, developers are using the concepts of Digital Signal Processing for echo cancellation to stop the undesired feedback and allow successful full duplex communication [1].
To achieve echo-free systems numerous methods have been proposed, such as echo absorbers, echo barriers and echo cancellers. But considering the
advancement of signal processing, echo cancellation is best done with the help of adaptive filtering. Adaptive filters are widely used due to their stability and the wide scope for improvement, and they can also be applied to low frequency noise. Echo can be suppressed most effectively by continuous adaptation of the weights of the adaptive filter until the echo is completely cancelled out. System identification is generally used to generate a replica of the echo that is subtracted from the received signal. The echo canceller should have a fast convergence speed so that it can identify and rapidly track changes in the unknown echo path. The convergence rate depends on the adaptive algorithm as well as on the structure of the adaptive filter used in AEC. In AEC, a signal is generated which is correlated with the echo signal but opposite in phase; by adding the two signals, that is, the corrupted signal and the generated signal which is correlated with and opposite in phase to the actual echo signal, the original signal can be made echo free. The generation of this signal is controlled by an adaptive algorithm which adaptively changes the weights of the filter used. AEC is classified into two categories, i.e. “feed-forward AEC” and “feedback AEC”. In the feedback AEC method a controller is used to modify the response of the system; the controller may, for example, add artificial damping. In the feed-forward AEC method, on the other hand, a controller is used to adaptively calculate the signal that cancels the echo. In this paper the “feed-forward AEC” approach is implemented to cancel the echo, because the amount of echo reduction achieved by a feed-forward AEC system is more than that of a feedback AEC system [2].
Further, the transmission channel in AEC is sparse in nature, i.e. only a few weight coefficients are active and all others are zero or tend to zero. Basic algorithms like LMS and NLMS suffer from slow convergence in such sparse channels. Hence a modified algorithm, Proportionate NLMS, was developed by Duttweiler for sparse transmission channel systems [3]. The concept is to assign each coefficient a step size proportionate to its estimated magnitude and to update each coefficient of the adaptive filter independently. The main issue with this algorithm is that the convergence speed reduces excessively after the initial fast period. A new kind of adaptive filtering process, μ-PNLMS, was proposed to counter this problem. Here, the logarithm of the coefficient magnitude is used as the step gain of each coefficient instead of the magnitude itself, so the algorithm can converge consistently over a long period of time.
In this paper, we propose an algorithm to improve the performance of the
MPNLMS algorithm to a greater extent in the sparse channel. The proposed algorithm
adaptively detects the channel’s sparseness and changes the step size parameter to
improve the ERLE, Mean Square Error (MSE) and Normalized Projection
Misalignment (NPM).
related with the noise that exists in the distorted input signal. The weight update equation for the LMS algorithm is given by
\hat{h}(n+1) = \hat{h}(n) + \mu\, x(n+1)\, e(n+1)
In the LMS algorithm, if the step size is too small the adaptive filter takes too long to converge, and if it is too large the adaptive filter becomes unstable and its output diverges, so it fails to perform well in echo cancellation. The recursive formula for the Normalized Least Mean Square (NLMS) algorithm is
recursive formula for Normalized Least mean Square (NLMS) algorithm is
x ( n + 1)e(n + 1)
hˆ( n + 1) = hˆ ( n ) + 2 μ (2)
2
|| x ( n + 1) ||2 +δ NLMS
Here δ NLMS is the variance of the input signal x ( n ) which prevents division by
zero during initialization stage when x ( n ) = 0 .Further to maintain stability the step
2
E{| x(n + 1) | }D ( n + 1)}
size should be in the range 0 < μ < 2 .
2
E{| e( n + 1) | }
2 2
Here E{| x ( n ) | } is the power of input signal, E{| e( n) | } is the power of error
signal and D ( n ) is the mean square deviation . The NLMS algorithm though being
efficient does not take into consideration sparse impulse response caused to bulk
delays in the path and hence needs to adapt a relatively long filter. Also unavoidable
noise adaptation occurs at the inactive region of the filter. To avoid this we need to
use sparse algorithms, where adaptive step size are calculated from the last estimate
of the filter coefficients in such a way that the step size is proportional to step size of
filter coefficients. So active coefficients converge faster than non-active ones and
overall convergence time gets reduced. [4]
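A minimal sketch of this NLMS recursion, with x as the far-end (loudspeaker) signal and d as the microphone signal containing the echo; the names and parameter values are illustrative assumptions.

import numpy as np

def nlms(x, d, L=256, mu=0.5, delta=1e-4):
    """NLMS echo canceller following Eq. (2) (illustrative sketch)."""
    h_hat = np.zeros(L)                  # adaptive estimate of the echo path
    buf = np.zeros(L)
    e = np.zeros(len(x))
    for n in range(len(x)):
        buf = np.r_[x[n], buf[:-1]]      # regressor x(n+1)
        e[n] = d[n] - h_hat @ buf        # a-priori error e(n+1)
        h_hat = h_hat + 2 * mu * buf * e[n] / (buf @ buf + delta)   # normalized update
    return h_hat, e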
B. PNLMS
In order to track the sparseness faster, the PNLMS algorithm was developed from the NLMS recursion. Here the filter coefficient update equation differs from NLMS in having a step size control matrix Q, with the rest of the terms remaining the same, and is given below as:
\hat{h}(n+1) = \hat{h}(n) + \mu \frac{Q(n)\, x(n+1)\, e(n+1)}{\| x(n+1) \|_2^2 + \delta_{PNLMS}}   (3)
Here \delta_{PNLMS} = \delta_{NLMS} / L, and the diagonal matrix Q(n) contains the proportionate step gains of the individual coefficients.
C. IPNLMS:
The per-coefficient step gain is computed as
k_l(n) = \frac{1 - \alpha}{2L} + (1 + \alpha) \frac{| \hat{h}_l(n) |}{2 \| \hat{h}(n) \|_1 + \varepsilon}   (5)
where ε is a small positive constant. Results show that good choices for α are 0, −0.5 and −0.75. The regularization parameter should be chosen so that the same steady-state misalignment is achieved as for NLMS with the same step size [6]. We have
\delta_{IPNLMS} = \frac{1 - \alpha}{2L} \delta_{NLMS}   (6)
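A short sketch of the IPNLMS step-gain computation of Eq. (5); the argument names are hypothetical.

import numpy as np

def ipnlms_gains(h_hat, alpha=-0.5, eps=1e-3):
    """Per-coefficient step gains k_l(n) of IPNLMS, Eq. (5) (sketch)."""
    L = len(h_hat)
    return ((1 - alpha) / (2 * L)
            + (1 + alpha) * np.abs(h_hat) / (2 * np.sum(np.abs(h_hat)) + eps))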
D. MPNLMS:
This algorithm is an efficient one for a sparse transmission channel and has a steady convergence over time, unlike the previously discussed proportionate algorithms, which converge quickly during the initial period but slow down later. In this algorithm the step size is made proportionate to the logarithmic magnitude of the filter coefficients [7].
3 Proposed Algorithm
Among all the algorithms discussed above for a sparse channel, which is our transmission channel of interest, the μ-PNLMS algorithm has the fastest convergence. The filter weight coefficients of the classical NLMS algorithm converge slowly in such a channel. PNLMS and IPNLMS were designed specifically for echo cancellation in sparse transmission channels, but after the initial fast convergence of the filter weight vectors these algorithms too fail [6], [7]. So we propose a μ-law PNLMS algorithm for a sparse channel. Here, instead of using the filter weight magnitudes directly, their logarithm is used as the step gain of each coefficient. Hence the μ-PNLMS algorithm can converge to a steady state effectively for a sparse channel. This algorithm calculates the optimal proportionate weight size in order to achieve the fastest convergence during the whole adaptation process, until the adaptive filter reaches its steady state.
Implementation:
Step 1: The weight update equation for the proposed algorithm is given as:
\hat{h}(n+1) = \hat{h}(n) + \mu \frac{Q(n)\, x(n+1)\, e(n+1)}{\| x(n+1) \|_2^2 + \delta_{MPNLMS}}   (7)
where \delta_{MPNLMS} = \delta_{NLMS} / L   (8)
Step 2: The control value q_l(n), the l-th diagonal element of Q(n), is given as
q_l(n) = \frac{k_l(n)}{\frac{1}{L} \sum_{i=0}^{L-1} k_i(n)}   (9)
Step 3: Finally, the value of k_l(n), which is the main differentiating factor of the proposed algorithm, is given as
k_l(n) = \max\{\rho \times \max\{\gamma,\, F(|\hat{h}_0(n)|), \ldots, F(|\hat{h}_{L-1}(n)|)\},\, F(|\hat{h}_l(n)|)\}   (10)
where
F(|\hat{h}_l(n)|) = \frac{\ln(1 + \mu |\hat{h}_l(n)|)}{\ln(1 + \mu)}   (11)
with |\hat{h}_l(n)| \le 1 and \mu = 1/\varepsilon. The constant 1 is included to avoid an infinite value when |\hat{h}_l(n)| = 0 initially, and the denominator \ln(1 + \mu) normalizes F(|\hat{h}_l(n)|) to the range [0, 1]. The value of ε should be chosen based on the noise level and is usually taken as 0.001.
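A compact sketch of the proposed μ-law proportionate update, combining Eqs. (7)-(11); the step size and the values of ρ, γ and ε are illustrative choices rather than the tuned settings used for the reported results.

import numpy as np

def mu_law_pnlms(x, d, L=256, step=0.5, rho=0.01, gamma=0.01, eps=1e-3, delta=1e-4):
    """mu-law PNLMS adaptation following Eqs. (7)-(11) (illustrative sketch)."""
    mu_law = 1.0 / eps                     # mu = 1/eps used inside F(.)
    h_hat = np.zeros(L)
    buf = np.zeros(L)
    e = np.zeros(len(x))
    for n in range(len(x)):
        buf = np.r_[x[n], buf[:-1]]
        e[n] = d[n] - h_hat @ buf
        F = np.log(1 + mu_law * np.abs(h_hat)) / np.log(1 + mu_law)   # Eq. (11)
        k = np.maximum(rho * max(gamma, F.max()), F)                  # Eq. (10)
        q = k / np.mean(k)                                            # Eq. (9), diagonal of Q(n)
        h_hat = h_hat + step * q * buf * e[n] / (buf @ buf + delta)   # Eq. (7)
    return h_hat, e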
The MATLAB platform is chosen for the simulations. The room impulse response is measured for an enclosure of dimensions 12×12×8 ft³ using an ordinary loudspeaker and a uni-dynamic microphone kept at a distance of 1 ft, sampled at 8 kHz. The channel response is obtained by applying a stationary Gaussian stochastic signal with zero mean and unit variance as input to the measured impulse response.
Fig. 1 shows the MATLAB simulation of acoustic echo cancellation using the proposed algorithm. Fig. 1 has three separate graphs: the first is the original noise present in the system, the second is the original signal along with the noise signal, and the third is the original signal recovered using the proposed algorithm.
5 Conclusion
In this paper we make a detailed study of the existing algorithms for acoustic echo cancellation and find that only proportionate algorithms are capable of acoustic echo cancellation in a sparse channel. Among the existing proportionate algorithms, PNLMS and IPNLMS have fast initial convergence but fail to adapt as time progresses [7], and only the μ-PNLMS algorithm has good steady-state convergence. To further improve the performance of the μ-PNLMS algorithm we propose our own algorithm, which we have shown to have the most efficient steady-state convergence for transmission channels with a sparse impulse response. This is ascertained from our simulation results, which take into consideration performance measurement parameters such as ERLE and NPM. We have found that our algorithm gives the maximum ERLE among all the existing algorithms and also the least NPM value. So we can safely conclude that our proposed algorithm is the most efficient one for acoustic echo cancellation in a sparse medium.
References
1. Verhoeckx, N.A.M.: Digital echo cancellation for base-band data transmission. IEEE Trans.
Acoustic, Speech, Signal Processing 27(6), 768–781 (1979)
2. Haykin, S.: Adaptive Filter Theory, 2nd edn. Prentice-Hall Inc., New Jersey (1991)
3. Khong, A.W.H., Naylor, P.A., Benesty, J.: A low delay and fast converging improved
proportionate algorithm for sparse system identification. EURASIP Journal of Audio
Speech Music Processing 2007(1) (2007)
4. Diniz, P.S.R.: Adaptive Filtering, Algorithms and Practical Implementation. Kluwer
Academic Publishers, Boston (1997)
5. Wang, X., Shen, T., Wang, W.: An Approach for Echo Cancellation System Based on
Improved NLMS Algorithm. School of Information Science and Technology, pp. 2853–
2856. Beijing Institute of Technology, China (2007)
6. Paleologu, C., Benesty, J., Ciochin, S.: An Improved Proportionate NLMS Algorithm based
on the Norm. In: International Conference on Acoustics, Speech and Signal Processing, pp.
309–312 (2010)
7. Deng, H., Doroslovachi, M.: Proportionate adaptive algorithms for network echo
cancellation. IEEE Trans. Signal Processing 54(5), 1794–1803 (2006)
On the Dissimilarity of Orthogonal Least Squares
and Orthogonal Matching Pursuit Compressive Sensing
Reconstruction
1 Introduction
Compressed sensing is an innovative theory in the field of signal processing that aims to reconstruct signals and images from what was previously treated as incomplete information. It uses far fewer measurements than are needed under the Shannon-Nyquist theory of signal reconstruction. The theory rests on the fact that most of the signal information is carried by only a few coefficients, while all the others are discarded when the signal is reconstructed [1]. It therefore gives a way to collect, at the acquisition stage, only those components that contribute to signal recovery. The practical need for this approach comes from the shortage of storage space for the ever-increasing amount of transferred data.
Compressed sensing has been used in many fields such as medical imaging, radar, astronomy and speech processing, wherever the task is to reconstruct a signal. For example, in medical imaging with CT (computerized tomography), an image of the inside of the human body is generated by exposing it to x-rays. For the complete scanning process, the patient is
exposed to radiation for a large span of time, which is harmful. By using compressive sensing, only the required samples are taken, so the scanning time is reduced to a great extent. The mechanism of reconstruction using a sparse matrix is shown in Fig. 1.
Here Y is the reconstructed signal and A' is the random measurement matrix. Gaussian and Bernoulli matrices are mostly used as random measurement matrices. For Gaussian matrices the elements are chosen randomly, independent and identically distributed, with variance 1/K [3]. For Bernoulli matrices [3], each element takes the value −1/√K or +1/√K with equal probability.
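For illustration, such measurement matrices could be generated as follows (function names are hypothetical).

import numpy as np

def gaussian_matrix(K, N, seed=None):
    """K x N Gaussian measurement matrix, i.i.d. entries with variance 1/K."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, 1.0 / np.sqrt(K), size=(K, N))

def bernoulli_matrix(K, N, seed=None):
    """K x N Bernoulli measurement matrix, entries -1/sqrt(K) or +1/sqrt(K) with equal probability."""
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=(K, N)) / np.sqrt(K)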
Two types of algorithms are used for compressed sensing reconstruction: basis pursuit and greedy algorithms. Basis pursuit algorithms are quite simple to implement compared to greedy pursuit, while the greedy algorithms generally used, orthogonal matching pursuit (OMP) and orthogonal least squares (OLS), give faster recovery. These two algorithms are generally taken to be the same, but there is a definite difference between them [8]. For both greedy algorithms, the projection onto the selected signal elements is orthogonal. OLS selects the entry that minimizes the residual norm after the signal's projection onto the selected entries, as shown by Blumensath and Davies in [4], whereas OMP selects the entry that best approximates the current residual; this marks the difference between OLS and OMP. OLS is generally a little more complex than OMP, but because the two algorithms give the same output results, they are often treated as the same by researchers. There has been previous work clarifying the distinction between the two algorithms, but most of it is based on theoretical concepts. In this paper the confusion about the similarity of the two algorithms is removed by showing simulation results of both algorithms on the same signals. The reconstruction time taken by both is also considered.
literature by Huang and Rebollo that the reconstruction time can be reduced to a great extent by using this particular algorithm [13], [15]. Suppose that the column vectors of A are normalized such that
|| A_i ||_2 = 1, for i = 1, 2, …, k
and let A(x) be a subset of A for x ⊂ {1, 2, .., k}. To start the algorithm, the signal support needs to be calculated from the pseudo-inverse A′ of the measurement matrix A, as
P = A′ Y
where A* denotes the complex conjugate transpose of the measurement matrix A and A′ = (A*A)^{-1} A*.
During the implementation a matching operation is performed between the entries of the matrix A and the calculated residuals of P [2, 14]. At the end of all iterations, the complete signal is generated. The implemented algorithm includes the following steps:
1. Initialize the residual of Y as s_0 = y.
2. Initialize the set of selected entries as A(c) = ∅.
3. Start an iteration counter, i = 1.
4. Generate an estimate of the given signal in each iteration.
5. Solve the maximization problem max_j |⟨A_j, s_{i−1}⟩|, where A_j ranges over the candidate columns.
6. Add A_j to A(c), the set of selected columns, and update c after every result.
7. It is to be noted that the algorithm does not minimize the residual error after its orthogonal projection onto the selected values.
8. Update the result after the projection. The selected entry has a minimum error r_n = S − S_n.
9. One coordinate of the signal support for P is calculated at the end, after computing the residual.
10. Set i = i + 1, to end the iterations of the algorithm.
jmax=||A- (A *i A i)-1A||2
Pn= A i-1A
Thus the signal is reconstructed from the inverse problem solution [10].
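For comparison with the steps above, a standard orthogonal matching pursuit loop can be sketched as follows; this is a generic OMP implementation, not the exact multi-stage variant used to produce the results reported here.

import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit (illustrative sketch).

    A : K x N measurement matrix with unit-norm columns
    y : K-dimensional measurement vector
    k : assumed sparsity level (number of iterations)
    """
    residual = y.copy()
    support = []
    x_hat = np.zeros(A.shape[1])
    for _ in range(k):
        # select the column most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # orthogonal projection: least-squares fit of y on the selected columns
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x_hat[support] = coeffs
    return x_hat, support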
The focus of this paper is to highlight the difference between the two mentioned greedy algorithms. An estimate of the matrix A is calculated and its columns are used to compute the inner products required to select elements of the signal. When the number of elements to be selected is much smaller than the number of column vectors of matrix A, OLS is more complex because all elements need to be orthogonalized [4].
Fig. 2. (a) Original randomly generated signal (b) original signal reconstructed using
Orthogonal Least Squares
Fig. 3. (a) Original randomly generated signal (b) original signal reconstructed using
Orthogonal Matching Pursuit
It has been shown previously that the greedy algorithms are faster than basis pursuit compressive sensing reconstruction; this is illustrated in Fig. 2(a),(b) and Fig. 3(a),(b). A comparison of the elapsed reconstruction time and of the PSNR of the output signal is given in Table 1. Both algorithms are also compared with Justin Romberg's basis pursuit algorithm [11].
Table 1. Comparison of elapsed time and PSNR for both the algorithms
In Table 1, the reconstruction time is given in seconds and is measured from the beginning of the generation of the sparse matrix up to the generation of the output signal. A slight difference in PSNR is observed for the same signal recovery using the two algorithms, which marks the difference between them, i.e. proves that they are not the same. The reconstruction results are obtained by implementing the greedy algorithms in a multi-stage manner, with a smaller number of dimensions. The PSNR obtained is higher than for the L-norm (basis pursuit) methods of compressive sensing reconstruction. It is clear from the table that the greedy algorithms are faster than the basis pursuit algorithm.
5 Conclusions
The results in this paper demonstrate that the OLS and OMP algorithms give the same recovery results, but there is a significant difference between the two in terms of how the residual is minimized and how projections are performed, so it is not justified to use one in place of the other. The simulation results obtained are compared on the basis of PSNR and elapsed reconstruction time. The results show that both OLS and OMP provide an impressive means of reconstruction compared to the L-norm or basis pursuit compressive sensing reconstruction.
References
1. Donoho, D.L.: Compressed sensing. Stanford University Department of Statistics
Technical Report (2004-2005)
2. Fornasier, M., Rauhut, H.: Compressive sensing. IEEE Transactions on Information
Theory (2010)
3. Donoho, D.: Compressed sensing. IEEE Transactions on Information Theory 52(4), 1289–
1306 (2006)
4. Blumensath, T., Davies, M.E.: On the difference between orthogonal matching pursuit and
orthogonal least squares (2007)
5. Gillbert, T.: Signal recovery from random measurements via orthogonal matching pursuit.
IEEE Transactions on Information Theory 53(12) (2007)
6. Beck, T.M.: Fast gradient based algorithms for constrained total variation image denoising
and deblurring problems. IEEE Transactions on Image Processing 18, 2419–2434
7. Ehler, M., Fornasier, M., Sigl, J.: Quasi-Linear Compressed Sensing
8. Soussen, C., Gribnovel, R.: Joint k-step analysis of orthogonal matching pursuit and
orthogonal least squares. IEEE Transactions on Information Theory 59(5)
9. Vaswani, N.: LS-CS-Residual (LS-CS): Compressive Sensing on Least Squares Residual.
IEEE Transactions on Signal Processing 58(8) (2010)
10. Vehkapera, M., Kabashima, Y., Chatterjee, S.: Analysis of Regularized LS Reconstruction
and Random Matrix Ensembles in Compressed Sensing
11. Candes, E., Romberg, J.: Sparsity and incoherence in compressive sampling. Inverse
Problems 23(3), 969–985 (2007)
12. Gribonval, S., Herzet, I.: Sparse recovery conditions for orthogonal least squares. In: HAL
2 (2011)
13. Gharavi, H.T.S.: A fast orthogonal matching pursuit algorithm. In: IEEE International
Conference on Acoustics, Speech and Signal Processing, vol. 3 (1998)
14. Kaur, A., Budhiraja, S.: Sparse signal reconstruction via orthogonal least squares. In:
IEEE International Conference on ACCT (2014)
15. Rebollo-Neira, L., Lowe, D.: Optimized orthogonal matching pursuit approach. IEEE
Signal Processing Letters 9(4) (2002)
Machine Learning Based Shape Classification
Using Tactile Sensor Array
1 Introduction
Humans interact with and explore the environment through five main senses viz.
touch, vision, hearing, olfaction and taste. Exploiting one or a combination of these
senses, humans discover new and unstructured environments. Touch or tactile sensors
are an efficient way to replicate the human touch sensation which includes different
parameters such as texture, shape, softness, shear and normal forces etc. Development
of the artificial touch sensation system can broadly be categorized into four parts -
Development of contact shape and size detection system, Development of
hardness/rigidity analysis system, Development of texture analysis system and bio-
mimicking of human exploration. This paper deals with the first part of tactile 3D data
mapping i.e. shape classification using tactile sensor array by machine learning
algorithms.
2 Related Work
A comprehensive review of the tactile sensing and its potential applications was done
by Tiwana Mohsin et al. [1] in 2012 which covered the significant works in Tactile
sensing based system design. Several results have been reported in the literature for
contact shape classification. Primitive shape classification using a tactile sensor array has been implemented with naïve Bayes [2] in the domain of robotics, in which the authors extracted the structural nature of the object from its pressure map using the covariance between the pressure values and the coordinates of the map at which the values occur. In [3], Liu et al. used neural networks to classify shapes based on a 512-element feature vector derived from the pressure map. Further works reported by Tapamayukh et al. [4] and Pezzementi et al. [5] discuss a robotic gripper which used the tactile contact shape as a feature for object classification, whereas in [6] recognition and localization of 2D shapes with minimal tactile data is analysed from an analytical perspective. Aoyagi et al. [7] used a neural network model for shape classification into 8 shape categories with data from an arrayed capacitive tactile sensor. The issues of tactile array development and shape estimation methodologies are detailed in the survey conducted by Dahiya et al. [8].
Shape classification using machine learning techniques requires a set of geometric
features to be extracted and transformed (into Fourier descriptors, Autoregressive
models, Features) before feeding the training data to the machine learning algorithm
[9], to learn relevant distinguishing features of a shape rather than taking the entire set
of pixel intensities. The techniques employed to represent and classify shapes can be
broadly divided into contour-based or region-based [10], depending on whether the attributes related to the shape were derived from its boundary or from the entire area covered by the shape. Contour-based and region-based techniques are further sub-divided into
global and structural techniques, based on whether the entire contour or region was
taken into account or subsections (“segments”) of the same were used.
In this work we focus on the challenges posed when two-dimensional pressure profile data needs to be processed as a digital image (including removal of outliers and applying appropriate thresholding to convert the pressure values to a black-and-white binary image) in order to identify shapes that lie under the sensor area. The processed digital image is further classified using a learning system in order to cater for irregularities associated with translation and rotation. The rest of the paper is organised as follows: Section 3 elaborates the methodology, Section 4 details the experimental setup, results and interpretations are analysed in Section 5, and finally the paper concludes in Section 6.
3 Methodology
The architecture of the system uses a 2D capacitive tactile array with 256 elements arranged in a square. The signal excitations from the sensors are fed to a signal processing block which processes the output signals and converts them to analog signals at a 0-5 V level, with a subsequent analog-to-digital converter which converts the analog signal to 8-bit digital data for further processing. The tactile array used for the proposed system was developed by Pressure Profile Systems, with a pressure range of 0-60 kPa and a minimum sensitivity of 0.8 kPa. The 16×16 tactile array data is further used for contact shape recognition and classification. The block schematic of the methodology is shown in Fig. 1.
The feature vector used for contact shape recognition and classification is obtained using the region-based and contour-based structural properties of the shapes to be classified, computed on the normalized and preprocessed taxel contact pattern. The different elements of the feature vector [12] are:
Average distance between the centroid of the contact region and the edge points. The centroid/centre of gravity of the shape is obtained by Eqs. (2) and (3), where x and y are the taxel positions with maxima n and m respectively, l(x,y) is the taxel intensity at position (x,y) and (Xc, Yc) is the centroid position.
X_c = \sum_{x=1}^{n} \sum_{y=1}^{m} l(x,y)\, x \Big/ \sum_{x=1}^{n} \sum_{y=1}^{m} l(x,y)   (2)
Y_c = \sum_{x=1}^{n} \sum_{y=1}^{m} l(x,y)\, y \Big/ \sum_{x=1}^{n} \sum_{y=1}^{m} l(x,y)   (3)
and the average centroidal distance is obtained by Eq. (4), where (X_i, Y_i) are the locations of the pixels along the shape contour:
Avgdist = \frac{1}{c} \sum_{i=1}^{c} \sqrt{(X_c - X_i)^2 + (Y_c - Y_i)^2}   (4)
For different shapes, the variation of the contour edge points about the average radius gives the second moment of the contact shape. The average and maximum variance of the contour points with respect to the average radial distance can be used as a structural feature for object classification, given by:
Pvar = \sum_{i=1}^{c} \Big( Avgdist - \sqrt{(X_c - X_i)^2 + (Y_c - Y_i)^2} \Big)   (5)
The eccentricity of the foreground image, the ratio between the lengths of the major and minor axes, is invariant to rotation and translation of the contact shape.
The ratio of the actual area of the contact taxels to the bounding-box area is another element of the feature vector. The bounding box is the minimum rectangular region that covers the entire contact region; the ratio therefore has its maximum value for the square and its least value for the ring.
The ratio of perimeter² to area gives information about the solidity of the contact surface and is thereby a region-specific parameter rather than a contour-based one [13].
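A sketch of how the centroid-based elements of this feature vector could be computed from a 16×16 taxel image is given below; the contour extraction is a simple four-neighbour test and the function name is hypothetical.

import numpy as np

def centroid_features(taxel_img):
    """Centroid, average centroidal distance and radial variance of a taxel image
    (sketch of Eqs. (2)-(5)); taxel_img is a 2-D array of pressure intensities."""
    ys, xs = np.nonzero(taxel_img)                    # contact taxel positions
    inten = taxel_img[ys, xs].astype(float)           # l(x, y) at those positions
    xc = np.sum(inten * xs) / np.sum(inten)           # Eq. (2)
    yc = np.sum(inten * ys) / np.sum(inten)           # Eq. (3)
    # crude contour: foreground taxels with at least one background 4-neighbour
    fg = np.pad(taxel_img > 0, 1)
    inner = fg[:-2, 1:-1] & fg[2:, 1:-1] & fg[1:-1, :-2] & fg[1:-1, 2:]
    cy, cx = np.nonzero(fg[1:-1, 1:-1] & ~inner)
    dists = np.sqrt((xc - cx) ** 2 + (yc - cy) ** 2)  # centroid-to-contour distances
    return xc, yc, dists.mean(), dists.var()          # Eq. (4) and Eq. (5) style measures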
strategies to classify the shapes [14]. The reason for choosing these algorithms is that they are relatively easy to analyse and should be sufficient to solve the problem of classifying regular shapes, given that we have already identified and extracted translation, rotation and scale invariant features of the shape.
Decision trees are logic-based classifiers and work by sorting feature values to classify instances. Nodes in the tree represent features of the instance and the edges of the tree denote the values the nodes can assume; the root of every sub-tree is chosen such that its feature divides the data in the best possible way. Naïve Bayes, on the other hand, is a statistical learning algorithm which generates a directed acyclic graph with only one parent (the feature to be classified) and several children consisting of the observed features; it is assumed that the child nodes are independent of one another [15].
There are various strategies to combine classifiers to achieve better results; Voting
being one of them [16]. In this paper we have used a very simple voting mechanism
that combines the predictions of two or more classifiers using average of probabilities.
The final prediction of the voting classifier is calculated at run time after the models
of candidate algorithms have been generated.
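A minimal sketch of such a soft-voting combination, using scikit-learn's decision tree (a CART learner standing in for C4.5) and naive Bayes, with the class probabilities of the two base classifiers averaged.

from sklearn.ensemble import VotingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# voting="soft" averages the predicted class probabilities of the base learners,
# mirroring the "average of probabilities" combination described above.
vote = VotingClassifier(
    estimators=[("tree", DecisionTreeClassifier()), ("nb", GaussianNB())],
    voting="soft",
)
# vote.fit(X_train, y_train); vote.predict(X_test)   # X_*, y_* hold the extracted shape features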
4 Experimental Set Up
The basic shapes into which the objects are classified are square, triangle, circle and
ring. The objects are carved out of wood. During data collection, the objects are
placed in different orientations and positions on the tactile sensor and 200 gm weight
is placed on them. Since we are interested in classifying two dimensional shapes only,
we have placed the sensor array on a flat wooden table (as opposed to placing the
sensor array on a sponge to get surface contour information). Appropriate class labels
are attached to each record in the processed data derived after image processing. We
have collected completely separate records with same objects (varying orientation and
position) for training and testing. In this paper we have reported the percentage of
correctly classified instances, model building time and the confusion matrices for the
separate test cases. All experiments were performed on a Pentium 4 processor 2.7GHz
with 1 GB RAM. Fig. 2 (a) shows the conformable tactile array sensor from Pressure
Profile system while Fig. 2 (b) gives the wooden pieces of sample shapes with
different sizes used for the experiment. WEKA learning system [17] is used for the
initial training and testing before real time implementation.
Fig. 2.(a) Conformable tactile sensor array; Fig. 2.(b) Sample shapes and weight used
for experimentation
The raw tactile data obtained from tactile array is fed to the Mathworks environment
for image enhancement and feature extraction which is further fed to the WEKA with
corresponding class of shape. Fig. 3 shows the different contact patterns for different
objects at different stages of image enhancement before feature extraction.
Fig. 3. (a) 8 bit tactile image of square object; (b) Tactile image after convolution and
thresholding ; (c) Final image after morphological operations
The enhanced image is an eroded and dilated image containing the single largest connected component in the tactile image. In this section we report the percentage of correctly classified instances, the relative absolute error and the confusion matrices for C4.5, naive Bayes and voting, for a 66% split (of the entire dataset for training and the rest for testing) as well as for 10-fold cross-validation. Table 1 shows the confusion matrices for classifying square (a), triangle (b), circle (c) and ring (d).
Table 1. Confusion matrices on the separate test dataset for (a) the C4.5 classifier, (b) the naive Bayes classifier and (c) the voting classifier (rows: actual class, columns: predicted class; a = square, b = triangle, c = circle, d = ring)

(a) C4.5              (b) Naive Bayes       (c) Voting
    a  b  c  d            a  b  c  d            a  b  c  d
a   8  0  0  0        a   1  7  0  0        a   8  0  0  0
b   1  8  0  1        b   0 10  0  0        b   0  9  0  1
c   0  0  5  0        c   0  0  5  0        c   0  0  5  0
d   0  0  0  4        d   0  0  0  4        d   0  0  0  4
Table 1(a) indicates that C4.5 can correctly classify square, circle and ring, but not
triangles. C4.5 has an accuracy of 92 % for correctly classifying instances and a
relative absolute error of 12 %. On the other hand the confusion matrix for naïve
Bayes classifier indicates that the classifier correctly classifies triangle but not square
as shown in the Table 1 (b). The naïve Bayes classifier has an accuracy of 74 % with
relative absolute error of 34 %. This led us to apply the voting strategy using C4.5 and
the naïve Bayes classifier and applying average probabilities for class predictions.
Table 1(c) shows the confusion matrix for voting classifier.
Table 2. Confusion matrices for 10-fold cross-validation for (a) the C4.5 classifier, (b) the naive Bayes classifier and (c) the voting classifier (rows: actual class, columns: predicted class)

(a) C4.5              (b) Naive Bayes       (c) Voting
    a  b  c  d            a  b  c  d            a  b  c  d
a  17  4  0  0        a   7 14  0  0        a  17  4  0  0
b   7 20  0  1        b   6 22  0  0        b   5 22  0  1
c   0  0 18  0        c   2  0 16  0        c   0  0 18  0
d   0  0  0 12        d   0  0  0 12        d   0  0  0 12
In this paper we have attempted to solve the problem of shape classification (circle, square, triangle and ring) using tactile sensors, with the help of feature extraction and machine learning algorithms. The methodology comprises capturing the pressure profile of an object and storing it in a 16-by-16 array corresponding to the 256 tactile elements. This array is then processed as an image and converted into a black-and-white monochromatic image using thresholding; subsequently, unwanted noise is removed from the image using erosion and dilation. Distinguishing features are then extracted from the image using geometric operations on the data. We have chosen features which are independent of scaling, translation and rotation. Moreover, the feature extraction step has successfully reduced the size of the dataset from 256 attributes to 5 attributes for the same 79 instances.
The dataset consisting of these features has been used as input to the C4.5 and naive Bayes algorithms for shape classification. A voting strategy whereby a combination
References
1. Tiwana, M., Redmond, S., Lovell, N.: A review of tactile sensing technologies with
applications in biomedical engineering. Sensors and Actuators A: Physical 179, 17–31
(2012)
2. Liu, H., Song, X., Nanayakkara, T., Seneviratne, L., Althoefer, K.: A computationally fast
algorithm for local contact shape and pose classification using a tactile array sensor. In:
Proceedings of International Conference on Robotics and Automation (ICRA), pp. 1410–
1415 (2012)
3. Liu, H., Greco, J., Song, X., Bimbo, J., Seneviratne, L., Althoefer, K.: Tactile image based
contact shape recognition using neural network. In: Multisensor Fusion and Integration for
Intelligent Systems (MFI), pp. 138–143 (2012)
4. Bhattacharjee, T., Rehg, J., Kemp, C.: Haptic classification and recognition of objects
using a tactile sensing forearm. In: Intelligent Robots and Systems (IROS), pp. 4090–4097
(2012)
5. Pezzementi, Z., Plaku, E., Reyda, C., Hager, G.: Tactile-object recognition from
appearance information. Robotics 27, 473–487 (2011)
6. Ibrayev, R., Jia, Y.: Tactile recognition of algebraic shapes using differential invariants.
Robotics and Automation 2, 1548–1553 (2004)
7. Aoyagi, S., Tanaka, T., Minami, M.: Recognition of Contact State of Four Layers Arrayed
Type Tactile Sensor by using Neural Network. In: Information Acquisition, pp. 393–397
(2006)
8. Dahiya, R., Metta, G., Valle, M., Sandini, G.: Tactile sensing—from humans to
humanoids. Robotics 26, 1–20 (2010)
9. Kauppinen, H., Seppanen, T., Pietikainen, M.: An experimental comparison of
autoregressive and Fourier-based descriptors in 2D shape classification. IEEE Transactions
in Pattern Analysis and Machine Intelligence 17, 201–207 (1995)
10. Zhang, D., Lu, G.: Review of shape representation and description techniques. Pattern
Recognition 37, 1–19 (2004)
11. Gonzalez, R., Woods, R., Eddins, S.: Digital image processing using MATLAB, 2nd edn.
Gatesmark Publishing, Knoxville (2009)
12. Peura, M., Iivarinen, J.: Efficiency of simple shape descriptors. In: Aspects of Visual
Form, pp. 443–451 (1997)
13. Lu, G., Sajjanhar, A.: Region-based shape representation and similarity measure suitable
for content-based image retrieval. Multimedia Systems 7, 165–174 (1999)
14. Quinlan, J.: C4. 5: Programs for Machine Learning 1 (1993)
15. Good, I.: Probability and the Weighing of Evidence, London (1950)
16. Lim, T., Loh, W., Shih, Y.: A comparison of prediction accuracy, complexity, and training
time of thirty-three old and new classification algorithms. Machine Learning 40, 203–228
(2000)
17. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.: The WEKA
Data Mining Software: An Update. ACM SIGKDD Explorations Newsletter 11, 10–18
(2009)
Multi-view Ensemble Learning for Poem Data
Classification Using SentiWordNet
1 Introduction
learning, where canonical correlation analysis (CCA) [26] and co-training [27] are two common representative techniques. Multi-view learning is strongly connected to other machine learning topics such as active learning, ensemble learning and domain adaptation [24]. It has been applied in supervised learning [33], semi-supervised learning [34], ensemble learning [35], active learning [36], transfer learning [37], clustering [38] and dimensionality reduction [39]. Many applications of multi-view learning to text document classification are mentioned in the literature [6-8].
Poems are short textual paragraphs that express emotion, and the feelings of the poet are expressed through the sentiments of words. Lexicons have been used to extract the sentiment information of individual words, and the sentiment information of a document can be considered as document features to enhance the classification task. SentiWordNet [2] provides sentiment information for English words in the form of numeric scores (positive and negative) corresponding to the words; it has therefore been utilized to extract document sentiment information. Many applications of SentiWordNet for extracting sentiment information from textual documents have been reported in the literature [3-5]. The work in this paper has two objectives: the use of document sentiment information, extracted from SentiWordNet, as document features, and the application of multi-view ensemble learning to the poem data to enhance the classification task.
This paper is organized as follows: Section 2 briefly outlines related work. Preprocessing is presented in Section 3. Classification using multi-view ensemble learning is described in Section 4. The experimental setup and results are described in Section 5, the analysis of the results in Section 6, and finally the conclusions of this research are given in Section 7.
2 Related Work
Many text classification algorithms have been proposed in the literature to extract knowledge from unstructured data. Applications of text classification algorithms include categorizing emails [9], patents [10], magazine articles [11], news [12] etc. To classify text documents, many machine learning algorithms have been applied, such as k-nearest neighbor classifiers [13], Bayesian classifiers [14], regression models [15] and neural networks [16]. Malay poetry (pantun) has been classified by theme, as well as into poetry and non-poetry [17], using support vector machines (radial basis function and linear kernel functions). In different case studies, various measures of poetry have been introduced using fuzzy logic, the Bayesian approach and decision-based methods [18]. It is important to identify which machine learning techniques are appropriate for poem data; the SVM classifier has performed best among the k-nearest neighbor (KNN) and naïve Bayes (NB) classifiers [19].
Wei and Pal [28] and Wan et al. [29] have proposed domain adaptation techniques. To solve the cross-language text classification problem, the target domain contains the original documents in the target language and the source domain contains the translated documents in the source languages; the different languages can thus be seen as different views of the original document. Multi-view majority voting [30], co-training [31], and multi-view co-classification [32] have been designed and successfully applied. Multiple views of text data have also been proposed for learning purposes using multi-view learning [8], [20-21].
The mood of lyrics has been classified using SentiWordNet by adding supplementary sentiment features to the lyrics documents [22]. SentiWordNet has been applied to sentiment classification of reviews [3], financial news [4], multi-domain sentiment analysis [23] and multilingual analysis. A domain-independent sentiment analysis method using SentiWordNet has been proposed, along with a method to classify subjective and objective sentences from reviews and blog comments [5]. According to current research related to poems, the analysis of poem data with multiple views that include document sentiment information has not been done for classification purposes. Therefore, this paper investigates document sentiment information as document features and multi-view learning for the classification task on the poem data.
3 Preprocessing
X = V_1 \cup V_2 \cup \ldots \cup V_k   (1)
\Psi\big(f_1(x), f_2(x), \ldots, f_k(x)\big) = f(x)   (2)
where \Psi may represent an ensemble function that combines the predictive models f_i(x) obtained from each of the views, say V_i for i = 1, 2, 3, \ldots, k.
In the literature, ensemble by vote or ensemble by weight has been commonly cited. Therefore, in this paper ensemble by weight has been considered. It can be mathematically represented as in Eq. (3):
f(x) = \arg\max_{c \in C(x)} \Big( \sum_{i=1}^{k} w_i\, \delta(f_i(x), c) \Big)   (3)
The loss for an example x_i is defined as
l(x_i) = \begin{cases} 0, & f(x_i) = y_i \\ 1, & f(x_i) \ne y_i \end{cases}   (5)
[Figure: multi-view ensemble learning framework — dataset creation, views 1 to k, and the classifier models f_1(x), …, f_k(x) induced from each view]
A single unit is assigned by the loss function for each misclassification. Therefore, the accuracy of the classification can be defined as in Eq. (6):
Acc(f) = 1 - \frac{1}{|D|} \sum_{i=1}^{|D|} l(x_i)   (6)
Eq. (6) shows that minimization of the loss leads to maximum classification accuracy.
Generally, in multi-view ensemble learning, optimal classification accuracy depends
upon three factors: 1) Feature set partitioning criteria, 2) Ensemble method, and
3) Classifier. Therefore, for a specific problem, the empirical study may help in
selection of best feature set partitioning criteria, ensemble method and the classifica-
tion algorithms.
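As a minimal sketch of such a multi-view ensemble, the snippet below splits the (ranked) features into k disjoint views, trains one SVM per view and averages the class probabilities with equal view weights; all names and the partitioning scheme are illustrative assumptions.

import numpy as np
from sklearn.svm import SVC

def multiview_ensemble_predict(X_train, y_train, X_test, k=3):
    """Train one SVM per feature-set partition (view) and combine the view models
    by averaging their class probabilities (illustrative sketch, equal weights)."""
    views = np.array_split(np.arange(X_train.shape[1]), k)   # simple disjoint feature partition
    probas = []
    for cols in views:
        clf = SVC(probability=True).fit(X_train[:, cols], y_train)
        probas.append(clf.predict_proba(X_test[:, cols]))
    avg = np.mean(probas, axis=0)                            # ensemble of the k view models
    classes = np.unique(y_train)                             # same ordering as predict_proba columns
    return classes[np.argmax(avg, axis=1)]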
In [19], a dataset was used consisting of four hundred poems collected from various websites over the internet. The poems belong to one of eight classes, namely alone, childhood, God, hope, love, success, valentine and world, and each class is represented by fifty poem documents. Preprocessing of the text results in 3707 features. Further sentiment feature measures, in terms of PosNeg and PosNeg_Ratio, are extracted for each of the four hundred text documents. The SVM-based classifiers yielded the best results for the same poem dataset [19]. To explore multi-view ensemble learning, the experiments in this paper have been carried out with the SVM for the single, 2, 3, 4 and 5-view partitions of the poem dataset; the corresponding classification models have been termed S1, S2, S3, S4 and S5. 10-fold cross validation has been carried out with stratified sampling over 1000 iterations to estimate the performance of the proposed ensemble classifiers, and the average classification accuracy observed over the 1000 iterations is presented for analysis.
To study the effect of the sentiment features (the positive score, the negative score and the ratio of the positive and negative scores for each document), the experiments have been designed using the dataset without the sentiment features, namely PosNeg_Without, and with sentiment information, namely PosNeg and PosNeg_Ratio. The accuracy of the ensemble classifiers for k views, 1 ≤ k ≤ 5, corresponding to each of the multi-view samples S1, S2, S3, S4 and S5, is presented in Fig. 2. For each of the multi-view samples of the dataset, the classifier accuracy has been observed both without and with the sentiment information. For instance, the ensemble learning using the classifiers induced from the 3-view dataset is denoted by S3, and the accuracy of these classifiers is presented by the bar chart corresponding to S3.
In order to facilitate the feature-set partitioning, attribute relevance analysis using the gain ratio was performed to rank the features. Single-view classifiers including the sentiment information and the top 10%, 20%, 30%, …, 100% of the ranked attributes have been induced using SVM. The performance of all ten classifiers is shown in Fig. 3. Since the performance of the classifier using the sentiment information and the top 20% attributes is only marginally less than that of the classifier using the sentiment information and the top 30% attributes, the experiments using the five multi-view ensemble classifiers have been performed on the dataset with respect to the top 20% of the ranked attributes. The performance comparison of these classifiers is exhibited in Fig. 4.
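The gain-ratio ranking can be sketched as follows for discrete (or pre-binned) features; the binning of continuous term weights and the 20% cut-off are the only assumptions beyond what is stated above.

import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gain_ratio(feature, labels):
    values, counts = np.unique(feature, return_counts=True)
    w = counts / counts.sum()
    info_gain = entropy(labels) - sum(wi * entropy(labels[feature == v])
                                      for wi, v in zip(w, values))
    split_info = -np.sum(w * np.log2(w))
    return info_gain / split_info if split_info > 0 else 0.0

def top_fraction(X, y, fraction=0.2):
    # rank all features by gain ratio and keep the top fraction (e.g. the top 20%)
    scores = np.array([gain_ratio(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][: max(1, int(fraction * X.shape[1]))]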
Fig. 2. Classification accuracy (%) of the classifiers for the views S1–S5 of the poem data, with the PosNeg_Without, PosNeg and PosNeg_Ratio feature sets

Fig. 3. Classification accuracy (%) of the single-view classifiers using the top 10%–100% ranked attributes, with the PosNeg_Without, PosNeg and PosNeg_Ratio feature sets
Fig. 4. Ensemble multi-view learning performance of the classifiers using the top 20% ranked attributes
6 Analysis
The objective of this research was to use document sentiment information for enhancing the classification performance and to apply multi-view learning to poem data classification. From Fig. 2, which shows the classification accuracy of all the multi-view classifiers S1, S2, S3, S4 and S5, it may be observed that the accuracy of the classifiers including the positive and negative scores, namely the PosNeg models, is the least. On the other hand, all the multi-view classifiers that include the ratio of the sentiment scores perform best for the poem dataset. From Fig. 3 it is observed that the performance of the classifiers using the top 20% ranked attributes is better than that of the classifier using the top 10% attributes. Secondly, the performance of the classifier based on the top 20% is only marginally less than the accuracy of the classifier using the top 30% attributes. Further, no improvement in the accuracy of the classifiers using the top 30%, 40% and up to 100% of the attributes has been observed. The accuracy of the classifiers using the ratio of the positive and negative scores is higher than that of the classifier without sentiment features and of the classifier using the positive and negative scores. This observation may be utilized as a partitioning criterion and for dimension reduction if needed.
From Fig. 4, it is observed that the PosNeg_Ratio sentiment feature enhances the performance of all five multi-view ensemble classifiers as compared to the five corresponding multi-view classifiers without the sentiment features. Although the accuracy of the 2-view ensemble classifier is less than that of the single-view classifier and the 3, 4 and 5-view classifiers, performance in terms of accuracy alone is not adversely impacted by the multi-view ensemble approach.
7 Conclusion
The ratio of the positive and negative scores of a document is a useful sentiment feature for classification. For datasets with a large number of attributes in comparison to the number of data samples, multi-view ensemble learning provides a novel classification method. If attribute relevance analysis is performed, it may be possible to create multiple views using the ranking information of the attributes, besides achieving feature reduction. A few attributes, such as the top 20–30% ranked attributes, may be sufficient for the classification task in comparison to the entire feature set. Therefore, multi-view learning for classifying datasets such as the poem document data provides a useful option for the classification task. For future work, more sentiment features of the poem data and of other such datasets may be extracted. Partitioning criteria for multi-view creation and suitable ensemble methods for the classifiers may be explored. An empirical study of the right number of views for a given dataset is essential for improved classification.
References
1. http://oxforddictionaries.com/definition/english/poem
2. Andrea, E., Sebastiani, F.: SentiWordNet: A Publicly Available Lexical Resource for Opi-
nion Mining. In: Proceedings of the 5th Conference on Language Resources and Evalua-
tion, pp. 417–422 (2006)
3. Hamouda, A., Rohaim, M.: Reviews Classification Using SentiWordNet Lexicon. In:
World Congress on Computer Science and Information Technology (2011)
4. Devitt, A., Ahmad, K.: Sentiment Polarity Identification in Financial News: A Cohesion-
based Approach. In: Proceedings of the 45th Annual Meeting of the Association of Com-
putational Linguistics, Prague, pp. 984–991 (2007)
5. Aurangzeb, K., Baharum, B., Khairullah, K.: Sentence Based Sentiment Classification
from Online Customer Reviews. In: FIT (2010)
6. Amini, M., Usunier, N., Goutte, C.: Learning from Multiple Partially Observed Views -
An Application to Multilingual Text Categorization. In: Advances in Neural Information
Processing Systems (2009)
7. Yuhong, G., Min, X.: Cross Language Text Classification via Subspace Co-Regularized
Multi-View Learning. In: Proceedings of the 29th International Conference on Machine
Learning, Edinburgh, Scotland, UK (2012)
8. Ping, G., QingSheng, Z., Cheng, Z.: A Multi-view Approach to Semi-supervised Docu-
ment Classification with Incremental Naive Bayes. Computers & Mathematics with Appli-
cations 57(6), 1030–1036 (2009)
9. Androutsopoulos, I., Koutsias, J., Chandrinos, K.V., Spyropoulos, C.D.: An Experimental
Comparison of Naive Bayesian and Keyword-Based Anti-Spam Filtering with Personal E-
mail Messages. In: Proceedings of the 23rd Annual Int. ACM SIGIR Conference on Re-
search and Development in Information Retrieval, pp. 160–167 (2000)
10. Richter, G., MacFarlane, A.: The Impact of Metadata on the Accuracy of Automated Pa-
tent Classification. World Patent Information 37(3), 13–26 (2005)
11. Moens, M.F., Dumortier, J.: Text Categorization: the Assignment of Subject Descriptors to
Magazine Articles. Information Processing and Management 36(6), 841–861 (2000)
12. Shih, L.K., Karger, D.R.: Learning Classifiers: Using URLs and Table Layout for Web
Classification Tasks. In: Proceedings of the 13th International Conference on World Wide
Web, pp. 193–202 (2004)
13. Yang, Y.: An Evaluation of Statistical Approaches to Text Categorization. Information Re-
trieval 1(1-2), 67–88 (1999)
14. Scheffer, T.: Email Answering Assistance by Semi- Supervised Text Classification. Intel-
ligent Data Analysis 8(5), 481–493 (2004)
15. Zhang, T., Oles, F.: Text Categorization Based on Regularized Linear Classifiers. Informa-
tion Retrieval 4, 5–31 (2001)
16. Lee, P.Y., Hui, S.C., Fong, A.C.: Neural networks for web content filtering. IEEE Intelli-
gent System 17(5), 48–57 (2002)
17. Noraini, J., Masnizah, M., Shahrul, A.N.: Poetry Classification Using Support Vector Ma-
chines. Journal of Computer Science 8(9), 1441–1446 (2012)
18. Tizhoosh, H.R., Dara, R.A.: On Poem Recognition. Pattern Analysis Application 9(4),
325–338 (2006)
19. Vipin, K., Sonajharia, M.: Poem Classification using Machine Learning Approach. In: Ba-
bu, B.V., Nagar, A., Deep, K., Bansal, M.P.J.C., Ray, K., Gupta, U. (eds.) SocPro 2012.
AISC, vol. 236, pp. 675–682. Springer, Heidelberg (2014)
20. Yuhong, G., Min, X.: Cross Language Text Classification via Subspace Co-regularized
Multi-view Learning. In: ICML (2012)
21. Massih-Reza, A., Nicolas, U., Cyril, G.: Learning from Multiple Partially Observed Views
- an Application to Multilingual Text Categorization. In: NIPS, pp. 28–36 (2009)
22. Vipin, K., Sonajharia, M.: Mood Classification of Lyrics using SentiWordNet. In: ICCCI
2012. IEEE Xplore (2013)
23. Kerstin, D.: Are SentiWordNet Scores Suited for Multi-Domain Sentiment Classification?
In: ICDIM, pp. 33–38 (2009)
24. Chang, X., Dacheng, T., Chao, X.: A Survey on Multi-view Learning. CoRR
abs/1304.5634 (2013)
25. Shiliang, S.: A Survey of Multi-view Machine Learning. Neural Computing & Applica-
tion. Springer, London (2013)
26. Hotelling, H.: Relations between Two Sets of Variates. Biometrika 28, 321–377 (1936)
27. Blum, A., Mitchell, T.: Combining Labeled and Unlabeled Data with Co-training. In: Pro-
ceedings of the 11th Annual Conference on Computational Learning Theory, pp. 92–100
(1998)
28. Wei, B., Pal, C.: Cross Lingual Adaptation: An Experiment on Sentiment Classification.
In: Proceedings of the ACL 2010 Conference Short Papers, pp. 258–262. Association for
Computational Linguistics (2010)
29. Wan, C., Pan, R., Li, J.: Bi-weighting Domain Adaptation for Cross-language Text Classi-
fication. In: Twenty-Second International Joint Conference on Artificial Intelligence
(2011)
30. Massih-Reza, A., Nicolas, U., Cyril, G.: Learning from Multiple Partially Observed Views
- an Application to Multilingual Text Categorization. In: NIPS 2009, pp. 28–36 (2009)
31. Wan, X.: Co-training for Cross-lingual Sentiment Classification. In: Proceedings of the
Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint
Conference on Natural Language Processing of the AFNLP, vol. 1, pp. 235–243. Associa-
tion for Computational Linguistics (2009)
32. Amini, M.R., Goutte, C.: A Co-classification Approach to Learning from Multilingual
Corpora. Machine Learning 79(1), 105–121 (2010)
33. Chen, Q., Sun, S.: Hierarchical Multi-view Fisher Discriminant Analysis. In: Leung, C.S.,
Lee, M., Chan, J.H. (eds.) ICONIP 2009, Part II. LNCS, vol. 5864, pp. 289–298. Springer,
Heidelberg (2009)
34. Sun, S.: Multi-view Laplacian Support Vector Machines. In: Tang, J., King, I., Chen, L.,
Wang, J. (eds.) ADMA 2011, Part II. LNCS, vol. 7121, pp. 209–222. Springer, Heidelberg
(2011)
35. Xu, Z., Sun, S.: An Algorithm on Multi-view Adaboost. In: Wong, K.W., Mendis, B.S.U.,
Bouzerdoum, A. (eds.) ICONIP 2010, Part I. LNCS, vol. 6443, pp. 355–362. Springer,
Heidelberg (2010)
36. Muslea, I., Minton, S., Knoblock, C.: Active Learning with Multiple Views. Journal of Artificial Intelligence Research 27, 203–233 (2006)
37. Xu, Z., Sun, S.: Multi-view Transfer Learning with Adaboost. In: Proceedings of the 23rd
IEEE International Conference on Tools with Artificial Intelligence, pp. 399–340 (2011)
38. De Sa, V., Gallagher, P., Lewis, J., Malave, V.: Multi-view Kernel Construction. Machine
Learning 79, 47–71 (2010)
39. Chen, X., Liu, H., Carbonell, J.: Structured Sparse Canonical Correlation Analysis. In:
Proceedings of the 15th International Conference on Artificial Intelligence and Statistics,
pp. 199–207 (2012)
40. http://wordnet.princeton.edu/
41. Jiawei, H., Micheline, K.: Data Mining Concepts and Techniques, 2nd edn. Elsevier
(2006)
A Prototype of an Intelligent Search Engine
Using Machine Learning Based Training
for Learning to Rank
1 Introduction
2 Related Work
In any ranking model, the ranking task is performed by using a ranking model f(q, d) to sort the documents, where q denotes a query and d denotes a document. Traditionally, the ranking model f(q, d) is created without training. In the well known Okapi BM25 model [3], for example, it is assumed that f(q, d) is represented by a conditional probability distribution P(r | q, d), where r takes on 1 or 0 as value and denotes being relevant or irrelevant, and q and d denote a query and a document respectively. In the Language Model for IR (LMIR), f(q, d) is represented as a conditional probability distribution P(q | d). These probability models can be calculated from the words appearing in the query and the document, and thus no training is needed (only tuning of a small number of parameters is necessary) [1].
It is a fact that most users are uninterested in more than the first few results on a search engine results page (SERP). Taking this into consideration, the Learning to Rank from user feedback model [2] considers each query independently. Radlinski et al. [2] proposed a model in which the log files provide implicit feedback only on a few results at the top of the result set for each query. They referred to this as "a sequence of reformulated queries or query chains". They state that these query chains, which are available in search engine log files, can be used to learn better retrieval functions.
Several other works present various techniques for collecting implicit feedback from click-through logs [4, 8, 11]. All are based on the concept of the click-through rate (CTR), which considers the fact that documents clicked on in search results are highly likely to be very relevant to the query. This can then be considered as a form of implicit feedback from users and can be used for improving the ranking function. Kemp et al. [4] present a learning search engine that is based on actually transforming the documents. They too use the fact that clicked results are relevant to the query and append the query to these documents. However, other works [9, 10] showed that implicit click-through data is sometimes biased, as it is relative to the retrieval function quality and ordering. Hence it cannot be considered to be absolute feedback. Some studies [13, 14] have attempted to account for the position bias of clicks. Carterette and Jones [12] proposed to model the relationship between clicks and relevance so that clicks can be used to evaluate search engine performance without bias
when there is a lack of editorial relevance judgment. Other research [14, 5, 6] attempted to model user click behavior during search so that future clicks may be accurately predicted based on observations of past clicks.
RankProp [7] is a neural net based ranking model. It basically uses two processes - an MSE regression on the current target values, and an adjustment of the target values themselves to reflect the current ranking given by the net. The end result is a mapping of the data to a large number of targets which reflect the desired ranking. RankProp has the advantage that it is trained on individual patterns rather than pairs; however, the authors do not discuss the conditions under which it converges, and it also does not provide a probabilistic model.
3 Proposed System
Fig. 1 depicts the architecture of the proposed system. The Swappers search engine API is built using HTML 5 and AJAX. The database is created using MySQL, and PHP is used to connect the API to the database.
The search engine is trained using two supervised machine learning algorithms, namely selection based and review based. The tags/weights are calculated to rank the links in the training data-set. Both algorithms follow the inclusion of different heuristics for the same. The weight of the link is determined by the frequency of the keyword in the content of the link and the position where it occurs. Also, heuristics like whether the keyword is written in bold or italics; the position where it occurs, e.g. in the page title, headings, metadata etc.; and the number of outgoing links having the keyword in the URL are considered while calculating the weight of the link.
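A simplified sketch of such a weighting heuristic is given below. The individual heuristic weights (for the title, headings, metadata, emphasised text and outgoing links) are illustrative assumptions; the paper does not specify the exact values, and the 'page' fields are assumed to be already-parsed strings.

def link_weight(keyword, page, outgoing_urls):
    kw = keyword.lower()
    weight = page['body'].lower().count(kw)                 # raw keyword frequency in the content
    weight += 5 * page['title'].lower().count(kw)           # occurrences in the page title
    weight += 3 * page['headings'].lower().count(kw)        # occurrences in h1-h6 headings
    weight += 2 * page['metadata'].lower().count(kw)        # occurrences in meta description/keywords
    weight += 2 * page['emphasised'].lower().count(kw)      # occurrences in bold/italic text
    weight += sum(1 for url in outgoing_urls if kw in url.lower())  # outgoing links with the keyword
    return weight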
In the review based module, the user selects the links he wants to train the search
engine with and also rates those selected links. The review based module then
normalizes the two weights, one given by the user and the other calculated from the
keyword density. The weighting algorithm module finds a best fit line using the gradient descent technique and assigns weights to links which are relevant to the query.
In the stochastic gradient descent algorithm we are trying to find the minimum of some function f(x). Given some initial value x0 for x, we can change its value in many directions (proportional to the dimension of x: with only one dimension, we can make it higher or lower). To figure out the best direction in which to minimize f, we take its gradient ∇f (the derivative along every dimension of x). Intuitively, the gradient gives the slope of the curve at that x, and its direction points to an increase in the function. So we change x in the opposite direction to lower the function value [20].
x_{k+1} = x_k − λ ∇f(x_k) (1)
The λ >0 is a small number that forces the algorithm to make small jumps. That
keeps the algorithm stable and its optimal value depends on the function. Given stable
conditions (a certain choice of λ), it is guaranteed that f(xk+1) ≤ f(xk).
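A minimal numerical illustration of the update rule in eq. (1) is given below; the test function f(x) = (x − 3)^2, the step size λ = 0.1 and the iteration count are arbitrary choices made only for the illustration.

def gradient_descent(grad_f, x0, lam=0.1, iters=100):
    x = x0
    for _ in range(iters):
        x = x - lam * grad_f(x)     # x_{k+1} = x_k - lambda * grad f(x_k), eq. (1)
    return x

# minimise f(x) = (x - 3)^2, whose gradient is 2*(x - 3); the iterates approach x = 3
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)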
The algorithm shown in Fig. 2 is used to train the system to rank the relevant documents. In this method, the training set is given as input and a best fit line is found passing through the maximum number of the training set points by minimizing the distance between the ordinate of each training set point and the line. After the equation of the line has been obtained, the next time a ranking of the data set is required, the test value is given as input and the system returns the rank of the test set. The method to calculate the weight of a link is explained above. This weight is one co-ordinate of a point on the line: the weight is substituted into the equation of the line, the other co-ordinate is calculated, and that value is the rank of the document, which is returned. The link with the highest weight is labeled as rank 1, and so on.
The equation of the line to be plotted is given by formula (2), where m is the slope
of the line and c is a constant. Both the values are calculated by the gradient descent
technique using the inputs of the training set.
Rank = m * weight + c (2)
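A possible sketch of fitting the slope m and constant c of formula (2) by gradient descent on the mean squared error between predicted and target ranks is shown below; the learning rate, iteration count and squared-error loss are assumptions, since the paper does not state them here.

import numpy as np

def fit_rank_line(weights, ranks, lr=1e-4, iters=5000):
    w = np.asarray(weights, dtype=float)
    r = np.asarray(ranks, dtype=float)
    m, c, n = 0.0, 0.0, len(w)
    for _ in range(iters):
        err = (m * w + c) - r                  # prediction error for Rank = m * weight + c
        m -= lr * (2.0 / n) * np.dot(err, w)   # gradient of the MSE with respect to m
        c -= lr * (2.0 / n) * err.sum()        # gradient of the MSE with respect to c
    return m, c

def rank_of(link_weight, m, c):
    return m * link_weight + c                 # rank predicted for a new link from its weight

Once m and c have been computed for a query, rank_of gives the rank of a newly weighted link by substituting its weight into equation (2).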
Once the line and its slope are computed, the Graph Visualization module takes as input the values of 'm' and 'c' calculated by the gradient module and draws the graph. The rank of the link is plotted on the Y-axis and the weight of the link is plotted on the X-axis. The line is plotted using equation (2). The graph module uses the HTML canvas and JavaScript to draw the graph. Graph visualization helps in understanding the training process better and in comparing the two training techniques.
4 Experimental Results
The training of the selection based and the review based systems uses the same set of links, as selected by the user. Our database consists of a few thousand seed URLs, from which more links are discovered by crawling these web
pages. Since the crawler uses the BFS technique, it fetches the links of relevant web pages up to a particular average depth, providing on average 2^(h+1) links, where 'h' depends on the total number of links extracted from each seed URL; to fetch around 20-25 links per keyword, 'h' varies from 3 to 4. A thousand of these relevant URLs were chosen to be appended to the training dataset.
A database query is used to fetch keyword-related links from the seed URLs. The system extracts links from the table named "training_data" in the database, which gets generated based on the user's choice during training. Each of these links is then crawled by the various systems connected via a wireless network. The keyword is then searched for in the content of the page given by the link. Processing is done based on the places where it is present and on other parameters explained later. The result is updated back into the table to be looked up later during further processing.
Fig. 5. User rating and weight of Links given by Review based technique
In the selection based training, the ranking of the links was derived by taking only the weights (which include the frequency of the keywords in the pages and the tags in the HTML source code of the page) of the links as found by the weighing algorithm. In the review based training, the ranking of the links was derived from the weights of
the links as in the selection based technique and the weights assigned to the links by the user (user rating). The user rating is collected through the feedback form provided to the user.
Finally, the values from these criteria are normalized to get the final weight for each link in the training set, which is further used to plot the graph using the gradient descent algorithm. It plots all the points (weights) and finds the best fit line for both algorithms. The characteristics of the lines (slope and constant) so formed are used after training is done.
Fig. 6 shows the best-fit line made by the selection based technique. The line corresponds to the data of ranks and weights in Fig. 3. The line covers a short range of ranks. Fig. 7 shows the best fit line made by the review based technique. The line corresponds to the data of ranks and weights in Fig. 5. The line covers a wide range of ranks. This makes the review based technique of training more useful than the selection based technique when training to rank a wide range of sets.
Fig. 6. Best-fit line from the selection based technique
Fig. 7. Best-fit line from the review based technique
5 Case Study
As a part of this case study, we compared the relative ranking of the links given by
selection based technique and review based technique with the search results of
Google. Google gives its search results based on many parameters and hence we have
not compared the absolute ranks of the links. By comparing the relative ranks of the
links, we are comparing the rank of the link given by our system to its rank given by
Google search results relative to the ranks of the other links.
The links and ranks given in Fig. 3 and Fig. 4 are compared with the links and their relative ranks in Fig. 5. The relative ranking of the review based technique matches the relative ranking of the Google search, as seen in Fig. 8.
In this paper, we present a learning based search engine prototype that uses machine
learning techniques to infer the effective ranking function. The experimental results
show that the review based technique of training performs better and is more accurate
than the selection based learning technique. The review based technique is more
accurate because the two weights - the weight assigned by the user and the weight of
the keyword density are considered and normalized.
We also tried to match the ranking given by the selection based and review based learning systems with the Google ranking. Google has many more results to offer based on different parameters, but we tried to compare the ranking of the links given by our system to the ranking of the same links on Google. The ranking of the review based learning technique matched the ranking of Google. Supervised training is used in the system, and this could be turned into semi-supervised or unsupervised training. More parameters for computing the weight of a document can be used, which would give a better result.
As future work, we are focusing on using these trained systems to determine the
rank of a link. The values of slope ‘m’ and the constant ‘c’ mentioned in equation (2)
will be stored in the database for each query. When the rank of the new link is to be
determined, the weight of the link will be calculated by the weighing algorithm. The
rank of the link will be calculated by substituting the weight of the link in equation (2)
and retrieving the values of ‘m’ and ‘c’ from the database.
References
1. Croft, W.B., Metzler, D., Strohman, T.: Search Engines - Information Retrieval in Practice.
Pearson Education (2009)
2. Radlinski, F., Joachims, T.: Query Chains: Learning to Rank from Implicit Feedback. In:
Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge
Discovery in Data Mining, pp. 239–248 (2005)
3. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison Wesley (1999)
4. Kemp, C., Ramamohanarao, K.: Long-term learning for web search engines. In: Elomaa,
T., Mannila, H., Toivonen, H. (eds.) PKDD 2002. LNCS (LNAI), vol. 2431, pp. 263–274.
Springer, Heidelberg (2002)
5. Caruana, R., Baluja, S., Mitchell, T.: Using the future to “sort out” the present: Rankprop
and multitask learning for medical risk evaluation. In: Advances in Neural Information
Processing System, pp. 959–965 (1996)
6. Gradient Descent Methods,
http://webdocs.cs.ualberta.ca/~sutton/book/ebook/node87.html
7. Joachims, T., Granka, L., Pan, B., Hembrooke, H., Gay, G.: Accurately interpreting clickthrough data as implicit feedback. In: Annual ACM Conference on Research and Development in Information Retrieval, pp. 154–161 (2005)
8. Tan, Q., Chai, X., Ng, W., Lee, D.-L.: Applying co-training to clickthrough data for search
engine adaptation. In: Lee, Y., Li, J., Whang, K.-Y., Lee, D. (eds.) DASFAA 2004. LNCS,
vol. 2973, pp. 519–532. Springer, Heidelberg (2004)
9. Carterette, B., Jones, R.: Evaluating search engines by modeling the relationship between relevance and clicks. In: Advances in Neural Information Processing Systems, vol. 20, pp. 217–224 (2008)
10. Craswell, N., Zoeter, O., Taylor, M., Ramsey, B.: An experimental comparison of click position-bias models. In: Proceedings of the International Conference on Web Search and Web Data Mining, pp. 87–94 (2008)
11. Dupret, G., Piwowarski, B.: User browsing model to predict search engine click data from
past observations. In: Proceedings of the 31st Annual International Conference on
Research and Development in Information Retrieval (2008)
12. Richardson, M., Dominowska, E., Ragno, R.: Predicting clicks: estimating the click-
through rate for new ads. In: Proceedings of the 16th International Conference on World
Wide Web, pp. 521–530 (2007)
13. Zhou, D., Bolelli, L., Li, J., Giles, C.L., Zha, H.: Learning user clicks in web search. In:
International Joint Conference on Artificial Intelligence (2007)
14. Ponte, J.M., Croft, W.B.: A language modeling approach to information retrieval. In:
Proceedings of the 21st Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval, pp. 275–281 (1998)
Vegetable Grading Using Tactile Sensing
and Machine Learning
1 Introduction
The surge in online marketing, coupled with state of the art of food storage facilities
has sparked the online trade of fruits and vegetables in India (“Now, fruits and
vegetables are also just a click away”, The Hindu, COIMBATORE, May 12, 2012)
and elsewhere very recently. Reliable and automatic grading and sorting of objects is
an essential requirement for the realtime deployment of such economic systems. The
current trend in fruit and vegetable grading systems focuses mostly on imaging
systems rather than a fusion of tactile and visual inspection as done by human beings
[1], [2]. With the advent of low cost tactile pressure/force arrays and single point
sensors, the pattern of human palpation and imaging for sorting can be performed.
The aim of this work is to analyze the role of touch sensation for the above mentioned
goal, thereby realizing a mechanized palpation system using machine learning
techniques. In [3] we studied the response of deterministic and probabilistic learning
methodologies for robot assisted fruit grading using Decision Tree and Naïve Bayes
Classifiers. In this paper advanced learning techniques such as Support Vector
Machine (SVM) and K-Nearest Neighbor (KNN) methods have been used for the same, along with a comparative performance analysis. The rest of the paper is organized as follows. Section II presents the background. Section III elaborates the proposed methodology of vegetable grading. The extracted parameters, the applied machine learning methodologies and the experimental setup are detailed in sections IV, V and VI, while the results of the classification are discussed in section VII, followed by the conclusion and acknowledgement in sections VIII and IX respectively.
2 Background
Researchers and scientists have been inventing different procedures for automatic grading with increasing efficiency and decreasing computational cost [4], [5]. Most of the investigations in this field are done by image processing. In [6] the authors described a method to investigate images of different stages of apple blemishes based on neural network methods; neural networks also acted as the classification method for segmenting oranges [7], where more relevant features like the Red/Green color component ratio, Feret's diameter ratio and textural components were used. Spectral imaging for fruit grading is another effective method [8], [9]. From these spectral images fruits are classified using soft computing techniques [8], whereas Aleixos et al. [9] made it a realtime approach. Blasco et al. also elaborated a method of automatic fruit and vegetable grading through image based features associated with a machine learning approach, the Bayesian discriminant method, whose accuracy is at par with that of human beings [10]. Other new features such as Gabor features using kernel PCA, color quantization, texture, shape and size of objects, extracted through image processing for sorting of fruits and vegetables, are described in [11-13].
With the advent of tactile sensing technology, it is possible to observe the local or internal quality of an object [14-18], because vision gives only global information about an object. Vision and tactile information were merged by Edan et al. to detect the freshness of fruits [15], whereas Steinmetz et al. [14] proposed a sensor fusion method to detect the firmness of peaches. Kleynen et al. [16] demonstrated a compression test of apples to determine their quality. In recent years India has also taken a step forward in the automatic fruit and vegetable grading industry, as seen from the works of Arivazhagan et al. [17] and Suresha et al. [18].
3 Proposed Methodology
The proposed methodology for the automated grading of fruits and vegetables consists of two parts: a hardware unit for robotic palpation and force data acquisition, and a software unit for data analysis and classification.
A 1 Degree of Freedom (DOF) robotic gripper equipped with force sensors is used to palpate the objects in a predefined sequence, during which force data is acquired
4 Parameter Selection
Machine learning is a way of mechanizing the day-to-day experience of human beings with different objects and their way of recognizing them. Researchers have used different machine learning approaches to model the human sensation applied in discriminating ripe or rotten vegetables/fruits from the rest. In this experiment two machine learning methods, SVM and KNN, are used for vegetable classification. They are briefly discussed below.
7   Quartile3   Q3 = median(f_i(Q2 : end))
comes, it examines the relationship of the test data to the training data sets using
distance functions. This distance function (Euclidean distance function is selected for
this work, given in Eqn.2) gives the idea of the neighbors of the test data. K denotes
the number of neighbors to be taken into consideration. According to the selected
neighbors the algorithm assigns the class of those neighbors to the test data.
d(x_i, x_j) = √( Σ_{l=1}^{n} (a_l(x_i) − a_l(x_j))² ) (2)

where a_l(x) denotes the value of the lth attribute of instance x, and d(x_i, x_j) denotes the Euclidean distance between the test and training data [20].
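A minimal KNN classifier implementing eq. (2) directly is sketched below; in the paper the classification itself was carried out with standard tools (WEKA/MATLAB), so this listing is purely illustrative, and the choice K = 3 is an assumption.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_test, k=3):
    # Euclidean distance of the test instance to every training instance, eq. (2)
    d = np.sqrt(((X_train - x_test) ** 2).sum(axis=1))
    nearest = np.argsort(d)[:k]                       # indices of the k closest neighbours
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]   # majority class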
6 Experimental Setup
According to the percentage split of the dataset, taken for classification of the two vegetables into Unripe/Green (denoting the harder objects), Moderately Ripe/Moderate (denoting objects of medium softness) and Ripe (denoting the softer objects), 30 and 20 samples were taken for training and 14 and 10 samples for testing, respectively, for tomatoes and ladies fingers. Among the datasets, three types of each of the vegetables were chosen to plot a graph of the displacement of the gripper versus the force applied to the object, as delineated in Fig. 3.
Fig. 3. Plot of displacement of the gripper versus applied force for tomatoes and ladies fingers
Training means fitting a model or a hypothesis to a dataset, whereas testing refers to the application of the model or hypothesis to another set of data. Further, cross validation refers to validating the generated model by randomly partitioning the whole dataset.
Based on the training model and the test datasets, the way of classification has been discussed in the previous section. The comparison of the machine learning techniques based on the performance parameters is tabulated in Table 2.
In Fig. 3, the green and red lines in the graph correspond to ladies-fingers and tomatoes respectively. It is evident from the graph (Fig. 3) itself that these two vegetables can be classified from their force distribution plots, because the tomatoes are softer and require a smaller force range than the ladies fingers. For the individual case study, it can be observed that the Unripe samples of the vegetables require the largest force, followed by the Moderate instances (requiring medium force) and the Ripe samples (requiring the least force).
The effectiveness of SVM and KNN for vegetable grading based on softness is analyzed below from the results rendered in Table 2. The accuracy of SVM during testing of the vegetables is 64.23% and 60% respectively for tomato and ladies-finger, whereas that of KNN is 92.86% and 80%. Also, the accuracy during cross
validation is about 86% for SVM and 86.64% and 83.33% for tomato and ladies-finger respectively. Besides, the relative absolute error is 60%-75% in the case of SVM and 19%-43% for KNN, much less than that obtained with SVM. Other factors like TPR, FPR and Precision also support the results obtained for accuracy and relative absolute error. Thus, for this scope, KNN shows better performance than SVM used with a polynomial kernel.
8 Conclusion
The paper presents a mechanized way of sorting two vegetables, tomatoes and ladies-fingers, of different softness based on two machine learning classification approaches, SVM and KNN, and a study of their performance. Artificial grasping of the objects is realized through a robotic gripper controlled by a PIC32 microcontroller, and the acquired force data is processed to generate 8 statistical parameters, which are used to build a model of each classifier from the training set; the models are additionally tested and cross-validated on a separate test set and the whole dataset respectively.
The vegetables are classified into three categories, namely Unripe, Moderately Ripe and Ripe, according to their maturity stages using both SVM and KNN. The performance of SVM and KNN was considered with a constrained number of datasets, as the widespread deployment of such a system will be possible only if the system performs well with limited training data, provided the data is uniform. The outcome of this classification, in the form of 5 performance factors, leads to the conclusion that classification through the instance based approach, KNN, is better than the model based approach, SVM, mainly in terms of Accuracy, Relative Absolute Error and other factors. The polynomial SVM used for classification did not perform well; KNN outperforms the polynomial SVM in this experiment of vegetable grading.
The paper explored a setup for mechanized vegetable grading using a robotic gripper and a PIC32 microcontroller as hardware for data generation and acquisition, and WEKA and MATLAB as software support for analyzing the data. The acquired data is assumed to follow a Gaussian distribution for extracting the parameters used for classification through two largely uncorrelated methods, SVM (a function or model based approach) and KNN (a lazy or instance based approach). Future work will involve an online fruit and vegetable grading technique using tactile sensing technology assisted
with vision. Thus the whole system will become more feasible and reliable in overcoming real-world problems.
References
1. Garcia-Ramos, F.J., Valero, C., Homer, I.: Margarita: Non-destructive fruit firmness
sensors: A review. Spanish Journal of Agricultural Research 3, 61–73 (2005)
2. Khalifa, S., Komarizadeh, M.H., Tousi, B.: Usage of fruit response to both force and
forced vibration applied to assess fruit firmness: A review. Australian Journal of Crop
Science 5 (2011)
3. Bandyopadhyaya, I., Babu, D., Kumar, A., Roychowdhury, J.: Tactile sensing based
softness classification using machine learning. In: 4th IEEE International Advance
Computing Conference (2014)
4. Mahendran, R., Jayashree, G.C., Alagusundaram, K.: Application of Computer Vision
Technique on Sorting and Grading of Fruits and Vegetables. Journal of Food Processing &
Technology (2012)
5. Wills, R.B.H.: An introduction to the physiology and handling of fruit and vegetables. Van
Nostrand Reinhold (1989)
6. Yang, Q.: Classification of apple surface features using machine vision and neural
networks. Computers and Electronics in Agriculture 9, 1–12 (1993)
7. Kondo, N., Ahmad, U., Monta, M., Murase, H.: Machine vision based quality evaluation
of Iyokan orange fruit using neural networks. Computers and Electronics in
Agriculture 29, 135–147 (2000)
8. Guyer, D., Yang, X.: Use of genetic artificial neural networks and spectral imaging for
defect detection on cherries. Computers and Electronics in Agriculture 29, 179–194 (2000)
9. Aleixos, N., Blasco, J., Navarron, F., Molto, E.: Multispectral inspection of citrus in real-
time using machine vision and digital signal processors. Computers and Electronics in
Agriculture 33, 121–137 (2002)
10. Blasco, J., Aleixos, N., Molto, E.: Machine vision system for automatic quality grading of
fruit. Biosystems Engineering 85, 415–423 (2003)
11. Zhu, B., Jiang, L., Luo, Y., Tao, Y.: Gabor feature-based apple quality inspection using
kernel principal component analysis. Journal of Food Engineering 81, 741–749 (2007)
12. Lee, D.J., Chang, Y., Archibald, J.K., Greco, C.G.: Color quantization and image analysis
for automated fruit quality evaluation. In: IEEE International Conference on Automation
Science and Engineering, pp. 194–199 (2008)
13. Mustafa, N., Fuad, N., Ahmed, S., Abidin, A., Ali, Z., Yit, W., Sharrif, Z.: Image
processing of an agriculture produce: Determination of size and ripeness of a banana. In:
International Symposium on Information Technology (2008)
14. Steinmetz, V., Crochon, M., Maurel, V., Fernandez, J., Elorza, P.: Sensors for fruit
firmness assessment: comparison and fusion. Journal of Agricultural Engineering
Research 64, 15–27 (1996)
15. Edan, Y., Pasternak, H., Shmulevich, I., Rachmani, D., Guedalia, D., Grinberg, S., Fallik,
E.: Color and firmness classification of fresh market tomatoes. Journal of Food Science 62,
793–796 (1997)
16. Kleynen, O., Cierva, S., Destain, M.: Evolution of Pressure Distribution during Apple
Compression Tests Measured with Tactile Sensors. In: International Conference on
Quality in Chains. An Integrated View on Fruit and Vegetable Quality, pp. 591–596
(2003)
17. Arivazhagan, S., Shebiah, R., Nidhyanandhan, S., Ganesan, L.: Fruit recognition using
color and texture features. Journal of Emerging Trends in Computing and Information
Sciences, 90–94 (2010)
18. Ravikumar, M., Suresha, M.: Dimensionality Reduction and Classification of Color
Features data using SVM and kNN. International Journal of Image Processing (2013)
19. Mitchell, T.: Machine learning. McGraw Hill, Burr Ridge (1997)
20. Alpaydin, E.: Introduction to Machine Learning. MIT Press (2004)
Neural Networks with Online Sequential Learning
Ability for a Reinforcement Learning Algorithm
1 Introduction
Reinforcement learning (RL) paradigm is a computationally simple and direct
approach to the adaptive optimal control of nonlinear systems [1]. In RL, the learning
agent (controller) interacts with an initially unknown environment (system) by
measuring states and applying actions according to its policy to maximize its
cumulative rewards. Thus, RL provides a general methodology to solve complex
uncertain sequential decision problems, which are very challenging in many real-
world applications.
The environment of RL is typically formulated as a Markov Decision Process (MDP), consisting of a set of all states S, a set of all possible actions A, a state transition probability distribution P : S × A × S → [0,1], and a reward function R : S × A → ℝ. When all components of the MDP are known, an optimal policy can be determined, e.g., using dynamic programming.
There has been a great deal of progress in the machine learning community on
value-function based reinforcement learning methods [2]. In value-function based
reinforcement learning, rather than learning a direct mapping from states to actions,
the agent learns an intermediate data-structure known as a value function that maps
states (or state-action pairs) to the expected long-term reward. Value-function based
learning methods are appealing because the value function has well-defined semantics
that enable a straight forward representation of the optimal policy, and theoretical
results guaranteeing the convergence of certain methods [3], [4].
Q-learning is a common model-free value function strategy for RL [2]. Q-learning
system maps every state-action pair to a real number, the Q-value, which tells how
optimal that action is in that state. For small domains, this mapping can be
represented explicitly by a table of Q-values. For large domains, this approach is
simply infeasible. If one deals with large discrete or continuous state and action spaces, it is inevitable to resort to function approximation, for two reasons: first, to overcome the storage problem (the curse of dimensionality); second, to achieve data efficiency (i.e., requiring only a few observations to derive a near-optimal policy) by generalizing to unobserved state-action pairs. There is a large literature on RL
algorithms using various value-function estimation techniques. Neural network
function approximation (NN Q-learning) is one of the competent RL frameworks to
deal with continuous space problem. NN generalizes among states and actions, and
reduces the number of Q-values stored in lookup table to a set of NN weights. The
back propagation NN with sigmoidal activation function can be used to learn the
value function [5]. However, neural network function approximators suffer from a
number of problems like learning becomes difficult when the training data are given
sequentially, difficult to determine structural parameters, and usually result in local
minima or over fitting. Consequently, NN function approximation can fail at finding
the correct mapping from input state-action pairs to output Q-values.
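For reference, the lookup-table form of Q-learning that such a function approximator replaces can be written in a few lines; the learning rate and discount factor below are illustrative values, and this standard update is not the controller developed later in the paper.

from collections import defaultdict

Q = defaultdict(float)        # lookup table: (state, action) -> Q-value, zero-initialised

def q_update(s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    # standard one-step Q-learning: Q(s,a) <- Q(s,a) + alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])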
In NN, the methods used to update the network parameters can broadly be divided
into batch learning and sequential learning. In batch learning, it is assumed that the
complete training data are available before training commences. The training usually
involves cycling the data over a number of epochs. In sequential learning, the data
arrive one by one or chunk by chunk, and the data will be discarded after the learning
of the data is completed. In practical applications, new training data may arrive
sequentially. In order to handle this using batch learning algorithms, one has to retrain
the network all over again, resulting in a large training time.
Neural networks with on-line sequential learning ability employ a procedure that
involves growing and/or pruning networks iteratively as the training data are
presented. Learning is achieved through a combination of new neuron allocation and
parameter adjustment of existing neurons. New neurons are added if presented
training patterns fall outside the range of existing network neurons. Otherwise, the
network parameters are adapted to better fit the patterns.
A significant contribution was made by Platt [6] through the development of an
algorithm called resource-allocating network (RAN) that adds hidden units to the
network based on the novelty of the new data in the process of sequential learning. An
improved approach called RAN via extended Kalman filter (RANEKF) [7] was
provided to enhance the performance of RAN by adopting an extended Kalman filter
(EKF) instead of the least means square (LMS) method for adjusting network
parameters. They all start with no hidden neurons and allocate the new hidden units
when some criteria are satisfied. However, once the hidden neuron is generated, it will
never be removed. The minimal resource allocating network (mRAN) [8, 9] is an
improved version of RAN, as growing and pruning neurons are achieved with a
certain criteria based on sliding windows. All the weights and the center of each
hidden neuron are tuned until certain error condition is satisfied. So, a compact
network can be implemented. Other improvements of RAN developed in [10] and
[11], take into consideration the pseudo-Gaussian (PG) function and orthogonal techniques including QR factorization and singular value decomposition (SVD), and have been applied to the problem of time series analysis. In [12, 13], a growing and pruning RBF (GAP-RBF) neural network and a generalized growing and pruning RBF (GGAP-RBF) neural network approach have been presented. The idea is to adjust only the weights of the neuron that is nearest to the most recently received input, instead of the weights of all the neurons. The significance of a neuron is measured
by the average information contained in that neuron. It requires estimating the
distributions and the range of input samples, as well as choosing some of the
parameters appropriately before training. An online sequential extreme learning
machine (OS-ELM) [14] has been proposed for learning data one-by-one and/or
chunk-by-chunk with fixed or varying chunk size. However, the parameters of hidden
nodes are randomly selected and only the output weights are analytically updated
based on the sequentially arriving data. It should be noted that the structure in OS-
ELM is fixed in advance by user.
Vamplew and Ollington in [15] compare the global (static structure such as multi-
layer perceptron) versus local constructive (dynamic structure such as resource
allocation network) function approximation for on-line reinforcement learning. It has
been shown that the globally-constructive algorithms are less stable, but that on some
tasks they can achieve similar performance to a locally-constructive algorithm, whilst
producing far more compact solutions. In contrast, the RAN performed well on three
benchmark problems⎯Acrobot, Mountain-Car, and Puddleworld, for both the on-line
and off-line measures. However, this performance was only achieved by choosing
parameters that allowed the RAN to create a very large number of hidden neurons.
Shiraga et al. [16] proposed a neural network model with incremental learning ability for the reinforcement learning task, based on the resource allocating network with long-term memory (RAN-LTM). From the simulation results, they showed that their proposed model could acquire more accurate action-values as compared with the following three approaches to the approximation of action-value functions: tile coding, a conventional NN model, and a version of RAN-LTM [17].
In this paper, a novel online sequential learning neural network model design for RL is proposed. We explore the use of constructive approximators (such as mRAN, an improved version of RAN) which build their structure on-line during learning/training, starting from a minimal architecture. We develop an mRAN function approximation approach to the RL system and demonstrate its potential through a case study, a two-link robot manipulator. The mean square error accuracy, computational cost and robustness properties of this scheme are compared with those of a scheme based on global function approximation with a static structure (such as a NN).
The remaining part of the paper is organized as follows: Section 2 presents the architecture of mRAN and value function approximation for RL systems. Section 3 gives details of the on-line sequential learning mRAN algorithm. Section 4 compares the empirical performance on the basis of simulation results. Finally, the conclusions are presented.
Fig. 1. (a) Block diagram of the proposed RL control scheme (state s_t, reward r_t, mRAN value estimate γV(s_{t+1}), ε-greedy action selector, PD gain K_v, error between s_t and the desired state, and actions a_pd, a_kc, a_c applied to the unknown system); (b) architecture of the mRAN network: inputs x_1, x_2, …, x_n, hidden units with centres c_k and widths σ_k, output weights α_1, …, α_k, bias α_0, and output ŷ
The state-action pair (s_t, a_t), where s_t = {s_1^t, s_2^t, …, s_n^t} ∈ S is the current system state and a_t is each possible discrete control action in the action set A = {a_i}; i = 1, …, m, is the input of mRAN. The output of the network (Fig. 1(b)), the estimated Q-value corresponding to (s_t, a_t) with K hidden neurons, has the following form:

Q(s_t, a_t) = ŷ^t = f(x^t) = α_0 + Σ_{k=1}^{K} α_k φ_k(x^t) (1)
where s_t → s_{t+1} (with reward r_t) is the state transition under the control action a_t ∈ A at instant t. Specifically, control actions are selected using an exploration/exploitation policy in order to explore the set of possible actions and acquire experience through the online RL signals [2]. We use a pseudo-stochastic (ε-greedy) exploration as in [18]. In ε-greedy exploration, we gradually reduce the exploration (determined by the ε parameter) according to some schedule; we have reduced ε to 90 percent of its value after every 50 iterations. The lower limit of the parameter ε has been fixed at 0.002 (to maintain exploration). The learning rate parameter η ∈ (0,1] can be used to optimize
the speed of learning, and γ ∈ (0,1] is the discount factor which is used to control
the trade-off between immediate and future reward.
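The ε-greedy selection and the decay schedule described above (ε reduced to 90 percent of its value every 50 iterations, with a floor of 0.002) can be sketched as follows; the Q-value interface is an assumption.

import random

def epsilon_greedy(q_values, epsilon):
    # explore with probability epsilon, otherwise pick the greedy action
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def decay_epsilon(epsilon, iteration, every=50, factor=0.9, floor=0.002):
    # reduce epsilon to 90% of its value every 'every' iterations, never below the floor
    if iteration > 0 and iteration % every == 0:
        epsilon = max(floor, factor * epsilon)
    return epsilon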
‖X^t − c*‖ > ε_t (6)

e_rms^t = √( Σ_{i=n−(M−1)}^{n} (e^i)² / M ) > e′_min (7)
where c* is the center of the hidden neuron which is closest to the current input X^t; e_min, ε_t, and e′_min are thresholds to be selected appropriately. e_min is an instantaneous error check and is used to determine if the existing nodes are insufficient to obtain a network output. ε_t ensures that the new input is sufficiently far from the existing centers and is given by ε_t = max[ε_max λ^t, ε_min], where λ is a decay constant (0 < λ < 1). e′_min is used to check the root mean square error value (e_rms^t) of the output error over a sliding window of size M before adding a hidden neuron.
If the three error criteria in the growing step are satisfied, a new hidden neuron is added to the network. The parameters associated with the new hidden neuron are assigned as follows:

α_{K+1} = e^t
c_{K+1} = X^t (8)
σ_{K+1} = κ ‖X^t − c*‖

where κ is an overlap factor (kappa), which determines the overlap of the responses of the hidden neurons with the hidden units in the input space.
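A sketch of the growth step is given below, with the network held in a simple dictionary of NumPy arrays and assuming it already contains at least one hidden unit; the instantaneous-error criterion is written as |e^t| > e_min since the corresponding equation is not reproduced above, and the data structure itself is an assumption.

import numpy as np

def maybe_grow(x, e, err_window, eps_t, e_min, e_rms_min, kappa, net):
    c_nearest = net['centers'][np.argmin(np.linalg.norm(net['centers'] - x, axis=1))]
    novelty   = np.linalg.norm(x - c_nearest) > eps_t                    # eq. (6)
    error_big = abs(e) > e_min                                           # instantaneous error check
    rms_big   = np.sqrt(np.mean(np.square(err_window))) > e_rms_min      # eq. (7), window of size M
    if novelty and error_big and rms_big:
        net['alphas']  = np.append(net['alphas'], e)                              # alpha_{K+1} = e^t
        net['centers'] = np.vstack([net['centers'], x])                           # c_{K+1} = X^t
        net['widths']  = np.append(net['widths'],
                                   kappa * np.linalg.norm(x - c_nearest))         # sigma_{K+1} = kappa*||X^t - c*||
        return True
    return False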
If the observation does not meet the criteria for adding a new hidden neuron, the extended Kalman filter (EKF) learning algorithm is used to adjust the network parameters w, given as:

w = [α_0, α_1, c_1^T, σ_1, …, α_K, c_K^T, σ_K]^T (9)

The update equations are given by

w^t = w^{t−1} + k^t e^t

where k^t is the Kalman gain vector, given by k^t = P^{t−1} B^t [R^t + (B^t)^T P^{t−1} B^t]^{−1}, and B^t is the gradient vector, which has the following form:
B^t = ∇_w ŷ^t = [I, φ_1(X^t) I, φ_1(X^t) (2α_1/σ_1²) (X^t − c_1)^T, φ_1(X^t) (2α_1/σ_1²) ‖X^t − c_1‖², …,
φ_K(X^t) I, φ_K(X^t) (2α_K/σ_K²) (X^t − c_K)^T, φ_K(X^t) (2α_K/σ_K²) ‖X^t − c_K‖²]^T (10)
R^t is the variance of the measurement noise, and P^t is the error covariance matrix, which is updated by P^t = [I − k^t (B^t)^T] P^{t−1} + Q I, where Q is a scalar that determines the allowed random step in the direction of the gradient vector.
In the pruning step, the output of each hidden neuron is computed as

O_k^t = α_k exp( −‖x^t − c_k‖² / σ_k² ) (13)
where O_k^t is the output of the kth hidden neuron at time instant t, and O_max^t is the largest absolute hidden neuron output value at t. The outputs normalized by O_max^t are then compared with a threshold δ. If any of them falls below this threshold for M consecutive observations, then this particular hidden neuron is removed from the network, and the dimensionality of the covariance matrix (P^t) in the EKF learning algorithm is adjusted by removing the rows and columns related to the pruned unit.
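The pruning check can be sketched similarly (the covariance-matrix adjustment mentioned above would use the same boolean mask and is omitted here); 'net' is the same dictionary of arrays assumed in the previous sketch, and 'below_count' tracks consecutive sub-threshold observations per hidden unit.

import numpy as np

def prune_step(x, net, below_count, delta, M):
    # hidden-unit outputs, eq. (13), and their normalisation by the largest absolute output
    o = net['alphas'] * np.exp(-np.sum((net['centers'] - x) ** 2, axis=1) / net['widths'] ** 2)
    r = np.abs(o) / np.max(np.abs(o))
    below_count = np.where(r < delta, below_count + 1, 0)   # consecutive observations below delta
    keep = below_count < M                                  # prune units below delta for M steps
    net['alphas'], net['widths'] = net['alphas'][keep], net['widths'][keep]
    net['centers'] = net['centers'][keep]
    return below_count[keep]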
4 Simulation Experiments
To demonstrate the usefulness of mRAN function approximator in reinforcement
learning framework, we conducted experiments using the well-known two-link robot
manipulator tracking control problem.
[α + β + 2η cosθ₂, β + η cosθ₂; β + η cosθ₂, β] [θ̈₁; θ̈₂] + [−η(2θ̇₁θ̇₂ + θ̇₂²) sinθ₂ + α e₁ cosθ₁ + η e₁ cos(θ₁+θ₂); η θ̇₁² sinθ₂ + η e₁ cos(θ₁+θ₂)] = [τ₁ + τ_{1dis}; τ₂ + τ_{2dis}]

where τ_dis = [τ_{1dis}  τ_{2dis}]^T is the disturbance torque vector, and α = (m₁ + m₂) l₁². The states of the system are (θ_i^k, θ̇_i^k); i = 1, 2. We define the tracking error vector as e^t = θ_r^t − θ_d^t and the corresponding cost function.
The mRAN Q-learning controller (mRANQC) uses two function approximators, one for each of the two links. The initial free parameters selected for constructing the mRAN network are as follows: the size of the sliding data window M = 25; the thresholds ε_max = 0.8, ε_min = 0.5, e_min = 0.02, e′_min = 0.002, δ = 0.05; the overlap factor κ = 0.87; the decay constant λ = 0.97; and the estimate of the uncertainty in the initial values assigned to the parameters, P_0 = 1.01.
Neural Network Q-learning Controller (NNQC): Structurally, NNQC remains the same as mRANQC; the major difference is that the NN configuration comprises one or two hidden layers containing a set of neurons with a tan-sigmoidal activation function.
The number of layers and neurons depends on the complexity and the dimensions of
the value-function to be approximated. We consider 18 hidden neurons. The
initialization of the NN weights is done randomly, and length of training samples (l)
for batch mode processing is chosen as 100.
In mRANQC (or NNQC) controller implementations, we have used controller
structure with an inner PD loop. Control action to the robot manipulator is a
combination of an action generated by an adaptive learning RL signal through mRAN
(or NN) and a fixed gain PD controller signal. The PD loop will maintain stability
until mRAN (or NN) controller learns, starting with zero initialized Q-values. The
controller, thus, requires no offline learning.
Fig. 2(a). Output tracking error (link 1): error (rad) versus time (sec) for NNQC and mRANQC
Fig. 2(b). Output tracking error (link 2): error (rad) versus time (sec) for NNQC and mRANQC
Learning performance study: The physical system has been simulated for a single run of 10 sec using the fourth-order Runge-Kutta method, with a fixed time step of 10 msec. Fig. 2 and Fig. 3 show the output tracking error (both links) and the control torque (both links) for the two controllers, NNQC and mRANQC, respectively.
Table I tabulates the mean square error, absolute maximum error (max |e(t)|), and
absolute maximum control effort (max | τ |) under nominal operating conditions.
Table 1. Performance under nominal operating conditions

Controller   MSE (rad)           max |e(t)| (rad)     max |τ| (Nm)          Training Time (sec)
             Link 1    Link 2    Link 1    Link 2     Link 1     Link 2
NNQC         0.0120    0.0065    0.1568    0.0779     83.5281    33.6758    376.78
Fig. 3(a). Control torque (link 1): torque (Nm) versus time (sec) for NNQC and mRANQC
Fig. 3(b). Control torque (link 2): torque (Nm) versus time (sec) for NNQC and mRANQC
From the results (Fig. 2 and Fig. 3, Table 1), we observe that the training time for
mRANQC is shorter than that for NNQC. mRANQC outperforms NNQC in terms of lower
tracking errors and lower absolute maximum error and control effort for both
links.
Robustness Study: In the following, we compare the performance of NNQC and mRANQC
under uncertainties. For this study, we trained the controllers for 20 episodes,
and then evaluated the performance for two cases.
Effect of Payload Variations: The end-effector mass is varied with time, which
corresponds to the robotic arm picking up and releasing payloads having different
masses. The mass is varied as:
(a) t < 2 s ; m2 = 1 kg
(b) 2 ≤ t < 3.5 s ; m2 = 2.5 kg
(c) 3.5 ≤ t < 4.5 s ; m2 = 1 kg
(d) 4.5 ≤ t < 6 s ; m2 = 4 kg
(e) 6 ≤ t < 7.5 s ; m2 = 1 kg
(f) 7.5 ≤ t < 9 s ; m2 = 2 kg
(g) 9 ≤ t < 10 s ; m2 = 1 kg .
Figs. 4(a) and (b) show the output tracking errors (both links), and Table 2
tabulates the mean square error, absolute maximum error (max |e(t)|), and absolute
maximum control effort (max |τ|) under payload variations with time.
Fig. 4(a). Output tracking error (link 1)   Fig. 4(b). Output tracking error (link 2)
[Plots of Error (rad) versus Time (sec) over 0–10 s, comparing NNQC and mRANQC]
Fig. 5(a). Output tracking error (link 1)   Fig. 5(b). Output tracking error (link 2)
[Plots of Error (rad) versus Time (sec) over 0–10 s, comparing NNQC and mRANQC]
Simulation results (Fig. 4 and Fig. 5, Table 2 and Table 3) show a better robustness
property for the mRAN-based controller in comparison with the NN-based controller.
6 Conclusions
In order to tackle the deficiency of global function approximators (such as NNs), the
minimal resource allocation network (mRAN) is introduced into the RL control system and
a novel online sequential learning algorithm based on mRAN is presented. mRAN is a
sequential learning RBF network with the ability to grow and prune hidden
neurons to ensure a parsimonious structure, which is well suited for real-time control
applications.
From the simulation results, it is evident that the training time of the mRAN-based RL
system is much shorter than that of the NN-based RL system. This is an important
feature for RL control systems from stability considerations, and it is achieved
without any loss of performance.
Scatter Matrix versus the Proposed Distance
Matrix on Linear Discriminant Analysis
for Image Pattern Recognition
1 Introduction
elements of the vectors belonging to the identical class in the lower dimensional
space. However, it minimizes the sum of the variances of the elements of the
centroid vectors of all the classes in the lower dimensional space. This helps in
bringing the vectors closer to each other if they belong to the identical class and
simultaneously separating the classes in the lower dimensional space.
The optimal solution (LDA bases) to the objective function is the set of eigenvectors
corresponding to the significant eigenvalues of the matrix $S_{BW}S_W^{-1}$ [11].
The columns of the matrix SW are in the column space spanned by the feature
vectors used to obtain the scatter matrix. The rank of the scatter matrix SW
should be at the most equal to the number of vectors used in the training set.
Hence, if the number of elements in the feature vector is larger than the number
of vectors used to compute the scatter matrix $S_W$, the matrix $S_W$ becomes
singular and hence $S_W^{-1}$ cannot be computed. This is known as the small
sample size problem. Many methods have been attempted to solve this
problem [3], [8], [9].
In this paper, we explore the performance of LDA using the proposed
distance matrix as a replacement for the scatter matrix in LDA. Inner-product
LDA (a special case of kernel LDA [11]) is used to reduce the computation time. The
small sample size problem is taken care of during realization by
using the method proposed in [9].
$$S_B = \sum_{i=1}^{r} (c_i - c)(c_i - c)^T \qquad (1)$$
It is noted that the trace of the $S_W$ matrix gives a measure of how close the
vectors within a class are to each other in the higher dimensional space; similarly,
the traces of the $S_{BW}$ and $S_B$ matrices measure the separation of the various
classes in the higher dimensional space. This is because the trace of the scatter
matrix $S_W$ used in classical LDA measures the summation of the squared distances
between the vectors and the centroid of the corresponding classes. In classical LDA,
this is the parameter to be minimized in the lower dimensional space. Similarly, the
traces of the scatter matrices $S_{BW}$ and $S_B$ measure the summation of the squared
distances between the centroid vectors and the mean of the centroid vectors.
Consider three vectors in a particular class as shown in Fig. 1(a). Let
the centroid of the class be represented as C in Fig. 1(a). Using the conventional
method (scatter matrix), the sum of the distances between the vectors and the
centroid of their class is used as the measure of closeness. Instead, the proposed
technique (distance matrix) computes the sum of the Euclidean distances between
vectors 1 and 2, 1 and 3, and 2 and 3.
Fig. 1. (a) Illustration of the proposed technique, (b) Sample images of the ORL face
database corresponding to the identical person with different poses
Let the summation of the squared distances of all the vectors from the centroid
(conventional technique) and the summation of the actual squared Euclidean distances
between the vectors within the class (proposed technique) be represented as D1 and D2,
respectively. By direct expansion of the equations, it can be easily shown that,
for an arbitrary single class with m vectors, D2 = mD1. Thus, by intuition, we
understand that for the case of multiple classes of different sizes (number of
vectors in the class), the Euclidean distance metric gives more weightage to the classes
with more vectors and comparatively less importance to the classes
with fewer vectors.
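The relation D2 = mD1 is easy to check numerically. The following NumPy sketch draws m random vectors for a single class and compares the two quantities (the class data here is synthetic, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 7, 5                                   # one class with m vectors of dimension d
X = rng.normal(size=(m, d))
c = X.mean(axis=0)                            # class centroid

D1 = np.sum((X - c) ** 2)                     # squared distances to the centroid
D2 = sum(np.sum((X[i] - X[j]) ** 2)           # squared distances between all vector pairs
         for i in range(m) for j in range(i + 1, m))

print(D2, m * D1)                             # the two values coincide: D2 = m * D1
```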
Hence, Euclidean distance based measurement can be used to measure the
closeness of the vectors within a class and can also be used to measure the
separation of classes effectively. To make use of this Euclidean distance based
measurement in LDA, we need a symmetric matrix (whose trace is the
Euclidean distance between the vectors) as the replacement of the traditional
scatter matrices used in LDA. By intuition, we formulated the distance
matrices whose traces measure the corresponding parameters. It can be
easily shown that the proposed distance matrix (symmetric matrix) satisfies the
properties that are exploited in LDA, as described below.
1. The trace of the arbitrary scatter matrix S is the sum of variances of the
individual elements of the training set vectors. Therefore, the trace of the
are the within-class distance matrix and between-class distance matrix respec-
tively.
$$J_D(W) = \frac{\operatorname{trace}(W^T D_B W)}{\operatorname{trace}(W^T D_W W)} \qquad (2)$$
Note that the sum of the diagonal elements of the matrix $(X_{ij} - X_{ik})(X_{ij} -
X_{ik})^T$ is the distance between the vectors $X_{ij}$ and $X_{ik}$. Note that the above
matrices are symmetric. It is also not difficult to show that the proposed
distance-within-class matrix and distance-between-class matrix satisfy the properties
listed in Section 2. Thus, the proposed distance matrix based LDA involves
estimating the optimal value of W such that $J_D(W)$ mentioned in (2) is
maximized.
4 Experiments
The experiments have been performed with the “ORL face database” [13] (refer Fig.
1(b)) to compare, in terms of prediction accuracy, the performance of the
proposed technique using the distance matrix with that of the scatter matrix. Inner-
product based LDA is used in both cases to reduce the computation time.
The small sample size problem is also taken care of using the technique proposed in
[9] in both cases. Thus, the overall steps to map an arbitrary feature vector
to the lower dimensional space are as described below (a sketch in code follows the steps).
Steps to map the arbitrary feature vector to the lower dimensional space:
1. Compute all the inner-product training feature vectors corresponding to all
the training feature vectors in the feature space. Compute SB and SW ma-
trices using the inner-product training sets.
2. Project all the inner-product training feature vectors to the null space of
SW as described as follows. Compute the Eigenvectors corresponding to the
significant Eigenvalues using Eigen decomposition. The Eigenvectors thus
obtained are arranged column-wise to obtain the transformation matrix T.
The arbitrary inner-product training feature vector v is projected to the null
space of SW using the projection matrix T T T as v − T T T v. This is repeated
for all the inner-product training feature vectors.
3. Compute the between-class scatter symmetric matrix SB (say) using the
projected vectors. Compute the transformation matrix that maximizes the
trace of the matrix U T SB U, i.e., the eigenvectors of the matrix SB are
arranged row-wise to form the transformation matrix U T. Compute the inner-
product vector corresponding to the arbitrary feature vector y in the higher
dimensional space and represent it as z.
4. Once the matrix U is obtained, transformation of the arbitrary vector y
in the higher dimensional space to the lower dimensional space is obtained
as U T z.
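A compact NumPy sketch of steps 1–4 is given below. It is our interpretation of the procedure, not the authors' code; the eigenvalue tolerance used to pick the significant eigenvectors is an assumption.

```python
import numpy as np

def innerproduct_lda_fit(X, y, tol=1e-8):
    """Sketch of steps 1-4: X holds the training feature vectors as rows, y the labels."""
    G = X @ X.T                                          # step 1: inner-product (Gram) vectors
    classes = np.unique(y)
    Sw = np.zeros((len(G), len(G)))
    for c in classes:                                    # within-class scatter of the Gram vectors
        Gc = G[y == c]
        Sw += (Gc - Gc.mean(0)).T @ (Gc - Gc.mean(0))
    w, V = np.linalg.eigh(Sw)                            # step 2: significant eigenvectors of S_W
    T = V[:, w > tol * max(w.max(), 1.0)]
    P = G - G @ T @ T.T                                  # project: v - T T^T v for each vector
    Sb = np.zeros_like(Sw)                               # step 3: between-class scatter after projection
    for c in classes:
        diff = (P[y == c].mean(0) - P.mean(0))[:, None]
        Sb += (y == c).sum() * (diff @ diff.T)
    wb, U = np.linalg.eigh(Sb)
    return T, U[:, np.argsort(wb)[::-1]]                 # step 4: columns of U, largest eigenvalue first

# mapping a new vector yv (step 4): z = X @ yv, lower-dimensional representation = U.T @ z
```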
In a similar fashion, the transformation of an arbitrary vector to the lower dimensional
space is obtained using the proposed distance matrix, simply by replacing
the scatter matrices SB and SW with the distance matrices DB and DW. Also note
that the scatter matrices and distance matrices mentioned in the above steps are
normalized such that the maximum value in each matrix is 1, to simplify the selection
of the significant eigenvectors. It is also noted that the nearest neighbour
(based on the Euclidean distance) classifier is used to compute the percentage
of success (POS) on the training set.
collect at least two vectors from each class to compute the matrix, so i varies
from 2 to 5. There are 40 classes in the database, so $\sum_{i=2}^{5} n_i = 40$. The vector
[n2 n3 n4 n5] of a particular trial is defined as the training vector distribution
for that trial. Experiments are performed for all the valid vector distributions (i.e.,
12431 trials). For an arbitrary trial, the classes from which the particular number
of training vectors are collected are chosen randomly. The vectors collected from a
particular class are also chosen randomly.
The absolute difference between the POS obtained using the distance matrix and
the scatter matrix for a particular trial is defined as the significant number.
The number of trials in which a greater POS is obtained using the distance matrix,
for various significant numbers, is listed in Table 1, along with the cases in which
the scatter matrix performs better (greater POS). For high significant numbers, the
number of trials is greater when the distance matrix is used than when the
scatter matrix is used (highlighted in bold in Table 1).
This suggests the importance of using the distance matrix in place of the scatter
matrix in LDA.
The trials in which the distance matrix performs better with significant
values greater than 7.5 are selected and the corresponding vector distributions
are collected. The number of trials in which the distance matrix performs better
with a significant number greater than 7.5 is 76, and for the scatter
matrix it is 65.
Thus, the total number of trials with significant values greater than 7.5 is
65 + 76 = 141. For every such collected vector distribution,
Fig. 2. (a) Illustration that the variation of POS (75 positives, 190 negatives, 17 zeros)
is less when the distance matrix is used compared with the scatter matrix. Experiments
are performed using the trials belonging to significant number n > 7.5. A larger number
of negative values with higher magnitudes indicates that the distance matrix performs
better. (b) Illustration (131 positive values, 288 negative values and 37 zero values)
for the case with significant number n > 5
the number of vectors from the individual classes is chosen randomly and kept as a
fixed distribution. The actual vectors collected from the individual classes satisfying
the fixed distribution are chosen randomly and are subjected to the LDA problem
using both the scatter and the distance matrices. This is repeated for 40 different
combinations of selection of actual vectors. The POS is noted for the 40 different
combinations using both the scatter matrix and the distance matrix. This is repeated
for all the collected vector distributions. The range of POS when the scatter matrix
is used is calculated as the difference between the maximum and the minimum POS
obtained for the particular vector distribution; let it be range(s). Similarly, the
range of POS when the distance matrix is used is calculated as range(d) for every
set of 40 combinations.
The difference between the ranges, i.e., range(d) − range(s), gives information
about the variation of POS when the scatter matrix is used and when the distance
matrix is used. If the value is positive, the variation is greater when the distance
matrix is used; if the value is negative, the variation is greater when the
scatter matrix is used. The complete experiment mentioned above is repeated
again as a second iteration. The difference between the range of POS obtained
using the distance matrix and the scatter matrix in both iterations
is plotted in Fig. 2(a) for 2 × 141 = 282 trials (2 indicates two iterations, 141
the number of trials in one iteration). From the graph (75 positives, 190
negatives, 17 zeros) we understand that the variation is mostly greater when the
scatter matrix is used and less when the distance matrix is used.
The experiment mentioned above is repeated for the trials corresponding to
significant values greater than 5 when the distance matrix performs better. The
result is plotted in Fig. 2(b). In this case, only one iteration is performed, as
the number of trials for one iteration is already large, i.e., 456 values. The graph
(131 positive values, 288 negative values and 37 zero values) justifies that the
variation of POS is less when the distance matrix is used.
To conclude, the variation of POS on the training set is less when the
distance matrix is used compared with the scatter matrix. This suggests,
intuitively, that the distance matrix is less sensitive to noise. Currently, we
are investigating the effect of noisy data on the distance matrix and the
importance of replacing the scatter matrix with the distance matrix in the LDA
problem under noisy environments. A study of the performance of the distance
matrix in all the variants of LDA is left as a natural extension of this work.
We also identified that the distance matrix and the scatter matrix end up as the
same matrix when the number of vectors in every class is identical.
Acknowledgment. The authors would like to thank the reviewers for their con-
structive comments. The first author would like to thank Prof. K.M.M. Prabhu,
Department of Electrical Engineering, Indian Institute of Technology Madras,
India for his support.
References
1. Jolliffe, I.T.: Principal Component Analysis. Springer-Verlag (1986)
2. Hyvarinen, A.: Survey on Independent Component Analysis. Neural Computing
Surveys 2, 94–128 (1999)
3. Park, C., Park, H.: A Comparison of generalized linear discriminant analysis algo-
rithms. Pattern Recognition 41, 1083–1097 (2008)
4. Nenadic, Z.: Information discriminant analysis:Feature Extraction with an
Information-Theoretic objective. IEEE Transactions on Pattern Analysis and Ma-
chine Intelligence 29(8), 1394–1407 (2007)
5. Fisher, R.A.: The use of Multiple Measures in Taxonomic Problems. Ann. Eugen-
ics 7, 179–188 (1936)
6. Kumar, N., Andreou, A.G.: Heteroscedastic Discriminant Analysis and Reduced
Rank HMMs for Improved Speech Recognition. Speech Comm. 26, 283–297 (1998)
7. Torkkola, K.: Discriminative Features for Document Classification. In: Proceedings
of 16th International Conference on Pattern Recognition, pp. 472–475 (2002)
8. Chen, L.-F., Liao, H.-Y.M., Ko, M.-T., Lin, J.-C., Yu, G.-J.: A New LDA-Based
Face Recognition System which Can Solve the Small Sample Size Problem. Pattern
Recognition 33(10), 1713–1726 (2000)
9. Cevikalp, H., Wilkes, M.: Discriminative Common vectors for Face Recognition.
IEEE Transactions in Pattern Analysis and Machine Intelligence 27(1), 4–13 (2005)
10. Martinez, A.M., Kak, A.C.: PCA versus LDA. IEEE Transactions in Pattern Anal-
ysis and Machine Intelligence 23(2), 228–233 (2001)
11. Bishop, C.M.: Pattern Recognition and Machine Learning. Springer (2006)
12. Strang, G.: Linear algebra and its applications, pp. 102–107. Thomson,
Brooks/Cole (2006)
13. Source: http://www.orl.co.uk/facedatabase.html
Gender Recognition Using Fusion of Spatial
and Temporal Features
Abstract. In this paper, a gender recognition scheme is proposed based
on the fusion of spatial and temporal features. As a first step, the face is
detected from the image using the Viola-Jones method, and then spatial and
temporal features are extracted from the detected face images. Spatial features
are obtained using Principal Component Analysis (PCA), while the Discrete Wavelet
Transform (DWT) is applied to extract temporal features. In this paper we
investigate the fusion of both spatial and temporal features for gender
classification. The feature vectors of test images are obtained and classified as
male or female with the Weka tool using the 10-fold cross validation technique. To
evaluate the proposed scheme, the FERET database has been used, providing
better accuracy than the individual features. Experimental results show a 9.77%
accuracy improvement with respect to the spatial-domain recognition system.
1 Introduction
obtained using the principal component analysis (PCA) algorithm, and then a subset of
the features is selected using a genetic algorithm. The performance of the method is
compared using several classifiers, namely Bayesian, Neural Network, Support Vector
Machine (SVM) and Linear Discriminant Analysis (LDA). Among the four classifiers,
SVM achieved the best result.
The wavelet transform [19] is an ideal tool to analyze images of different genders. It
discriminates among several spatial orientations and decomposes images into
different scale orientations, providing a space-scale representation.
The Continuous Wavelet Transform (CWT) and Support Vector Machine (SVM) are used
for classifying gender [20] from facial images and compared with DWT and
RADON features along with SVM.
However, the existing gender recognition methods do not consider the important
multi-view problem and focus on the frontal view only. There are a few works on
aligned faces [15-17]. Another important problem of gender detection is illumination
change. To overcome the limitations of the existing methods, fusion technology is
used to combine different types of features obtained from different face images. The
proposed scheme aims to develop a face-based gender recognition algorithm that
deals with these existing problems.
The rest of the paper is organized as follows. In Section 2, the proposed method is
described while feature extraction algorithms are illustrated in Section 3.
Experimental results are discussed in Section 4 and finally concluding remarks are
summarized in Section 5.
2 Proposed Method
The block diagram of the proposed method for gender recognition is shown in Fig. 1
consisting of five main modules: face detection, pre-processing, feature extraction,
fusion and classification.
2.2 Preprocessing
After face detection, the feature area is extracted by cropping the face image 15%
from the right and left and 20% from the top of the image to remove the ears and hair.
Fig. 2. Results of (a) face detection by the Viola-Jones technique, (b) preprocessing
In the next step, a Gaussian smoothing filter is applied and the image is resized. In
the last step of preprocessing, histogram equalization is performed to overcome the
illumination differences. Fig. 2(b) shows the preprocessed face image obtained from its
detected face image.
Both spatial and temporal features are obtained from the preprocessed face images for
gender recognition. The features are fused to form a compact feature representation of
the facial images.
$$\Psi = \frac{1}{M}\sum_{i=1}^{M}\Gamma_i \qquad (1)$$
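The sketch below illustrates the eigenface-style spatial feature extraction implied by Eq. (1). Since the intermediate steps are not reproduced here, the small-matrix eigen-decomposition trick and the default of 45 retained eigenvectors (the value found best in the experiments) are assumptions rather than the authors' exact procedure.

```python
import numpy as np

def pca_spatial_features(faces, num_eigvecs=45):
    """faces: array of M face images; returns the mean face, the eigenfaces and the projections."""
    Gamma = faces.reshape(len(faces), -1).astype(float)      # each face flattened into a row
    Psi = Gamma.mean(axis=0)                                 # Eq. (1): mean face
    Phi = Gamma - Psi                                        # mean-subtracted faces
    w, V = np.linalg.eigh(Phi @ Phi.T)                       # eigenvectors of the small M x M matrix
    order = np.argsort(w)[::-1][:num_eigvecs]
    eigenfaces = Phi.T @ V[:, order]
    eigenfaces /= np.linalg.norm(eigenfaces, axis=0)
    return Psi, eigenfaces, Phi @ eigenfaces                 # projections act as spatial features
```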
Step 1: Apply the 2D-DWT on each of the resized face images using the db1 mother wavelet.
Step 2: After the 1st level decomposition, extract only the approximation coefficients
(CA1) for each test image.
Step 3: Perform the operations from Step 2 to Step 7 of the spatial feature extraction
technique on the CA1 of the test images to extract the temporal features (a sketch of
the DWT step follows).
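A minimal sketch of Steps 1–2 using PyWavelets is shown below; the library choice is ours, since the paper only names the db1 mother wavelet.

```python
import pywt  # PyWavelets

def temporal_features(face_img):
    """Single-level 2-D DWT with the db1 wavelet; only the approximation band CA1 is kept."""
    cA1, (cH1, cV1, cD1) = pywt.dwt2(face_img, 'db1')
    return cA1.ravel()   # CA1 then goes through the same PCA steps as the spatial path (Step 3)
```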
Fig. 4. Information flow diagram for the fusion scheme employing PCA
4 Experimental Studies
The first step of any image classification technique is to representing the face images
in terms of input-output feature vectors. In the present work, Neural Network(NN),
Stochastic Gradient Descent (SGD), Naive Bayes, Simple Logistic and SVM methods are
used for the classification of images as male or female.
To demonstrate the effectiveness of the proposed algorithm, it is applied to the
FERET face image database.
The FERET database contains frontal, left or right profile images and can have
some variations in pose, expression and lighting. In the experiments, we have used
frontal, aligned, pose-variant and different-expression face images. Some of the test
images from the FERET database are shown in Fig. 5.
The classification accuracy using the fused feature vectors is obtained from 200 test
images. By applying the ten-fold cross validation technique and using the Neural Network,
SGD, Naive Bayes and SVM classifiers, the classification accuracy is computed and
presented in Table 1 for 45 eigenvectors (P). We investigated the accuracy of
gender classification by increasing the number of eigenvectors up to 60 and
observed that 45 eigenvectors give the best result.
5 Conclusions
In this paper, we have presented a fusion based gender detection scheme. The
classification accuracy shows the effectiveness of the proposed feature extraction
methods. Experimental results show that the fused features improve the classification
accuracy. This technique can also be used for the gender detection of aligned faces.
It has also been concluded that fusion using wavelets shows better performance.
References
1. Vending machines recommend based on face recognition. Biometric Technology Today
2011(1) (2011)
2. Ylioinas, J., Hadid, A., Pietikäinen, M.: Combining contrast information and local binary
patterns for gender classification. Image Analysis, 676–686 (2011)
3. Shan, C.: Learning local binary patterns for gender classification on real-world face
images. Pattern Recognition Letters 33(4), 431–437 (2012)
4. Wu, T.-X., Lian, X.-C., Lu, B.-L.: Multi-view gender classification using symmetry of
facial images. Neural Computing and Applications, 1–9 (2011)
5. Wang, J.G., Li, J., Lee, C.Y., Yau, W.Y.: Dense SIFT and Gabor descriptors-based face
representation with applications to gender recognition. In: 2010 11th International
Conference on Control Automation Robotics & Vision (ICARCV), pp. 1860–1864 (2010)
6. Lee, P.H., Hung, J.Y., Hung, Y.P.: Automatic Gender Recognition Using Fusion of Facial
Strips. In: 20th International Conference on Pattern Recognition (ICPR), pp. 1140–1143
(2010)
116 S. Biswas and J. Sil
7. Rai, P., Khanna, P.: Gender classification using Radon and Wavelet Transforms. In:
International Conference on Industrial and Information Systems, pp. 448–451 (2010)
8. Shiqi, Y., Tieniu, T., Kaiqi, H., Kui, J., Xinyu, W.: A Study on Gait- Based Gender
Classification. IEEE Transactions Image Processing 18, 1905–1910 (2009)
9. Hu, M., Wang, Y., Zhang, Z., Wang, Y.: Combining Spatial and Temporal Information for
Gait Based Gender Classification. In: 20th International Conference on Pattern
Recognition, pp. 3679–3682 (2010)
10. Chang, P.-C., Tien, M.-C., Wu, J.-L., Hu, C.-S.: Real-time Gender Classification from
Human Gait for Arbitrary View Angles. In: 2009 11th IEEE International Symposium on
Multimedia, pp. 88–95 (2009)
11. Chang, C.Y., Wu, T.H.: Using gait information for gender recognition. In: 10th
International Conference on Intelligent Systems Design and Applications, pp. 1388–1393
(2010)
12. Amayeh, G., Bebis, G., Nicolescu, M.: Gender classification from hand shape. In: Proc.
IEEE Conference on Computer Vision and Pattern Recognition Workshops
13. Jain, A.K., Ross, A., Prabhakar, S.: An Introduction to Biometric Recognition. IEEE
Transactions on Circuits and Systems 14(1), 4–20 (2004)
14. Sun, Z., Bebis, G., Yuan, X., Louis, S.J.: Genetic feature subset selection for gender
classification: A comparison study. In: Proceedings of IEEE Workshop on Applications of
Computer Vision, pp. 165–170 (2002)
15. Wu, T.-X., Lian, X.-C., Lu, B.-L.: Multi View gender classification using symmetry of
facial image. In: ICONIP (2010)
16. Huang, J., Shao, X., Wechsler, H.: Face pose discrimination using support vector machines
(svm). In: Proceedings of 14th International Conference on Pattern Recognition, ICPR
1998 (1998)
17. Mäkinen, E., Raisamo, R.: Evaluation of gender classification methods with automatically detected
and aligned faces. IEEE Trans. Pattern Analysis and Machine Intelligence 30(3), 541–547
(2008)
18. Somayeh, B., Mousavi, H.A.: Gender classification using neuro fuzzy system. Indian
Journal of Science and Technology 4(10) (2011)
19. Basha, A.F., Shaira, G., Jahangeer, B.: Face gender image classification using various
wavelet transform and support vector machine with various Kernels. IJCSI International
Journal of Computer Science Issues 9(6), 2 (2012)
20. Ullah, I., Hussain, M., Aboalsamh, H., Muhammad, G., Mirza, A.M., Bebis, G.: Gender
Recognition from Face Images with Dyadic Wavelet Transform and Local Binary Pattern.
In: Bebis, G., Boyle, R., Parvin, B., Koracin, D., Fowlkes, C., Wang, S., Choi, M.-H.,
Mantler, S., Schulze, J., Acevedo, D., Mueller, K., Papka, M. (eds.) ISVC 2012, Part II.
LNCS, vol. 7432, pp. 409–419. Springer, Heidelberg (2012)
21. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In:
CVPR (2001)
22. Caltech Image Database,
http://www.vision.caltech.edu/htmlfiles/archive.html
23. Leng, X.M., Wang, Y.: Improving Generalization For Gender Classification. In: ICIP
(2008)
24. Makinen, E., Raisamo, R.: Evaluation of Gender Classification Methods with
Automatically Detected and Aligned Faces. IEEE Transactions on PAMI 30(3) (2008)
25. Akbari, R., Mozaffari, S.: Performance Enhancement of PCA-based Face Recognition
System via Gender Classification Method. In: MVIP (2010)
26. Xu, Z., Lu, L., Shi, P.F.: A hybrid approach to gender classification from face images. In:
ICPR, pp. 1–4 (2008)
The Use of Artificial Intelligence Tools in the Detection of
Cancer Cervix
Lamia Guesmi1, Omelkir Boughzala1, Lotfi Nabli2, and Mohamed Hedi Bedoui1
1
Laboratoire Technology and Medical Imaging (TIM), Faculty of Medicine, Monastir,
Ecole National Engineering of Monastir Box 5000
Monastir, Tunisia
2
Laboratoire of Automation and Computer Engineering (Lille LAIL),
(CNRS UPRESA 8021),
Ecole National Engineering of Monastir Box 5000,
Monastir, Tunisia
[email protected], [email protected],
[email protected], [email protected]
1 Introduction
In Pathological Anatomy and Cytology, we distinguish two types of tests: histology
is the observation of tissue sections, and cytology is the examination of spread
cells. We are particularly interested in cytology. The samples are spread on a
slide and then fixed and stained to recognize the different cells present. The smears
are then examined under a microscope by a cyto-technician to identify cells of inter-
est. This slide-reading step consists of a visual assessment of these cells on a
cytological slide. The purpose of this step is the detection of abnormal or
suspicious cells as well as the quantification of cells. This is of vital interest to
the pathologist, who must establish a reliable and valid diagnosis, especially in the
case of classifying CSVs for the diagnosis of cervical cancer. That is why we introduce
artificial intelligence tools to facilitate this task with a very high success rate,
based on a supervision technique combining human and automatic analysis; the a priori
information used to recognize cells includes size, shape and texture, but also, and
mainly, color.
2 Segmentation of CSV
In order to clean CSVs of inflammatory cells that invaded the image to the Treaty
(Fig. 1), in which these cells recognize the same tent staining than nuclei of cells
tested is why we spent at gray image to properly locate and illustrate the inflammato-
ry cells using applications programming software MATLAB.
Fig. 1. Vaginal Smear: the background is inflammatory and contains blue parasites. The
squamous cells show clear perinuclear halo (Papanicolaou staining)
We switch to the gray level of the smear to eliminate certain inflammatory
cells that exhibit an RGB (red-green-blue) level that is not perfectly zero, as is the
case for the carcinogenic cell nucleus. We note that the cleaning of the smear is not
clear-cut, essentially because inflammatory cells occupy certain sensitive places, such
as the carcinogenic cell nucleus or the borders of the cytoplasm of the cells in ques-
tion; all of this may lead to false decisions. To avoid confusion, it is recommended
that the cyto-pathologist take as sharp a sample as possible, and that the cleaning be
done cell by cell.
separate the whole cell from other cells that may appear on the edges of the image,
that is to say, incomplete cells. The steps required to segment such an image are:
reading the image; detecting the internal and external contours of a cell; linearizing
the object outlines; dilating the object contours; filling the interior of objects;
removing objects that exceed the edge of the image; and smoothing the object and
segmenting it (outer contour only). We end up with the segmented CSV shown in Fig. 2
(a sketch of these steps in code is given after the figure caption):
Fig. 2. Segmented vaginal smear: we can determine the overall shape of the carcinogenic
cell and the size of the nucleus and the cytoplasm (X, Y, index: point coordinates)
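The sketch below follows the enumerated segmentation steps using scikit-image and SciPy. The library, the Sobel/Otsu contour detection and the structuring-element sizes are our assumptions; the paper only states that the processing was programmed in MATLAB.

```python
from scipy import ndimage as ndi
from skimage import color, filters, morphology, segmentation

def segment_cell(rgb_smear):
    """Rough sketch: grayscale conversion, contour detection, dilation, hole filling,
    removal of objects touching the image border, and smoothing of the object."""
    gray = color.rgb2gray(rgb_smear)                           # read / convert the image
    edges = filters.sobel(gray)                                # internal and external contours
    mask = edges > filters.threshold_otsu(edges)
    mask = morphology.dilation(mask, morphology.disk(2))       # dilate the contours
    mask = ndi.binary_fill_holes(mask)                         # fill the interior of objects
    mask = segmentation.clear_border(mask)                     # drop incomplete cells at the edges
    mask = morphology.opening(mask, morphology.disk(3))        # soften the object outline
    return mask
```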
3 Shape Parameters
U1 = SN1/SC1. (5)
U2 = SN2/SC2. (6)
SC1/SC2. (7)
SN1/SN2. (8)
With: DFMax: maximum Feret diameter; DFMin: minimum Feret diameter; SN1: size of the
smallest circle including the nucleus; SN2: size of the largest circle circumscribing
the nucleus; SC1: size of the smallest circle including the cytoplasm; and SC2: size of
the largest circle circumscribing the cytoplasm.
In our case, we are interested in a diagnosis based on models and on the database of a
hybrid system, because our corpus of 120 CSVs is distributed in equal shares among four
classes (Cancer (C), High Grade Dysplasia (HGD), Low Grade Dysplasia (LGD) and
Normal (N)). These smears are processed and exploited in order to extract
qualitative and quantitative information, which forms the database for the
development and testing of the diagnostic tool based on Artificial
Intelligence (AI) techniques.
Neural networks are of several types; the most used is the MP (multilayer perceptron),
which is characterized by its learning algorithm with back-propagation of errors [1].
The MP is a feed-forward artificial neural network organized in layers, where
information flows in one direction, from the input layer to the output layer. The input
layer is always a virtual layer containing the inputs of the system [2]. The following
layers are the hidden layers, of which there may be m, according to the needs of the
system; the network ends with the output layer, which represents the results of the
classification system [3]. In the suggested case, the MP consists of an input layer made
of the eight shape parameters specified above, a hidden layer of 26 neurons and an
output layer composed of four neurons representing the four main grades of cervical
cancer. This configuration was chosen after several tests in which the number of neurons
in the hidden layer, the learning period and the normalization vector were varied. We
used the "Log-Sigmoid" activation function with the normalization vector V = [0.01,
0.99], because the database is formed by values between 0 and 1.
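For illustration, the classifier described above could be set up as follows with scikit-learn; the toolbox, the solver defaults and the scaling step are assumptions, as the paper does not state which implementation was used.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import MinMaxScaler

# 8 shape parameters in, one hidden layer of 26 log-sigmoid neurons, 4 output classes
scaler = MinMaxScaler(feature_range=(0.01, 0.99))        # normalization vector V = [0.01, 0.99]
mlp = MLPClassifier(hidden_layer_sizes=(26,), activation='logistic',
                    max_iter=2000, random_state=0)

# X: (n_samples, 8) shape parameters; y: labels in {'C', 'HGD', 'LGD', 'N'}
# mlp.fit(scaler.fit_transform(X), y)
```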
The construction of a fuzzy model for diagnosis and classification proceeds mainly
through three stages: fuzzification, inference and defuzzification. To achieve these
three steps, the membership functions of the input variables and those of the output
must be defined.
Definition of the membership functions of the input and output variables: The most
commonly used membership function forms are the singleton, triangular, trapezoidal and
Gaussian shapes. We retained the trapezoidal form for its simplicity of coding and
manipulation, and because it covers the range of variation of the shape parameters. The
number of trapezoids used to represent a membership function is limited as far as
possible to four to describe the range of a variable; in this case they are associated
with the terms: low, medium, large and very important.
We associate with a variable (Z) the following values of Table 1: minimum (Zmin), very
optimistic (ZTopt), optimistic (Zopt), average optimistic (ZmoyO), average pessimistic
(ZmoyP), pessimistic (Zpes), very pessimistic (Ztpes) and maximum (Zmax).
Table 1. Membership function breakpoints of the input parameters

Variable (Z)   Zmin    ZTopt   Zopt    ZmoyO   ZmoyP   Zpes    Ztpes   Zmax
E1             0.000   0.134   0.265   0.407   0.517   0.668   0.881   1.000
E2             0.000   0.186   0.339   0.429   0.609   0.611   0.998   1.000
DF1            0.000   0.383   0.391   0.484   0.833   0.936   0.960   1.000
DF2            0.000   0.403   0.493   0.519   0.903   0.977   0.983   1.000
U1             0.000   0.019   0.081   0.174   0.176   0.622   0.963   1.000
Rules of Inference: This block is composed of all the fuzzy rules that link the input
variables to the output variables, both expressed linguistically. We have chosen the
"MAX-MIN" inference method. It is based on the use of two logical operations: the
"AND" logic associated with the minimum and the "OR" logic associated with the maximum.
These rules are of the form: IF (... AND ...) THEN (decision) OR ... [5-7]. We have 185
possible inference rules, which is the maximum number of rules.
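The two building blocks just described can be sketched as follows: a trapezoidal membership function and MAX-MIN aggregation, with the fuzzy AND realized as a minimum and the fuzzy OR as a maximum. The rules in the example are hypothetical placeholders, not the 185 rules of the system.

```python
import numpy as np

def trapmf(x, a, b, c, d):
    """Trapezoidal membership function defined by the breakpoints a <= b <= c <= d."""
    return float(np.clip(min((x - a) / max(b - a, 1e-9),
                             (d - x) / max(d - c, 1e-9)), 0.0, 1.0))

def max_min_inference(rules):
    """MAX-MIN inference: each rule fires with the MIN of its antecedent memberships (AND);
    rules sharing an output class are aggregated with MAX (OR)."""
    strengths = {}
    for antecedent_memberships, out_class in rules:
        fire = min(antecedent_memberships)
        strengths[out_class] = max(strengths.get(out_class, 0.0), fire)
    return strengths

# two hypothetical rules: IF E1 low AND U1 large THEN C ; IF E1 medium AND U1 medium THEN HGD
print(max_min_inference([([0.7, 0.4], 'C'), ([0.3, 0.8], 'HGD')]))   # {'C': 0.4, 'HGD': 0.3}
```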
5 Simulation Results
The results of the use of an intelligent screening for cancer cervix in the three
recurrent artificial intelligence techniques are recapitalized in Table 2 after which it
appears the success rate each technique for the same database (the 120 cases).
6 Discussion
Furthermore, some data parameters that are important for the evaluation of the general
behavior of each method are not clearly defined.
Table 3. Comparison of the proposed method and other methods appeared in the literature
The success rate obtained by the three artificial intelligence approaches is very
high; in all cases, the neuro-fuzzy result can be improved if we enlarge the database
used. The neuro-fuzzy method currently shows areas of confusion between two classes
that are close, from the formal point of view of the carcinogenic cell, for the pairs
(HGD, C) and (LGD, N). To remedy this, we can add another shape parameter that expresses
a specific quality indicator or, as previously mentioned, enlarge our database; this is
the goal of our next study, in which we will increase our database to at least 180 CSVs
and define a new quality parameter to better qualify our classification. Despite these
false positives, this method remains a reference for cervical cancer screening at the
moment.
Beyond the comparison of our method with pixel classification schemes, Table 3
shows a comparison of our method with other methods that have appeared in the
literature. In general, it is difficult to compare the methods directly since many of
them do not include quantitative results and the performance criteria vary extensively.
7 Conclusion
The supervision scheme based on the hybrid neuro-fuzzy approach achieved the best CSV
classification record, with a success rate of around 94%, which we can improve by
enriching our database with other cases of precancerous and cancerous lesions
encountered.
References
1. Buniet, L.: Traitement automatique de la parole en milieu bruité: étude de modèles con-
nexionnistes statiques et dynamiques. Thèse doctorat. Université Henri Poincaré - Nancy
1. Spécialité informatique, 40–39 (1997)
2. El Zoghbi, M.: Analyse électromagnétique et outils de modélisation couplés. Application à
la conception hybride de composants et modules hyperfréquences. Thèse doctorat. Univer-
sité de Limoges. Discipline: Electronique des Hautes fréquences et Optoélectronique.
Spécialité: Communications Optiques et Microondes, 37 (2008)
3. Lamine, H.M.: Conception d’un capteur de pression intelligent. Mastère en micro électri-
que. Option IC Design. Université de Batna: faculté des sciences de l’ingénieur, 18 (2005)
4. Millot, P.: Systèmes Homme – Machine et Automatique. Université de Valenciennes et du
Hainaut – Cambrésis. Laboratoire : LAMIH CNRS. In: Journées Doctorales
d’Automatique JDA 1999, Conférence Plénière, Nancy, pp. 23–21 (1999)
5. Nabli, L.: Surveillance Préventive Conditionnelle Prévisionnelle Indirecte d’une Unité de
Filature Textile : Approche par la Qualité. Thèse Doctorat. Discipline: Productique, Auto-
matique et Informatique Industrielle Université des Sciences et Technologies de Lille, 34 –
20 (2000)
6. Cimuca, G.-O.: Système inertiel de stockage d’énergie associe à des générateurs éoliens.
Thèse doctorat. Spécialité: Electrique. Ecole nationale supérieure d’arts et métiers centre
de Lille, 162– 163 (2005)
7. Baghli, L.: Contribution à la commande de la machine asynchrone, utilisation de la logique
floue, des réseaux de neurones et des algorithmes génétiques. Thèse doctorat. UFR
Sciences et Techniques: STMIA. Université Henri Poincaré, Nancy-I, 16 (1999)
8. Främling, K.: Les réseaux de neurones comme outils d’aide à la décision floue. Rapport de
D.E.A. Spécialité : informatique. Equipe Ingénierie de l’Environnement.Ecole Nationale
Supérieure des Mines de Saint-Etienne. Juillet, 12-11 (1992)
9. Chankonga, T., Heera-Umpon, N., Auephanwiriyakul, S.: Automatic cervical cell segmen-
tation and classification in Pap smears (2013)
10. IssacNiwas, S., Palanisamy, P., Sujathan, K., Bengtsson, E.: Analysis of nuclei textures of
fine needle aspirated cytology images for breast cancer diagnosis using Complex Daube-
chies wavelets (2012)
11. Plissiti, M.E., Nikou, C., Charchanti, A.: Combining shape, texture and intensity features
for cell nuclei extraction in Pap smear images
12. Lezoray, O., Cardot, H.: Cooperation of Color Pixel Classification Schemes and Color Wa-
tershed: A Study for Microscopic Images
A Scalable Feature Selection Algorithm for Large
Datasets – Quick Branch & Bound Iterative (QBB-I)
1 Introduction
In today’s information age, it is easy to accumulate data and inexpensive to store it.
The ability to understand and analyze large data requires us to more efficiently find
the optimal subset of features. Feature selection is the task of searching for an optimal
subset of features from all available features [15]. Feature selection aims to find the
optimal subset of features by removing redundant and irrelevant features that
contribute noise and to reduce the computational complexity.
Existing feature selection algorithms (FSA) for machine learning typically fall into
two broad categories: wrappers and filters [13-15]. Filter methods choose the best
The two important aspects of a FSA are the search strategies to find the subset of
features and the evaluation measure used to find the goodness of feature subsets. The
search strategies employed by FSA include the exhaustive, complete, heuristic and
random search. The search process may start with the empty set and increase by one
feature at a time (forward search) or start with the full set and drop one feature at a
time (backward search). The search may also start with a randomly generated feature
set and change either probabilistically or deterministically. The evaluation measures
include distance measures, information gain, correlation measures, consistency, and
classifier error rate.
The algorithms used in this paper use the consistency measure. The inconsistency
rate of a dataset is defined as follows [6]: (a) if two patterns match on all features
but their class labels, they are considered inconsistent. (b) The inconsistency count
is the total number of matching patterns minus the largest number of patterns with the
same class label. For example, if there are n matching patterns, among which c1
patterns belong to label1, c2 to label2 and c3 to label3, where n = c1 + c2 + c3, and
if c3 is the largest among the three, then the inconsistency count = n − c3. (c) The
inconsistency rate is equal to the sum of all inconsistency counts divided by the
total number of patterns.
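This definition translates directly into code. The sketch below groups matching patterns and accumulates the inconsistency counts; the tiny dataset in the example is hypothetical.

```python
from collections import Counter, defaultdict

def inconsistency_rate(patterns, labels):
    """Sum over matching-pattern groups of (group size - largest class count), divided by
    the total number of patterns."""
    groups = defaultdict(Counter)
    for row, y in zip(patterns, labels):
        groups[tuple(row)][y] += 1
    count = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return count / len(patterns)

# three matching patterns with labels A, A, B give an inconsistency count of 1
X = [(0, 1), (0, 1), (0, 1), (1, 0)]
y = ['A', 'A', 'B', 'A']
print(inconsistency_rate(X, y))   # 0.25
```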
The QBB algorithm [16] is a hybrid of two popular algorithms, the LVF [6] and
the ABB [5] algorithms, and exploits the advantages and disadvantages of both.
LVF is a probabilistic search algorithm which, in the initial phase, quickly
reduces the number of features to a subset, as initially many subsets satisfy the
consistency measure. However, after a certain point, fewer subsets can satisfy this,
which increases the computational complexity, as LVF spends resources on randomly
generating subsets that are obviously not good.
ABB starts with the full feature set and uses the inconsistency measure as the
bound. It reduces the feature set by one feature at a time to generate subsets and is a
complete search that guarantees an optimal subset. However, its performance
deteriorates when the difference between the total number of features and the optimal
subset of features is large. QBB was proposed to exploit the advantages and overcome
the disadvantages of both LVF and ABB. LVF quickly reduces the size of the initial
feature set to a consistent subset and then uses this smaller feature set as the input to
ABB to find the optimal feature subset.
In a survey and experimental evaluation of feature selection algorithms [19],
Molina et al. review different fundamental feature selection algorithms and assess
their performances. They evaluate and compare FSA such as LVF [6], ABB [5], QBB
[16], FOCUS [1], RELIEF [11], and LVI [17] in order to understand the general
behavior of FSAs on the particularities of relevance, irrelevance, redundancy and
sample size of synthetic data sets and show that overall hybrid algorithms perform
better than individual algorithms.
In [23], the authors present an activity recommendation scheme and compare QBB with
different algorithms, showing that QBB has the best results. QBB is reasonably fast
and robust and handles features that are inter-dependent, but it does not work well
with large datasets [17].
We propose the Quick Branch and Bound Iterative (QBB-I) algorithm so as to retain
the advantages of QBB while reducing the time complexity. QBB-I divides the data into
two parts, D0 (p%) and D1 ((100 − p)%), and finds a subset of features for D0 using the
consistency measure. It is designed such that the features selected from the reduced
data will not generate more inconsistencies with the whole data.
QBB-I finds a subset of features for D0 and checks it against D1 using the
consistency measure. If the inconsistency rate exceeds the threshold, it appends the
patterns from D1 which cause the inconsistency to D0 and deletes them from D1. This
selection process repeats until a solution is found. If no subset is found, the
complete set of attributes is returned as the solution (a sketch of this loop is given
below).
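The sketch below captures the QBB-I loop just described. For brevity, the inner search on D0 is a simple backward elimination driven by the consistency measure, standing in for the LVF + ABB (QBB) search used in the paper, and the inconsistency threshold is fixed at zero as in the experiments.

```python
import random
from collections import defaultdict

def is_consistent(rows, labels, subset):
    """True if no two patterns match on `subset` but carry different class labels."""
    seen = {}
    for r, y in zip(rows, labels):
        key = tuple(r[i] for i in subset)
        if seen.setdefault(key, y) != y:
            return False
    return True

def qbb_i(data, labels, p=0.27, seed=0):
    """Schematic QBB-I: search on the small partition D0, validate on D1, move the
    patterns involved in inconsistencies into D0, and repeat until a subset holds."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    cut = max(1, int(p * len(data)))
    d0, d1 = idx[:cut], idx[cut:]
    n_feat = len(data[0])
    while True:
        rows0, y0 = [data[i] for i in d0], [labels[i] for i in d0]
        subset = list(range(n_feat))
        for f in range(n_feat):                      # stand-in subset search on D0
            trial = [g for g in subset if g != f]
            if trial and is_consistent(rows0, y0, trial):
                subset = trial
        key_labels = defaultdict(set)                # validate the subset against D1
        for i in d0 + d1:
            key_labels[tuple(data[i][j] for j in subset)].add(labels[i])
        bad = [i for i in d1 if len(key_labels[tuple(data[i][j] for j in subset)]) > 1]
        if not bad:                                  # consistent on the whole data: done
            return sorted(subset)
        d0 += bad                                    # append the offending patterns to D0
        d1 = [i for i in d1 if i not in set(bad)]
```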
Table 1 describes the type of Search Organization, Generation of Successors and
Evaluation Measure used by the four algorithms. A search algorithm takes
responsibility for driving the feature selection process.
We want to verify that a) QBB-I runs faster than QBB on large datasets, b) the feature
set returned by QBB-I is identical to that of QBB, c) QBB-I is faster than LVF, ABB
and QBB in general, and d) QBB-I is particularly suitable for huge datasets.
Our experiments compare QBB-I with existing baseline algorithms such as
LVF, LVI and QBB in terms of execution time and selected features. All algorithms
use the consistency measure with the inconsistency rate set to 0, as there is no prior
knowledge about the data. Using eight different datasets, we find that QBB-I takes
less execution time than the other algorithms. Our tests with the ASSISTments
intelligent tutoring dataset using 150,000 log records and other standard datasets show
that QBB-I is significantly more efficient than QBB while selecting the same set of
optimal features.
For testing the proposed approach, we use the Cognitive Tutor dataset from the
Knowledge Discovery and Data Mining Challenge (KDD Cup 2010) [12], drawn from
two different tutoring systems from multiple schools over multiple school years. The
dataset contains 19 attributes. We used the Challenge dataset for this work. These
datasets are quite large, containing 3,310 students with 9,426,966 steps [12].
There are many technical challenges: the data matrix is sparse, there is a strong
temporal dimension to the data, and the problem a given student sees is determined in
part by student choices or past success history.
A total of seven datasets, both artificial and real, are chosen for the experiments
from the UC Irvine data repository [4]. The Lymphography dataset [3] contains 148
instances, and the test set is selected randomly with 50 instances; it contains 19
attributes. The lung-cancer dataset [8] describes two types of pathological lung
cancers and contains 32 instances and 57 attributes. The Mushroom dataset [22] contains
8124 instances and has 22 discrete attributes. In the Parity5+5 dataset [10], the
concept is the parity of five bits; the dataset contains 10 features, of which five are
uniformly random.
Splice junctions [20] are points on a DNA sequence at which superfluous DNA is
removed during the process of protein creation in higher organisms. Led24 dataset [2]
has a total of 3200 instances, of which 3000 are selected for testing. It contains 24
attributes. All attribute values are either 0 or 1, according to whether the
corresponding light is on or not for the decimal digit.
QBB and QBB-I were implemented using the C language. In all algorithms, the
inconsistency rate is taken as zero. The partition percentage used in QBB-I is
27% and that of LVI is 24%. The experiment is done with over 150,000 log records from
the ASSISTments dataset.
3.3 Results
A total of seven datasets, both artificial and real, are chosen for the experiments from
the UC Irvine data repository to check the effectiveness of QBB-I. These datasets are
either commonly used for testing feature selection algorithms or artificially designed
so that the relevant attributes are known. Fig. 2 shows the graphical comparison of
LVF, QBB and QBB-I for the different datasets. For the first three datasets, there is
not much reduction in the execution time, due to the small amount of data, but for the
parity and KDD datasets the reduction is much clearer. From this analysis, we can
conclude that the QBB-I algorithm is faster than the other algorithms for larger
datasets and less beneficial for smaller datasets.
We then compare the features selected by LVF, QBB and QBB-I. This experiment
is mainly focused on checking the accuracy of our enhanced algorithm. Table 2
shows the features selected by the three algorithms: the features selected by QBB-I
are identical to those of QBB, while the features selected by LVI are identical to
those of LVF.
We also compare the execution time of the four algorithms (Fig. 3). The result
shows that QBB-I takes less execution time than all the other algorithms.
Among the three existing algorithms, LVI was the fastest while QBB gave the
better feature set. Our QBB-I algorithm, which uses the scalability concepts from
LVI to scale the QBB algorithm, returns the same feature set as QBB and is faster
than LVI.
For the ASSISTments dataset, Yu et al. [9] manually identify useful combinations
of features and then experimentally show that some of the feature combinations
effectively improve the Root Mean Squared Error (RMSE). While the previous
experiments were with individual features, we also compare the features selected by
QBB and QBB-I with the following combinations of features suggested by this paper:
(Student Name, Unit Name), (Unit Name, Section Name), (Section Name, Problem
Name), (Problem Name, Step Name), (Student Name, Unit Name, Section Name),
(Section Name, Problem Name, Step Name), (Student Name, Unit Name, Section
Name, Problem Name) and (Unit Name, Section Name, Problem Name, Step Name).
Both QBB and QBB-I selected the last four combinations with a subset of the
ASSISTments dataset consisting of 150,000 log records, which have previously been
shown to be the best combinations [9]. Fig. 4 shows the comparison of the time taken
by QBB and QBB-I to select the combinations for 80,000, 100,000 and 150,000
log records. The result shows that QBB-I takes less execution time than QBB.
Finally, we vary the partition size (refer to Fig. 5), as we expect that as the
partition size increases, the execution time reduces and then becomes constant. We find
that for 80,000 log records of the ASSISTments dataset this point is reached with a
partition size of around 25%, while with 100,000 and 150,000 log records of the same
dataset it is reached with a partition size of 27%.
4 Conclusion
Acknowledgments. This work derives direction and inspiration from the Chancellor
of Amrita University, Sri Mata Amritanandamayi Devi. We thank Dr. M
Ramachandra Kaimal, head of Computer Science Department, Amrita University for
his valuable feedback.
References
1. Almuallim, H., Dietterich, T.G.: Learning with many irrelevant features. In: Proceedings
of the 9th National Conference on Artificial Intelligence (1991)
2. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression
Trees. Wadsworth International Group, Belmont (1984)
3. Cestnik, G., Konenenko, I., Bratko, I.: Assistant-86: A Knowledge- Elicitation Tool for
Sophisticated Users. In: Progress in Machine Learning, pp. 31–45. Sigma Press (1987)
4. Merz, C.J., Murphy, P.M.: UCI Repository of Machine Learning Databases. University of
California, Department of Information and Computer Science, Irvine (1996),
http://www.ics.uci.edu/mlearn/MLRepository.html
5. Liu, H., Motoda, H., Dash, M.: A monotonic measure for Optimal Feature Selection. In:
Proceedings of the European Conference on Machine Learning, pp. 101–106 (1998)
6. Liu, H., Setiono, R.: A probabilistic approach to feature selection: a Filter solution. In:
Proceedings of the 13th International Conference on Machine Learning, pp. 319–327
(1996)
7. Liu, H., Setiono, R.: Scalable feature selection for large sized databases. In: Proceedings of
the 4th World Congress on Expert System, p. 6875 (1998)
8. Hong, Z.Q., Yang, J.Y.: Optimal Discriminant Plane for a Small Number of Samples and
Design Method of Classifier on the Plane. Pattern Recognition 24, 317–324 (1991)
9. Yu, H.-F., Lo, H.-Y., Hsieh, H.-P.: Feature Engineering and Classifier Ensemble for KDD
Cup 2010. In: JMLR: Workshop and Conference Proceedings, vol. 1, pp. 1–16 (2010)
10. John, G.H., Kohavi, R., Pfleger, K.: Irrelevant features and the subset selection problem.
In: Proceedings of the Eleventh International Conference in Machine Learning (1994)
11. Kira, K., Rendell, L.: A practical approach to feature selection. In: Proceedings of the 9th
International Conference on Machine Learning, pp. 249–256 (1992)
12. Koedinger, K., Baker, R., Cunningham, K., Skogsholm, A., Leber, B., Stamper, J.: A data
repository for the EDM community: the pslc datashop (2010)
13. Kohavi, R.: Wrappers for performance enhancement and oblivious decision graphs. PhD
thesis, Stanford University (1995)
14. Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artificial
Intelligence 97(1-2), 273–324 (1997)
15. Dash, M., Liu, H.: Feature selection for classification. Intelligent Data Analysis 1(1-4),
131–156 (1997)
16. Dash, M., Liu, H.: Hybrid search of feature subsets. In: Lee, H.-Y., Motoda, H. (eds.)
PRICAI 1998. LNCS, vol. 1531, pp. 238–249. Springer, Heidelberg (1998)
17. Dash, M., Liu, H., Motoda, H.: Feature Selection Using Consistency Measure. In:
Arikawa, S., Nakata, I. (eds.) DS 1999. LNCS (LNAI), vol. 1721, pp. 319–320. Springer,
Heidelberg (1999)
18. Kudo, M., Sklansky, J.: A Comparative Evaluation of medium and large-scale Feature
Selectors for Pattern Classifiers. In: Proceedings of the 1st International Workshop on
Statistical Techniques in Pattern Recognition, pp. 91–96 (1997)
19. Molina, L.P., Belanche, L., Nebot, A.: Feature selection algorithms: a survey and
experimental evaluation. Universitat Politècnica de Catalunya, Departament de Llenguatges i
Sistemes Informàtics (2002)
20. Noordewier, M.O., Towell, G.G., Shavlik, J.W.: Training Knowledge-Based Neural
Networks to Recognize Genes in DNA Sequences. In: Advances in Neural Information
Processing Systems, vol. 3 (1991)
21. Saeys, Y., Inza, I., Larranaga, P.: A review of feature selection techniques in
bioinformatics. Bioinformatics 23(19), 2507–2517 (2007)
22. Schlimmer, J.S.: Concept Acquisition Through Representational Adjustment (Technical
Report 87-19). Doctoral disseration, Department of Information and Computer Science,
University of California, Irvine (1987)
23. Nguyen, T.: A Group Activity Recommendation Scheme Based on Situation Distance and
User Similarity Clustering. M. Thesis, Department of Computer Science KAIST (2012)
Towards a Scalable Approach for Mining Frequent
Patterns from the Linked Open Data Cloud
Abstract. In recent years, the linked data principles have become one of the prominent ways to interlink and publish datasets on the web, turning the web space into a big data store. Data published in RDF form and available as open data on the web opens up a new dimension for discovering knowledge from heterogeneous sources. The major problems with linked open data are its heterogeneity and massive volume, along with the preprocessing required for its consumption. The massive volume also strains the memory-hungry data structures required by mining methods, in addition to the overheads of the mining process itself. This paper proposes to extract and store the RDF dumps available for the source data from the linked open data cloud, which can be further retrieved and put into a format suitable for mining, and then suggests an efficient method to generate frequent patterns from these huge volumes of data without being constrained by memory requirements.
Keywords: Linked Data Mining, Data Mining, Semantic Web Data Mining, RDF Data Mining.
1 Introduction
In recent years, the linked data principles [10] have become one of the prominent ways to interlink and publish datasets on the web, turning the web space into a big data store. Data published in RDF form and available as open data on the web opens up a new dimension for producing knowledge from these data, called semantic web data. The Linked Open Data cloud [10] is a heterogeneous source of semantic web data; it contains information from diverse fields like music, medicine and drugs, publications, people, geography, etc., and most often large government data, represented in the RDF triple structure consisting of Subject, Predicate and Object (SPO). Each statement represents a fact and expresses a relation (represented by the predicate resource) between the subject and the object. Formally, the Subject and the Predicate are represented by URIs and the Object by a URI or a literal such as a number or string, with Subjects and Objects interconnected by predicates. The interconnections between statements harbor many hidden relationships and can lead to
insights about the data and the domain. Resources in semantic web datasets are connected by multiple predicates, co-occurring in multiple implicit relations. The co-occurrence and the frequencies of triples in the datasets are a matter of investigation and motivate mining such SPO data with techniques such as association rule mining [1]. Association rule mining has been widely studied in the context of basket analysis and sale recommendations [2]. In fact, association rules can be discovered in any domain with many items or events among which interesting relationships can be discovered from the co-occurrence of those items or events in the existing subsets (transactions). The discovered association rules may have many applications, such as the enhancement of information source recommendation in the semantic web, association rule based classification, and clustering of semantic web resources.
In this paper, an approach to mining association rules based on the SPO view of RDF data from the linked open data cloud is proposed, using the Co-Occurrence Frequent Item Tree (COFI-tree) algorithm [3]. The advantage of using the COFI-tree for mining semantic web data over other methods (including the FP-tree) is that "the algorithm builds a small tree called COFI-tree for each frequent 1-itemset and mines the trees with simple non-recursive traversals; only one COFI-tree resides in memory at a time and is discarded as soon as it is mined to make room for the next COFI-tree, which dramatically reduces the memory requirement so as to handle millions of transactions with hundreds of thousands of dimensions" [3].
To illustrate the process of mining association rules from semantic web data using the COFI-tree mining algorithm, we integrate publication data collected from different sources such as DBLP [17], the ACM Digital Library [16] and CiteSeer [18], available in the publication domain of the linked open data cloud as downloadable RDF dumps. We confine our study to identifying associations between authors in the academic network, i.e. frequently collaborating co-authors in a particular domain, so as to get an idea of the collaborative efforts made by the authors, since the present domain is confined only to the publication counts of individual members.
The rest of the paper is structured as follows. Section 2 discusses related work and
briefly describes the basics of association rule generation. Section 3 details the min-
ing process using the COFI-tree algorithm with an example for illustration. Section 4 presents the experimental work and evaluation, and finally Section 5 concludes the paper with a look at future work.
2 Related Work
In the past couple of years, data mining has been an area of active research on both traditional datasets and semantic datasets. For semantic web datasets, there are approaches for mining association rules and logical rules, and for generating schemas or taxonomies for the knowledge base, among other applications. However, the majority of work in the semantic web arena focuses on clustering and classification [4-5]; there is also some work on Inductive Logic Programming (ILP) based on logics from ontologies [6].
Association Rule Mining (ARM) [8] mines frequent patterns that appear together in a list of transactions, where a transaction is a set of items. For example, in the context of market basket analysis or sales analysis, "a transaction is the set of items purchased together by a customer in a specific event". The mined rules are represented in the form {bread, butter} → milk, meaning that people who bought bread and butter also bought milk. Many ARM algorithms have been proposed so far which work well on traditional datasets; they can be classified into two types: Apriori based and FP-tree based.
The Apriori algorithm, which serves as the base for most of these algorithms, uses an anti-monotone property stating that "for a k-itemset to be frequent, all its (k-1)-itemsets also have to be frequent", thereby reducing the computational cost of candidate frequent itemset generation. But in the case of very large datasets with a big frequent 1-itemset, this algorithm suffers from two main bottlenecks, viz. "repeated I/O scanning and high computational cost". Another problem with this algorithm is the huge size of the candidate frequent 2-itemsets and 3-itemsets observed for most real datasets.
Another approach for discovering frequent patterns in transaction datasets is FP-Growth [9]. The algorithm solves the multi-scan problem by creating a compact tree structure, the FP-Tree, representing frequent patterns, thereby improving candidate itemset generation. This algorithm requires two full I/O scans of the dataset and builds a prefix tree in main memory, which is then mined, and it performs faster than the Apriori algorithm. The mining of the generated FP-Tree is performed recursively by creating conditional trees whose number is of the same order of magnitude as the number of frequent patterns. Due to the huge mass of created conditional trees, the algorithm is not scalable and hence not suitable for mining large datasets.
An improvement on the above is the Co-Occurrence Frequent Item tree (COFI-tree) mining algorithm [3], in which the authors divide the algorithm into two phases. In Phase I, the FP-Tree structure is built with two full I/O scans of the transactional database. In Phase II, a small Co-Occurrence Frequent tree is built for each frequent item; these trees are first pruned, eliminating the items that are non-frequent with respect to the COFI-tree's base frequent item, and then mined. The most important advantage here is the pruning technique, which cuts down the large memory space requirement of the COFI-trees.
The above association rule mining algorithms are designed and well suited for mining traditional datasets, i.e. datasets based on transactions, and cannot be applied directly to semantic web data. Since semantic web datasets are in SPO form, they need to be shaped into a form suitable for mining: transactions have to be generated from the semantic web data, and the mining can then be performed on those transactions.
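For instance, the sketch below (not the authors' implementation) illustrates this shaping step: publication/author triples are regrouped so that each publication becomes one transaction containing its set of co-authors. The predicate URI and the tiny triple list are assumptions made purely for illustration.

```python
# Sketch: regrouping RDF (S, P, O) triples into author "transactions", one per publication.
# The predicate URI used to link a publication to its authors (dc:creator here) and the
# in-memory triple list are illustrative assumptions only.
from collections import defaultdict

CREATOR_PREDICATE = "http://purl.org/dc/elements/1.1/creator"  # hypothetical choice

def triples_to_transactions(triples):
    """Group author objects by publication subject: one transaction per publication."""
    transactions = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == CREATOR_PREDICATE:
            transactions[subject].add(obj)
    # Each transaction is the set of co-authors of one publication.
    return list(transactions.values())

if __name__ == "__main__":
    triples = [
        ("pub:P1", CREATOR_PREDICATE, "author:A1"),
        ("pub:P1", CREATOR_PREDICATE, "author:A2"),
        ("pub:P2", CREATOR_PREDICATE, "author:A2"),
        ("pub:P2", CREATOR_PREDICATE, "author:A3"),
    ]
    print(triples_to_transactions(triples))
```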
Also, there are approaches for generating association rules from the semantic web
datasets. In [7], an algorithm is proposed that mines association rules from the seman-
tic web dataset by mining patterns provided by the end users. The mining patterns are based on the results of queries in SPARQL, the query language for semantic web data. The algorithm mines the association rules in a semi-supervised
manner as per the user provided input patterns.
One of the recent algorithms [11], proposed by R. Ramezani et al., mines association rules from a single, centralized semantic web dataset in an unsupervised setting, with an approach similar to the Apriori method proposed for traditional datasets.
There are also similar works that perform data mining on the linked data available in the LOD cloud. In one approach, LiDDM [12], the authors applied data mining techniques (clustering, classification and association rules) to linked data [10] by first acquiring the data from the LOD datasets using user-defined queries, converting the result into traditional data, applying preprocessing to format the data for mining and then performing the data mining process. Similar to LiDDM is the RapidMiner semantic web plug-in [14], in which the end user has to provide a suitable SPARQL query for retrieving the desired data from the linked open data cloud. The software then converts the data into a tabular feature format and the data mining process is carried out.
The approach to constructing the COFI-tree involves the following steps: "the construction of the Frequent Pattern Tree, the building of small Co-Occurrence Frequent trees for each frequent item, which are then pruned to eliminate any items that are non-frequent with respect to the COFI-tree's base frequent item, and finally, the mining work is carried out".
The work starts with the collection of publication data from three sources, DBLP [17], the ACM Digital Library [16] and CiteSeer [18], available as downloadable RDF dumps at datahub.org [15]. The author names and the paper title are tokenized and retrieved for each publication in the dump after parsing the RDF files individually. The parsing is done using Jena [13], which is available as a Java API. Each publication title and each individual author retrieved from the RDF/XML files is assigned a unique ID and stored in an output text file. A horizontal data format, as a vector representation with the publication-id (Pi) and author-ids (Ai) for some publications, is formed and is shown in Table 1 for illustration. The output text file (enumerated file), which is in table format, is then used for the construction of the frequent pattern tree. The process comprises two modules, each requiring a complete I/O scan of the table. In module 1, the support for all items in the transaction table is accumulated (refer to Fig. 1: Step 1), followed by the removal of the infrequent authors, i.e. authors with a support less than the support threshold (say 4 in this example) (refer to Fig. 1: Step 2), followed by sorting of the remaining frequent authors according to their frequency to get the frequent 1-itemset (refer to Fig. 1: Step 3). The resulting list is arranged in a header table containing the authors, their respective supports and a pointer to each author's first occurrence in the frequent pattern tree. The second module constructs the frequent pattern tree.
In module 2, the first transaction (A1, A7, A4, A3, A2) is read and scanned for the frequent items present in the header table (i.e. A1, A4, A3 and A2), which are then sorted by the authors' support (A1, A2, A3 and A4), and the first path of the FP-Tree is
generated from this ordered transaction with an initial item-node support value of 1. Links are established between the item-nodes in the tree and the corresponding entries are made in the header table. A similar process is performed for the next transaction (A2, A3, A8, A5 and A4), yielding the sorted frequent item list (A2, A3, A4, A5), which builds the second path of the FP-Tree. The next transaction, Transaction 3 (A2, A4, A5, A1 and A13), yields the sorted frequent item list (A1, A2, A4, A5), which shares the prefix (A1, A2) with an existing path in the tree. The support of the item-nodes A1 and A2 is incremented to 2, and a new sub-path is created from the remaining items on the list (A4, A5) with a support value of 1. The process is repeated for the transactions in the sample table of publications and authors shown in Table 1. The final FP-Tree constructed after processing all the transactions of the table is shown in Fig. 2.
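A minimal sketch of the two construction passes just described is given below: support counting and header-table pruning, followed by insertion of support-ordered transactions into a prefix tree. The class layout, variable names and the minimum support value are illustrative assumptions, not the authors' code.

```python
# Sketch of the two FP-tree construction passes: (1) accumulate supports and keep only
# frequent items; (2) insert each transaction, sorted by descending support, into a prefix
# tree whose node counts are incremented along shared paths.
from collections import Counter

class FPNode:
    def __init__(self, item, parent=None):
        self.item = item          # author id (None for the root)
        self.count = 0            # support accumulated along this path
        self.parent = parent
        self.children = {}        # item -> FPNode

def build_fp_tree(transactions, min_support):
    # Pass 1: support counting and pruning of infrequent items (header table content).
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: s for i, s in support.items() if s >= min_support}
    order = sorted(frequent, key=frequent.get, reverse=True)
    rank = {item: r for r, item in enumerate(order)}      # most frequent first

    # Pass 2: insert each transaction into the prefix tree in descending-support order.
    root = FPNode(None)
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent), key=rank.get)
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent

if __name__ == "__main__":
    sample = [["A1", "A7", "A4", "A3", "A2"],
              ["A2", "A3", "A8", "A5", "A4"],
              ["A2", "A4", "A5", "A1", "A13"]]
    tree, header = build_fp_tree(sample, min_support=2)   # threshold chosen for this toy data
    print(header)
```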
The FP-Tree (Fig. 2) for the example table illustrated above is now used for the
construction of independent small trees called COFI-trees for every frequent header
table item of the FP-Tree. These trees are built independently for each individual item
in the header table in the least frequent order of their occurrence i.e. COFI-tree for
author A6 is created first, followed by the COFI-trees for A5, A4, A3, A2 and A1. Each COFI-tree is mined and discarded as soon as it has been built and mined, releasing the memory for the construction of the next COFI-tree. The construction starts with author A6, the least frequent author in the header table. In the tree for A6, all the authors in the header table that are more frequent than A6 and share transactions with A6 take part in building the tree. The construction starts with a root node containing the author for which the tree is created, i.e. A6, and for each path in the FP-tree containing author A6 together with more frequent items in the header list that are parent nodes of A6, an edge is formed originating at the root, i.e. A6.
Fig. 3. COFI-trees
The support-count of the edge equals the support-count of the node for author A6 on the corresponding path in the FP-Tree. Multiple frequent authors that share the same node-item are put into the same branch, and a counter for each node-item of the COFI-tree is incremented. Fig. 3 depicts the COFI-trees for all the frequent items of the FP-tree in Fig. 2.
In Fig. 3, rectangular nodes represent the nodes of the tree, each with an author label and two numeric values: the support-count and the participation-count. The first counter belongs to the particular node, while the second, initialized to zero, is used during the mining process. The figure also shows horizontal and vertical
bidirectional links, used for pointing to the next node of the same author and for establishing parent-child node links, respectively. The squares in the figure represent the cells of the header table, a sorted listing of the frequent items used to build the tree; each cell holds an author-id, its frequency and a link pointing to the first node of the tree with the same author-id.
Each COFI-tree is mined independently as soon as it is constructed and is discarded before the construction of the next COFI-tree, with the target of generating all frequent k-itemsets for the root author of the tree. During mining, using the support-count and the participation-count, candidate frequent patterns are extracted and put in a list for each branch of the tree, followed by the removal of all non-frequent patterns from each branch. In the example above, using the COFI-tree algorithm [3], frequent pattern generation is performed first on the A5-COFI-tree, yielding the frequent patterns A5A4:5, A5A2:6 and A5A4A2:5. The A5-COFI-tree is then removed from memory and the other COFI-trees are generated and mined to extract the frequent patterns for their root node-items. After mining the A5-COFI-tree, the A4-COFI-tree is built and mined, followed by the creation and mining of the A3-COFI-tree and A2-COFI-tree, generating the patterns A5A4:5, A5A2:6, A5A4A2:5, A4A2:8, A4A1:5, A4A2A1:5 and A2A1:6.
advantages of the method used and its performance, it is concluded that the method will prove efficient for scalable amounts of data, which we propose to experiment with in the near future. The domain selected for our experiment is just a single dimension of the mining activities that can be performed with the data available from the linked open data cloud. With its heterogeneity and the right selection of interrelated domains, varied dimensions of knowledge can be discovered with this type of approach.
References
1. Abedjan, Z., Naumann, F.: Context and Target Configurations for Mining RDF data. In:
International Workshop on Search and Mining Entity-Relationship Data (2011)
2. Agrawal, R., Srikant, R.: Fast Algorithms for mining association rules in large databases.
In: International Conference on Very Large Databases (1994)
3. El-Hajj, M., Zaiane, O.R.: COFI-tree Mining: A New Approach to Pattern Growth with Re-
duced Candidacy Generation. In: Workshop on Frequent Itemset Mining Implementations
(FIMI 2003) in conjunction with IEEE-International Conference on Data Mining (2003)
4. Bloehdorn, S., Sure, Y.: Kernel methods for mining instance data in ontologies. In: Aberer,
K., Choi, K.-S., Noy, N., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P.,
Maynard, D., Mizoguchi, R., Schreiber, G., Cudré-Mauroux, P. (eds.) ASWC 2007 and
ISWC 2007. LNCS, vol. 4825, pp. 58–71. Springer, Heidelberg (2007)
5. Fanizzi, N., Amato, C., Esposito, F.: Metric-based stochastic conceptual clustering for on-
tologies. Information System 34(8), 792–806 (2009)
6. Amato, C., Bryl, V., Serafini, L.: Data-Driven logical reasoning. In: 8th International
Workshop on Uncertainty Reasoning for the Semantic Web (2012)
7. Nebot, V., Berlanga, R.: Finding association rules in semantic web data. Knowledge-Based
Systems 25(1), 51–62 (2012)
8. Agrawal, R., Swami, A.N.: Mining association rules between sets of items in large data-
bases. In: ACM SIGMOD International Conference on Management of Data (1993)
9. Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: ACM
SIGMOD International Conference on Management of Data (2000)
10. Bizer, C., Heath, T., Berners-Lee, T.: Linked Data - The Story so Far. International Journal on
Semantic Web and Information Systems (2009)
11. Ramezani, R., Saraee, M., Nematbakhsh, M.A.: Finding Association Rules in Linked Data
a centralized approach. In: 21st Iranian Conference on Electrical Engineering (ICEE)
(2013)
12. Narasimha, R.V., Vyas, O.P.: LiDDM: A Data Mining System for Linked Data. In: Work-
shop on Linked Data on the Web. CEUR Workshop Proceedings, vol. 813. Sun SITE Cen-
tral Europe (2011)
13. The Jena API, http://jena.apache.org/index
14. Potoniec, J., Ławrynowicz, A.: RMonto: Ontological extension to RapidMiner. In: Poster
and Demo Session of the ISWC 2011 - 10th International Semantic Web Conference,
Bonn, Germany (2011)
15. The Data Hub, http://thedatahub.org
16. The Association for Computing Machinery (ACM) Portal,
http://portal.acm.org/portal.cfm
17. The DBLP Computer Science Bibliography, http://dblp.uni-trier.de/
18. The Scientific Literature Digital Library and Search Engine,
http://citeseer.ist.psu.edu/
Automatic Synthesis of Notes Based on Carnatic Music
Raga Characteristics
1 Introduction
Shankarabharanam :
Arohana : S R2 G3 M1 P D2 N3 S
Avarohana : S N3 D2 P M1 G3 R2 S
Kalyani :
Arohana : S R2 G3 M2 P D2 N3 S
Avarohana: S N3 D2 P M2 G3 R2 S
2 Related Work
Hari V. Sahasrabuddhe [2] discusses generating automatic and computer-assisted notes that correspond to a particular Raga by constructing a finite state automaton for the Raga, with each state in the automaton being an alap (or phrase) of
the Raga. The paper claims that remembering the last two notes can produce
reasonable performance in a majority of ragas. The authors have further stated that
preserving information about elongation, grace notes, and/or successor selection
frequencies often enhances the quality of performance.
Dipanjan Das et al. [3] talk about generating the 'Aarohanas' and 'Avarohanas' present in each Raga conforming to Hindustani classical music, for the purpose of discovering and identifying new sequences. This idea implements a Finite State Machine (FSM), as proposed by H.V. Sahasrabuddhe [2], that generates a sequence of Swaras conforming to the rules of a particular Raga. Dipanjan Das et al. designed a probabilistic finite state machine that generates only the Avarohana and Aarohana of the said Raga. The FSM used in the paper can be viewed as a bigram model over the three most frequent notes, along with the probabilities with which they follow one another while moving in either direction, so as to generate the ascending and descending sequences of the particular Raga. The FSM for each Raga was constructed manually, as the idea deals with only the three most frequently occurring notes. Once the FSM created the ascending and descending notes, an algorithm is run on it to generate one instance or sample of the said Raga. This algorithm takes the number of notes and the start note of the Raga as input. The output of the entire process is a sequence conforming to a particular Raga, generated by constructing a sample of its Aarohana and Avarohana. The main drawback we have identified in this paper is that a sequence generated using a randomized approach, without any defined rules to monitor the occurrence and position of the notes, will result in a sequence that does not conform to the characteristics of the Raga.
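The bigram-style generation described above can be illustrated with a short sketch: the next Swara is drawn from a transition distribution conditioned on the current one. The transition probabilities below are invented for illustration only and do not come from the cited paper.

```python
# Toy sketch of bigram-based note generation: sample the next Swara from a transition
# distribution conditioned on the current Swara. The probabilities are made up.
import random

TRANSITIONS = {           # hypothetical bigram probabilities biased toward an ascending phrase
    "S":  [("R2", 0.6), ("G3", 0.4)],
    "R2": [("G3", 0.7), ("S", 0.3)],
    "G3": [("M1", 0.6), ("R2", 0.4)],
    "M1": [("P", 0.8), ("G3", 0.2)],
    "P":  [("D2", 0.7), ("M1", 0.3)],
    "D2": [("N3", 0.8), ("P", 0.2)],
    "N3": [("S", 0.9), ("D2", 0.1)],
}

def generate(start, length, rng=random.Random(0)):
    """Generate a note sequence by repeatedly sampling the bigram successor."""
    seq = [start]
    for _ in range(length - 1):
        notes, probs = zip(*TRANSITIONS[seq[-1]])
        seq.append(rng.choices(notes, weights=probs, k=1)[0])
    return seq

print(generate("S", 12))
```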
Subramanian, in his paper [4], talks about generating computer music from skeletal notations, along with the gamakkas that produce a smooth transition from one Swara to another. The proposed system uses user-defined gamakkam definition files for each Raga. The addition of gamakkas depends on a) the Raga, b) the context in which the note occurs and c) the duration of the note. Gamakkams can be modeled in different ways.
One method would be to analyze a large number of recordings sung by music experts
and extract common features for each note of the Raga. This requires a system that
can identify note boundaries and transcribe live music into notation format.
one Swara follows another. In this way, characteristic phrases of the chosen Raga (phrases that occur most frequently in the compositions belonging to that Raga) are assigned the maximum probability. The learning part involves computing the probability with which the Swaras occur together and the frequency of the individual Swaras.
Steps:
1. Input the training sequence and Raga
2. Learn the HMM model from the input training sequence and parameters
3. Sample from the trained HMM model
4. Generate output sequence of desired length
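A minimal sketch of these four steps is given below using the hmmlearn library over integer-encoded Swaras. The library choice, the number of hidden states and the toy training phrase are assumptions made for illustration; the paper itself only specifies the steps above.

```python
# Sketch of the four-step procedure: encode a training phrase, fit a discrete-output HMM,
# then sample a new sequence of the desired length. hmmlearn's CategoricalHMM is assumed
# here; the number of hidden states and the tiny training phrase are illustrative only.
import numpy as np
from hmmlearn import hmm

SWARAS = ["S", "R2", "G3", "M1", "P", "D2", "N3"]
INDEX = {s: i for i, s in enumerate(SWARAS)}

# Step 1: input training sequence (a toy phrase) and encode it as integers.
training = ["S", "R2", "G3", "M1", "P", "D2", "N3", "S",
            "N3", "D2", "P", "M1", "G3", "R2", "S"]
X = np.array([[INDEX[s]] for s in training])

# Step 2: learn the HMM model parameters from the training sequence.
model = hmm.CategoricalHMM(n_components=4, n_iter=100, random_state=0)
model.fit(X, lengths=[len(X)])

# Steps 3-4: sample from the trained model to generate an output sequence of desired length.
symbols, _states = model.sample(16)
print([SWARAS[i] for i in symbols.ravel()])
```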
The aim of the learning phase is to find a model M̂ in the model space M* such
that
● The proposed system does not handle the gamakkas of the generated
composition.
● Since the system proposed is a statistical model, the quality of the generated
output is directly proportional to the amount of corpus data fed during the
learning phase of the system to fit the model.
The aim of the experiment was to measure the goodness of the notes generated by the
generative model and to validate whether the notes correspond to the Raga given as
input.
Six people trained in Carnatic music took part in the experiment. A dataset
consisting of songs in .wav format was prepared. The participants were asked to rate
each musical composition in the dataset on a scale of 1 to 5, based on their perceived
similarity to the corresponding Raga to which the composition originally belonged.
Table 1 shows the mean and standard deviation (in brackets) of scores on each Raga
for both the methods.
As seen from the table, the HMM performs better than the bigram model. The HMM is able to learn complex and long-range associations in the composition and to capture hierarchy in the compositions, reproducing the characteristic phrases of the Raga. The bigram model is too simple and does not allow for the generation of complex melodies.
References
1. Sambamoorthy, P.: South Indian Music, vol. 4. Indian Music Publishing (1969)
2. Sahasrabuddhe, H.V.: Analysis and Synthesis of Hindustani Classical Music, University of
Poona (1992), http://www.it.iitb.ac.in/~hvs/paper_1992.html
3. Das, D., Choudhury, M.: Finite state models for generation of Hindustani classical music.
In: Proceedings of International Symposium on Frontiers of Research in Speech and Music
(2005)
4. Subramanian, M.: Generating Computer Music from Skeletal Notations for Carnatic Music
Compositions. In: Proceedings of the 2nd CompMusic Workshop (2012)
5. Hill, S.: Markov Melody Generator
6. Steinsaltz, D., Wessel, D.: In Progress, The Markov Melody Engine: Generating Random
Melodies With Two-Step Markov Chains. Technical Report, Department of Statistics,
University of California at Berkeley
7. Kohlschein, C.: An introduction to hidden Markov models: Probability and Randomization
in Computer Science. Aachen University (2006-2007)
8. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel,
M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D.,
Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: Machine learning in Python. The
Journal of Machine Learning Research 12, 2825–2830 (2011)
9. Oliphant, T.E.: Python for scientific computing. Computing in Science & Engineering 9(3),
10–20 (2007)
Smart Card Application for Attendance
Management System
1 Introduction
Smart cards are electronic cards which contain a processor and memory. Hence they can store data in memory and also perform computations on that data through the processor. A smart card can be either a contact or a contactless card, depending on how it communicates with the outside world. Contact smart cards communicate with the reader through a physical medium, while a contactless smart card has no physical contact and transfers data over radio waves through an RF interface.
In this paper, an application for attendance management is developed for managing the attendance records of a group of people belonging to an organization, either the students of a college or the employees of a company. To mark attendance, the user makes contact with the attendance reader and, after a valid card is found, inserts a finger for biometric authentication. Apart from authentication and attendance, these cards can also be used for multiple applications like library management (issue and return of books) and canteen access (e-cash application).
The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 discusses all the entities involved in the systematic working of this application. Section 4 describes the setup of the system, including smart card reader requirements and details, access points and other setup details. Section 5 gives details of the implementation of the system, and finally Section 6 concludes our work.
2 Related Work
According to [4], smart card deployment has increased pressure on vendors regarding security features, while improvements in computing power over the last decade have made it possible to support encrypt/decrypt algorithms with bigger footprints in smart card chips. Typical applications described in [4] include subscriber identification module (SIM) cards, micropayments, commuter cards, and identification cards.
Reference [6] gives a new software design that provides the opportunity for developing new multi-application smart cards; it also considers other techniques and software designs apart from the conventional cryptographic algorithms. A smart social contact management system is described in [7], which provides a better way of managing a large number of social contacts through a smart card.
The state of the art of unified read-write operations on smart cards with different data formats is given in [1]. Based on a framework design and the system organizational structure of "card type determination -> function calling -> data conversion -> unified read-write", [1] develops data reading, writing and reception management of smart cards with different data formats through a PC. A smart card is a portable medium which stores sensitive data. Information protection is possible with a personal identification number (PIN), or through fingerprint- or retina-based biometrics. Algorithms and data structures are developed in [2] to solve the security problem.
The file system of the smart card in [3] consists of three types of files, i.e. the master file, dedicated files and elementary files. The master file and dedicated files are known as catalog files, while elementary files are known as data files. The master file is the root file of the card; each card contains only one MF, and all other sub-files are descendants of the master file. All files (including the master file) that contain sub-files are considered dedicated files. If a file node is a leaf node with no child nodes, then the file is an elementary file. Elementary files are useful for applications that need to keep data in the elementary file. Classified according to data structure, reference [3] notes that elementary files include transparent binary files, linear fixed-length record files, circular record files and linear variable-length record files.
Leong Peng Chor and Tan Eng Chong describe in [8] the design of an access control method that allows access to an empty laboratory only to groups of authorized students whose size satisfies a prescribed threshold. The scheme employs a shared-secret scheme and symmetric cryptography together with smart card technology. The access right of a student is carefully encoded and stored on a smart card, and access records are kept on these smart cards in a distributed fashion with duplication.
Hoon Ko and Ronnie D. Caytiles discuss in [9] how smart cards evolved from very simple phone cards and business cards made with inferior equipment into complex high-technology security solutions that can now support a large number of applications. Smart card usage has grown rapidly over the last decade, as they are used in telecommunications (GSM), banking services and various other areas.
3 Entities of the System
• Central Authority: This authority generates the keys in the smart cards, so that the card issuer authority can use the cards.
• Card Issuer Authority: The card issuer authority inserts all the required related information in the card and issues it to the respective user.
• Application Creator Authority: This authority is privileged to create different applications within a card. In this work, the application creator authority creates an application for attendance management purposes.
• Verification Authority: The verification authority can read only certain data, after getting permission. It uses the read information for verification and cannot make changes to it.
• Card User: the person who has received a card from the card issuer authority; only this person is permitted to use that card.
All these authorities have different types of access to the user card based on cryptographic keys. They need to present their specific key, and after successful authentication and authorization they obtain privileges for accessing the user card.
4 Set up
We have taken smart cards from the Infineon SLE 77 series, which contain 136 Kbytes of solid flash, 6 Kbytes of RAM and a 16-bit security controller. These cards support symmetric as well as asymmetric cryptography.
In our implementation, we have used the Cogent Mini-Gate smart card reader shown in Fig. 1, which supports fingerprint authentication, can sense contactless smart cards, provides a PIN combination as an access method, and also has a data network interface, a keypad and a multi-color LCD (Fig. 2).
The smart card reader should be able to mark the attendance of a student within a second of the student making contact with the reader using the smart card. When the attendance is marked, it should display the name of the student and the time of contact and produce a buzzer sound. The reader should have a fingerprint sensor to capture the student's fingerprint, authenticate the student and provide access. It should be connected to the LAN (or a wireless network) and to power, and should be battery operated in case of power failure. The reader should be able to upload its data to the server at a defined frequency and should have sufficient memory to store the attendance of students when the network is down. It should also be able to work in offline mode, i.e., when there is no LAN connection it should still be able to mark and store attendance.
Access points where readers need to be installed are the classrooms, laboratories, hostel and canteen. The reader needs to be mounted at a suitable place at each access point. One reader is installed per access point, except at those points where the student count is higher.
The attendance management system should have the capability of writing, reading and printing individual student data on the card so that it can be issued to the student by the approving authority.
5 Implementation
The working of the entities of the system is graphically represented in Fig. 3. The central authority is privileged to generate the keys in all the blank smart cards. After generating the keys, it
forwards the cards to the card issuer authority. The card issuer authority creates files and inserts user-specific information in the card, and then checks whether the card is usable or not. If not, it sends the card to the application creator authority, which creates the necessary application on the card and returns it to the card issuer authority. If the card issuer authority finds the card usable for a particular user, it issues the card to that user. The user can then use this card with any accessible attendance reader.
Fig. 3. Working of the entities of the system
Fig. 4 represents how the connection takes place between the smart card and the attendance reader. The user makes contact between his card and the specified reader. After successful mutual verification, the reader establishes a connection with the card and then asks the user for security checks such as a PIN code or fingerprint verification. After successful security checks, the user can perform any accessible operation on the card. If the user sends a verification request to the verification authority, the verification authority will verify the user's card; it can only do so after submitting its key successfully.
Fig. 4. Smart Card and Reader connectivity and role of Verification Authority
The card issuer authority writes the data on the smart card using the attendance management application and issues the card to the user. This application provides an easy way to create different files on a blank smart card; it can also reset the card, which makes the card reusable. After creating files on the card, it can also write the different types of personal and public information mentioned above, and the needed information is then printed on the card. The application creator authority can create a number of application directory files within a card. A user can only use the smart card issued to him, because of the security checks in terms of PIN, key or biometric authentication. Only the card issuer authority can issue a duplicate card if a card has been lost, and if anyone wants to update or terminate his card, only the card issuer authority has permission to do this.
When a user wants to mark attendance, he needs to make contact with the attendance reader. After successful mutual authentication, the reader asks for the fingerprint minutia and the user inserts a finger for verification. If a wrong or unauthorized user is using the card, the reader will not record attendance; it marks attendance only if it recognizes a valid fingerprint minutia. The reader shows the date and time of attendance after marking it. When the attendance of a user has been marked successfully, the database on the reader is updated for that user. After a specific time interval, the reader sends the updated data through the Ethernet to the server (refer to Fig. 5).
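The reader-side flow described in this section can be summarized by the following illustrative sketch. Every class, method and check here is a hypothetical stand-in, not a real card or reader API; it only mirrors the prose: mutual authentication, fingerprint verification, local logging and periodic synchronization with the server.

```python
# Illustrative sketch of the reader-side attendance flow. All classes are hypothetical
# stubs standing in for the card channel, the fingerprint sensor and the server link.
import time

class StubCardChannel:                      # hypothetical card interface
    def mutual_authenticate(self, user_id): return True

class StubFingerprintSensor:                # hypothetical biometric sensor
    def verify(self, user_id): return True

class AttendanceReader:
    def __init__(self, card, sensor):
        self.card, self.sensor = card, sensor
        self.offline_log = []               # records buffered while the LAN is down

    def mark_attendance(self, user_id):
        if not self.card.mutual_authenticate(user_id):
            return None                     # invalid card: attendance not marked
        if not self.sensor.verify(user_id):
            return None                     # fingerprint mismatch: attendance not marked
        record = {"user": user_id, "time": time.time()}
        self.offline_log.append(record)     # reader would also show name/time and buzz
        return record

    def sync(self, upload):
        upload(self.offline_log)            # pushed to the server at a fixed interval
        self.offline_log.clear()

reader = AttendanceReader(StubCardChannel(), StubFingerprintSensor())
print(reader.mark_attendance("student-42"))
reader.sync(print)
```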
6 Conclusion
References
1. Wang, R.-D., Wang, W.: Design and implementation of the general read-write system of the
smart card. In: 3rd IEEE International Conference on Ubi-Media Computing (2010)
2. Reillo, R.S.: Securing information and operations in a smart card through biometrics. In:
Proceedings of IEEE 34th Annual International Carnahan Conference on Security Technol-
ogy (2000)
3. Yuqiang, C., Xuanzi, H., Jianlan, G., Liang, L.: Design and implementation of Smart Card
COS. In: International Conference on Computer Application and System Modeling (2010)
4. Chandramouli, R., Lee, P.: Infrastructure Standards for Smart ID Card Deployment. IEEE
Security & Privacy 5(2), 92–96 (2007)
5. Rankl, W., Effing, W.: Smart Card Handbook. John Wiley & Sons, Inc. (2002)
6. Selimis, G., Fournaris, A., Kostopoulos, G., Koufopavlou, O.: Software and Hardware Is-
sues in Smart Card Technology. IEEE Communications Surveys & Tutorials 11(3), 143–
152 (2009)
7. Guo, B., Zhang, D., Yang, D.: “Read” More from Business Cards: Toward a Smart Social
Contact Management System. In: IEEE/WIC/ACM International Conference on Web Intel-
ligence and Intelligent Agent Technology, pp. 384–387 (2011)
8. Chor, L.P., Chong, T.E.: Group accesses with smart card and threshold scheme. In: Pro-
ceedings of the IEEE Region 10 Conference, pp. 415–418 (1999)
9. Ko, H., Caytiles, R.D.: A Review of Smartcard Security Issues. Journal of Security Engi-
neering 8(3) (2011)
Performance Evaluation of GMM and SVM for
Recognition of Hierarchical Clustering Character
1 Introduction
Modeling handwriting and human-like behaviour is becoming increasingly important in various fields. Handwritten character recognition has been one of the most interesting and challenging research areas in the field of image processing and pattern recognition. It contributes enormously to the performance of automation processes and can improve the interaction between humans and computers. The scope of handwriting recognition extends to reading letters sent to companies or public offices, since there is a demand to sort, search, and automatically answer mail based on document content. Handwritten character recognition is comparatively difficult, as different people have different writing styles; so handwritten characters remain a subject of active research with a wide range of applications, such as the automatic recognition of handwritten information on documents like cheques, envelopes, forms and other manuscripts.
recognition accuracy. Vamvakas et al. [2] proposed a new feature extraction method based on recursive subdivisions of the foreground pixels, with an SVM used to classify the handwritten characters. Lei et al. [3] proposed directional strings of statistical and structural features, classified using a nearest-neighbour matching algorithm. Das et al. [4] and Pirlo and Impedovo [5] addressed handwritten character recognition based on static and dynamic zoning topologies, designing regular grids that partition the pattern images into regions of equal shape using Voronoi-based image zoning. The work in [6] proposed quad-tree longest-run features taken in the vertical, horizontal and diagonal directions, with an SVM used to classify and recognize the handwritten numerals. Pauplin and Jiang [7] proposed two approaches to improve handwritten character recognition: the first uses dynamic Bayesian network models observing raw pixel values, optimizing the selection and layout using evolutionary algorithms, and the second learns the structure of the models. Reza and Khan [8] developed a reliable grouping scheme for similar-looking characters in which the density of nodes is reduced by half from one level to the next, and classified the groups with an SVM. Bhowmik et al. [9] proposed the Daubechies wavelet transform with four coefficients for feature extraction; the features are then grouped based on similar shapes and an SVM classifier is used to classify the groups.
2.1 Dataset
A handwritten dataset is created for uppercase characters. The input images were saved in JPEG/PNG format for further preprocessing, as shown in Fig. 2(a).
2.2 Pre-processing
Pre-processing is a set of operations performed on the input document [10]. The various tasks involved for the document images are:
Binarization. The original RGB image is converted to grayscale and then the image is complemented to obtain an image as shown in Fig. 2(b) for further processing.
All segmented character images are used for feature extraction. In our approach, a character intensity vector (CIV) is used to extract features from the grayscale image based on blocks. The block under consideration is of size 100×100. This 100×100 region is divided into 5×5 = 25 subblocks (n), each of size r×c pixels, where r = c = 20. The values of r and c are fixed so that all subblocks are of equal size. Each subblock is a 20×20 pixel image with an intensity range of 0 to 255, as shown in Fig. 3. The mean pixel intensity of each subblock is computed as
$$ b_n = \frac{\sum_{x=1}^{r}\sum_{y=1}^{c} \operatorname{count}(p_{x,y} > 127)}{r \times c} \qquad (1) $$
where $b_n$ denotes subblock $n = 1, \ldots, 25$, $p_{x,y}$ is the intensity of pixel $(x, y)$, $r$ is the number of rows and $c$ the number of columns. There are 25 subblocks in total, and for each subblock the mean of the thresholded pixel intensities is computed, giving 25 features per block.
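A minimal sketch of this feature extraction, under the stated assumptions (100×100 block, 5×5 grid of 20×20 subblocks, threshold 127), is given below for illustration.

```python
# Sketch of the character intensity vector (CIV): split a 100x100 block into a 5x5 grid of
# 20x20 subblocks and, per equation (1), take the fraction of pixels brighter than 127.
import numpy as np

def civ_features(block, grid=5, threshold=127):
    """block: 2-D uint8 array of size 100x100; returns a 25-dimensional feature vector."""
    h, w = block.shape
    sh, sw = h // grid, w // grid            # subblock size (20x20 here)
    features = []
    for i in range(grid):
        for j in range(grid):
            sub = block[i * sh:(i + 1) * sh, j * sw:(j + 1) * sw]
            features.append(np.mean(sub > threshold))   # count(p > 127) / (r * c)
    return np.array(features)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_block = rng.integers(0, 256, size=(100, 100), dtype=np.uint8)
    print(civ_features(fake_block).shape)    # (25,)
```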
$$ p(x \mid \lambda) = \sum_{i=1}^{M} w_i\, b_i(x) \qquad (2) $$
with mean vector $\mu_i$ and covariance matrix $\Sigma_i$. The mixture weights satisfy the constraint $\sum_{i=1}^{M} w_i = 1$. The complete Gaussian mixture model is parameterized by the mean vectors, covariance matrices and mixture weights of all component densities [12]. These parameters are collectively represented by the notation $\lambda = \{w_i, \mu_i, \Sigma_i\}$ for $i = 1, 2, \ldots, M$. The GMM parameters are estimated by the Expectation-Maximization (EM) algorithm using training data from the handwritten dataset. The basic idea of the EM algorithm is, beginning with an initial model $\lambda$, to estimate a new model $\bar{\lambda}$ such that $p(X \mid \bar{\lambda}) \geq p(X \mid \lambda)$.
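As an illustration only, the sketch below fits one Gaussian mixture per character group with scikit-learn's EM-based GaussianMixture and classifies a test vector by maximum log-likelihood. The number of mixture components and the synthetic features are assumptions, not the authors' configuration.

```python
# Sketch: one GMM (lambda = weights, means, covariances) per character group, trained with
# EM, and classification of a test vector by the highest average log-likelihood.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical 25-dimensional CIV features for two character groups.
group_features = {
    "horizontal": rng.normal(0.2, 0.05, size=(200, 25)),
    "vertical":   rng.normal(0.6, 0.05, size=(200, 25)),
}

models = {name: GaussianMixture(n_components=4, covariance_type="diag",
                                random_state=0).fit(X)
          for name, X in group_features.items()}

x = rng.normal(0.6, 0.05, size=(1, 25))
print(max(models, key=lambda name: models[name].score(x)))   # best-scoring group
```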
$y_i \in \{+1, -1\}$ are the class labels. The aim of the SVM is to generate a model which predicts the target value for the testing set. In binary classification the hyperplane $w \cdot x + b = 0$, where $w \in \mathbb{R}^n$ and $b \in \mathbb{R}$, is used to separate the two classes in some space $Z$. The maximum margin is given by $M = \frac{2}{\lVert w \rVert}$. The minimization problem is solved using Lagrange multipliers $\alpha_i\ (i = 1, \ldots, m)$, where $w$ and $b$ are the optimal values obtained from Eq. (4).
$$ f(x) = \operatorname{sgn}\left( \sum_{i=1}^{m} \alpha_i y_i K(x_i, x) + b \right) \qquad (4) $$
The non-negative slack variables $\xi_i$ are used to maximize the margin and minimize the training error. The soft margin classifier is obtained by optimizing Eq. (5) and Eq. (6).
$$ \min_{w, b, \xi}\ \frac{1}{2} w^{T} w + C \sum_{i=1}^{l} \xi_i \qquad (5) $$
7 Experimental Results
The experiments are carried out on a Windows 8 system with an Intel Xeon X3430 processor at 2.40 GHz and 4 GB of RAM. The proposed method is evaluated using the handwritten dataset. The extracted CIV features are fed to GMM and to the LIBSVM [16] tool to develop a model for each group, and these models are used to recognize the characters.
The work uses the F-score as the combined measure of precision (P) and recall (R) for calculating accuracy, defined as $F_{\alpha} = \frac{2PR}{P+R}$, where $\alpha$ is the weighting factor and $\alpha = 1$ is used.
7.3 Classifier
First Level Classifier. Initially, 25-dimensional features are extracted using the character intensity vector. All the features are hierarchically grouped as shown in Table 1, and the clusters are trained and tested with GMM and with SVM using an RBF kernel. The recognition accuracies of the individual groups (horizontal, vertical, no center and center) for GMM and for SVM with an RBF kernel (C=500, γ=0.1) are shown in Fig. 4.
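For illustration, a minimal sketch of such an RBF-kernel SVM with the quoted parameters (C=500, γ=0.1) is shown below using scikit-learn's SVC, which wraps LIBSVM [16]; the synthetic features and labels are placeholders for the real CIV training data.

```python
# Sketch of the first-level group classifier: RBF-kernel SVM with C=500, gamma=0.1.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.random((400, 25))              # hypothetical 25-D CIV features
y_train = rng.integers(0, 4, size=400)       # 0..3: horizontal, vertical, no center, center

clf = SVC(kernel="rbf", C=500, gamma=0.1)
clf.fit(X_train, y_train)

X_test = rng.random((10, 25))
print(clf.predict(X_test))                   # predicted group for each test block
```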
Second Level Classifier. The first-level classifier constructs four models, M1 (horizontal), M2 (vertical), M3 (no center) and M4 (center), using GMM and SVM. In the second-level classifier, the testing samples are tested against the corresponding models to identify the individual character within the cluster. The recognition accuracies of GMM and of SVM with an RBF kernel for individual characters are shown in Fig. 5.
8 Conclusion
References
1. Choudhary, A., Rishi, R., Ahlawat, S.: Offline Handwritten Character Recognition
using Features Extracted from Binarization Technique. In: AASRI Conference on
Intelligent Systems and Control, pp. 306–312 (2013)
2. Vamvakas, G., Gatos, B., Perantonis, S.J.: Handwritten Character Recognition
Through Two-Stage Foreground Sub-Sampling. Pattern Recognition 43, 2807–2816
(2010)
3. Lei, L., Li-liang, Z., Jing-fei, S.: Handwritten Character Recognition via Direction
String and Nearest Neighbour Matching. The Journal of China Universities of Posts
and Telecommunications 19(2), 160–165 (2012)
4. Pirlo, G., Impedovo, D.: Adaptive Membership Functions for Handwritten Char-
acter Recognition by Voronoi Based Image Zoning. IEEE Transactions on Image
Processing 21(9), 3827–3837 (2012)
5. Pirlo, G., Impedovo, D.: Fuzzy Zoning Based Classification for Handwritten Char-
acters. IEEE Transactions on Fuzzy Systems 19(4), 780–785 (2011)
6. Das, N., Reddy, J.M., Sarkar, R., Basu, S., Kundu, M., Nasipuri, M., Basu, D.K.:
A Statistical Topological Feature Combination for Recognition of Handwritten
Numerals. Applied Soft Computing 12, 2486–2495 (2012)
7. Pauplin, O., Jiang, J.: DBN-Based Structural Learning and Optimisation for Auto-
mated Handwritten Character Recognition. Pattern Recognition Letters 33, 685–
692 (2012)
8. Reza, K.N., Khan, M.: Grouping of Handwritten Bangla Basic Characters, Numer-
als and Vowel Modifiers for Multilayer Classification. In: ICFHR 2012 Proceedings
of International Conference on Frontiers in Handwriting Recognition, pp. 325–330
(2012)
9. Bhowmik, T.K., Ghanty, P., Roy, A., Parui, S.K.: SVM based Hierarchical Archi-
tectures for Handwritten Bangala Character Recognition. International Journal on
Document Analysis and Recognition (IJDAR) 12, 97–108 (2009)
10. Bharathi, V.C., Geetha, M.K.: Segregated Handwritten Character Recognition us-
ing GLCM Features. International Journal of Computer Applications 84(2), 1–7
(2013)
11. Reynolds, D.A.: Gaussian Mixture Models. MIT Lincoln Laboratory, USA
12. Sarmah, K., Bhattacharjee, U.: GMM based Language Identification using MFCC
and SDC Features. International Journal of Computer Applications 85(5), 36–42
(2014)
13. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and
Other Kernel-based Learning Methods. Cambridge University Press (2000)
14. Mitchell, T.: Machine Learning. Hill Computer Science Series (1997)
15. Vapnik, V.: Statistical Learning Theory. Wiley, NY (1998)
16. Chang, C.-C., Lin, C.-J.: LIBSVM: A library for support vector machines. ACM
Transactions on Intelligent Systems and Technology 2, 1–27 (2011)
Data Clustering and Zonationof Earthquake
Building Damage Hazard Area Using FKCN
and Kriging Algorithm
1 Introduction
The USGS recorded four large earthquake events in Indonesia: the Banda earthquake (8.5 Mw) in 1983, the Sumatra earthquake (9.1) in 2004, the Nias Island earthquake (8.6) in 2005 and the Sumatra West Coast earthquake (8.6) in 2012 [21]. High earthquake intensity is the main tectonic characteristic of the Indonesian archipelago, located between three main plates: Eurasia to the north, Indo-Australia to the south and the Pacific plate to the northeast. An earthquake of a certain intensity and magnitude, as a response to the movements of these plates, can result in physical infrastructure damage and human casualties. The most massive physical infrastructure damage caused by earthquakes is the damage to buildings, caused by poor construction quality (internal) and the environment (external) where the building is built. The recorded figures of building damage due to
the earthquake at Banda Aceh city, Indonesia, in 2004 amount to 35 percent of the total existing buildings [9], and over 140,000 building units were damaged by the Yogyakarta, Indonesia earthquake in 2006 [17].
Currently, research related to building damage due to earthquakes has been carried out using geographical information systems (GIS) for hazard zoning of earthquake-prone areas [18], [20] and for the evaluation of building damage due to earthquakes using neuro-fuzzy methods [3], [4], [5], [13], [16] and [19]. The neuro-fuzzy algorithms implemented in these studies mostly use the adaptive neuro-fuzzy inference system (ANFIS); they do not use spatial data and are not used to construct hazard zones. An important value of zonation research is that it helps the researcher understand, in a spatial structure, the important phenomena involved in evaluating building damage due to earthquakes. Specifically, research on data clustering using the FKCN algorithm has been proposed by [1], [2], [22] and applied to several areas by [12] and [15].
Based on the above facts, this research is performed with the objective of constructing the zonation of building damage hazard areas due to earthquakes, using the FKCN algorithm to cluster the data and the kriging algorithm for data interpolation and construction of the zonation.
3 Methodology
The data used in this research are composed of two types: earthquake hazard data, containing peak ground acceleration (PGA), lithology and topographic zone data, and the IRIS Plants database. The lithology and topographic zoning maps are converted into data with class values based on the contribution of each to each level of building damage. This research consists of three steps: (1) data normalization, (2) data clustering using the FKCN algorithm and (3) data interpolation using the kriging algorithm and creation of zones (zonation).
Min-max normalization is a method that performs a linear transformation of the original data, mapping a value v of attribute A to v' within a new minimum and maximum range [7]. The formula for min-max normalization is given by the following equation:
$$ v' = \frac{v - \min_A}{\max_A - \min_A}\,(\text{new\_max}_A - \text{new\_min}_A) + \text{new\_min}_A \qquad (1) $$
where $\min_A$ and $\max_A$ are the minimum and maximum values of attribute A, and $\text{new\_min}_A$ and $\text{new\_max}_A$ are the minimum and maximum values of the new scale for attribute A, respectively.
Earthquake hazard data clustering is implemented using FKCN algorithm. The im-
plementation steps are as follows:
Step 1: Suppose a sample space X={x1,x2,…,xN}, xi∈R
f
Where N is the number of patterns and f is the pattern vector dimension; the dis-
tance is IIA, c as the number of clusters and error threshold > 0 some small posi-
tive constant.
Step 2: Initialization by determining the weight of initial vector V0 = (v10, v20,…vc0),
the number of initial cluster (m0) and tmax = iteration limit.
Step 3: For t=1,2,…,tmax
a) Calculate all learning rate
mt = m0 – t *∆m, ∆m = (m0 – 1)/tmax (2)
2 1
II , 1 II 1
, = (∑(II II
) ) (3)
, 1
αik,t= (uik,t)
mt
(4)
c) Calculate:
174 E. Irwansyah and S. Hartati
The next step is data interpolation using kriging algorithm and plotting the average
value of the result of clustering on each grid for zonation. Kriging is one method of
the prediction and interpolation in geo-statistics. Interpolation analysis is required
because it is impossible to take data from all the existing location. The interpolation
technique takes data from some location and resulted in a predicted value for the other
locations.
A sample data from location 1,2…,n is V(x1), V(x2), … , V(xn), therefore to predict
V(x0) is [6]:
n
Vˆ ( x0 ) = wiV ( xi ) (7)
i =1
Codification and cluster result that has been validated IRIS Plant database, and further
implemented for building damage hazard that consists of data obtained from peak
ground acceleration (PGA), lithology and topographic zone data.
Data Clustering and Zonationof Earthquake Building Damage Hazard Area 175
Clusterization process divides the research data into 3 (three) classes with different
characteristics. Using cluster center data that was further obtained the membership of
each data can be determined based on the distance of each data and the cluster center
for each class. The first class is a class with average PGA value 0.87813 (medium)
with lithology domination that has a high compaction and fine grained.
The topographic zone on the first class is mainly from the inner earth with an aver-
age PGA value 0.87779 (low) which is dominated by lithology that has not been
compacted therefore the low compaction, and located in low topographic zone close
to the coast and river side. The third class is a class with an average PGA 0.87935
(high) which is dominated by lithology that has not been compacted on topographic
zone swamp which is very close to the coast. The data characteristic on each class
resulted from the clusterization using FKCN algorithm can be seen in Table 2.
Fig. 1. Buildingdaamage hazard zonation in Banda Aceh (red: high building damage hazard
area; orange: mediumbuilding damage hazard area;yellow: low building damage hazard area)
of building damage. The second zone is the zone with the characteristics in between
those two zones. The zonation produced in this research is showing spatial patterns
where the coastal area is divided into two zones which are a high building damage
hazard zone and a medium building damage caused by earthquake. Low building
damage zone is located relatively toward land. (Fig.1).
5 Conclusion
The FKCN algorithm which is implemented in this research using the same parameter
successfully increased the correct number and correct rate and at the same time pro-
ducing cluster center that was produced in previous researches.
Clusterization produces three data classes of building damage hazard which is the
first class with an average PGA 0.87813 (medium) with a dominant lithology that has
been compacted until high compaction on topographic zones in the Inland Area. The
second class with an average PGA 0.87779 (low) with dominant lithology that has
been compacted until low compaction and it is in the low topographic zone near the
coast and river sided, and the third zone with an average PGA 0.87935 (high) which is
dominated by lithology that has not been compacted in swamp topographic zone
which is close to coast area.
Banda Aceh City and the surrounding area is divided into three zones (zonation)
which is high hazard zone, medium hazard zone caused by earthquake originating
from coastal area and low building damage hazard zone which is located relatively
further away towards the inland.
References
1. Almeida, C.W.D., Souza, R.M.C.R., Ana Lúcia, B.: IFKCN: Applying Fuzzy Kohonen
Clustering Network to Interval Data. In: WCCI 2012 IEEE World Congress on Computa-
tional Intelligence, pp. 1–6 (2012)
2. Bezdek, J.C., Tsao, E.C.-K., Pal, N.R.: Fuzzy Kohonen Clustering Networks. Fuzzy Sys-
tems 27(5), 757–764 (1992)
3. Carreño, M.L., Cardona, O.D., Barbat, A.H.: Computational Tool for Post-Earthquake
Evaluation of Damage in Buildings. J. Earthquake Spectra 26(1), 63–86 (2010)
4. Elenas, A., Vrochidou, E., Alvanitopoulos, P., Ioannis Andreadis, L.: Classification of
Seismic Damages in Buildings Using Fuzzy Logic Procedures. In: Papadrakakis, et al.
(eds.) Computational Methods in Stochastic Dynamic. Computational Methods in Applied
Sciences, vol. 26, pp. 335–344. Springer, Heidelberg (2013)
5. Fallahian, S., Seyedpoor, S.M.: A Two Stage Method for Structural Damage Identification
Using An Adaptive Neuro-Fuzzy Inference System and Particle Swarm Optimization.
Asian J. of Civil Engineering (Building and Housing) 11(6), 795–808 (2010)
6. Fisher, M.M., Getis, A. (eds.): Handbook of Applied Spatial Analysis –Software Tools,
Methods and Applications. Springer, Heidelberg (2010)
7. Han, J., Kamber, M., Pei, J.: Data Mining Concept and Techniques, 3rd edn. Morgan
Kaufmann-Elsevier, Amsterdam (2012)
178 E. Irwansyah and S. Hartati
8. Hathaway, R.J., Bezdek, J.C.: Nerf C-Means Non-Euclidean Relation Fuzzy Clustering.
Pattern Recognition 27(3), 429–437 (1994)
9. Irwansyah, E.: Building Damage Assessment Using Remote Sensing, Aerial Photograph
and GIS Data-Case Study in Banda Aceh after Sumatera Earthquake 2004. In: Proceeding
of Seminar on Intelligent Technology and Its Application (SITIA 2010), vol. 11(1), pp.
57–65 (2010)
10. Irwansyah, E., Hartati, S.: Zonasi Daerah BahayaKerusakanBangunanAkibatGempaMeng-
gunakanAlgoritma SOM Dan AlgoritmaKriging. In: Proceeding of Seminar Nasional Tek-
nologiInformasi (SNATI 2012), vol. 9(1), pp. 26–33 (2012) (in Bahasa)
11. Irwansyah, E., Winarko, E., Rasjid, Z.E., Bekti, R.D.: Earthquake Hazard Zonation Using
Peak Ground Acceleration (PGA) Approach. Journal of Physics: Conference Se-
ries 423(1), 1–9 (2013)
12. Jabbar, N., Ahson, S.I., Mehrotra, M.: Fuzzy Kohonen Clustering Network for Color Im-
age Segmentation. In: 2009 International Conference on Machine Learning and Compu-
ting, Australia, vol. 3, pp. 254–257 (2011)
13. Jiang, S.F., Zhang, C.M., Zhang, S.: Two-Stage Structural Damage Detection Using Fuzzy
Neural Networks and Data Fusion Techniques. Expert Systems with Application 38(1),
511–519 (2011)
14. Kohonen, T.: New Developments and Applications of Self-Organizing Map. In: Proceed-
ing of the 1996 International Workshop on Neural Networks for Identification, Control,
Robotics, and Signal/Image Processing (NICROSP 1996) (1996)
15. Lind, C.T., George Lee, C.S.: Neural Fuzzy System: A Neuro-Fuzzy Synergism to Intelli-
gent System. Prentice-Hall, London (1996)
16. Mittal, A., Sharma, S., Kanungo, D.P.: A Comparison of ANFIS and ANN for the Predic-
tion of Peak Ground Acceleration in Indian Himalayan Region. In: Deep, K., Nagar, A.,
Pant, M., Bansal, J.C. (eds.) Proceedings of the International Conf. on SocProS 2011.
AISC, vol. 131, pp. 485–495. Springer, Heidelberg (2012)
17. Miura, H., Wijeyewickrema, A.C., Inoue, S.: Evaluation of Tsunami Damage in the East-
ern Part of Sri Lanka Due To the 2004 Sumatra Earthquake Using High-Resolution Satel-
lite Images. In: Proceedings of 3rd International Workshop on Remote Sensing for Post-
Disaster Response, pp. 12–13 (2005)
18. Ponnusamy, J.: GIS based Earthquake Risk-Vulnerability Analysis and Post-quake Relief.
In: Proceedings of 13th Annual International Conference and Exhibition on Geospatial In-
formation Technology and Application (MapIndia), Gurgaon, India (2010)
19. Sanchez-Silva, M., Garcia, L.: Earthquake Damage Assessment Based on Fuzzy Logic and
Neural Networks. Earthquake Spectra 17(1), 89–112 (2001)
20. Slob, S., Hack, R., Scarpas, T., van Bemmelen, B., Duque, A.: A Methodology for Seismic
Microzonation Using GIS And SHAKE—A Case Study From Armenia, Colombia. In:
Proceedings of 9th Congress of the International Association for Engineering Geology and
the Environment: Engineering Geology for Developing Countries, pp. 2843–2852 (2002)
21. United States Geological Survey-USGS, http://earthquake.usgs.gov/
22. Yang, Y., Jia, Z., Chang, C., Qin, X., Li, T., Wang, H., Zhao, J.: An Efficient Fuzzy Ko-
honen Clustering Network Algorithm. In: Proceedings of 5th International Conference
on Fuzzy Systems and Knowledge Discovery, pp. 510–513. IEEE Press (2008)
Parametric Representation of Paragraphs
and Their Classification
1 Introduction
Automatic text classification is a very challenging computational problem to
the information retrieval and digital publishing communities, both in academic
as well as in industry. Currently, text classification problem has generated lot
of interests among researchers. There are several industrial problems that lead
to text classification or identification. One such problem that digital publish-
ing companies encounter is paragraph classification. A document have multiple
paragraphs of various sections such as title, abstract, introduction, conclusion,
acknowledgement, bibliography and many more. When a document is published
online or offline, different visualization style is applied on paragraphs based on
the section, it belongs to, and its contextual information.
In the era of digitalization, along with the printing format every document is
published in the web for easier access to the user over internet. All the paragraphs
in documents are not formatted with same style because of better visualization
and appearance. Moreover, the style applied on digital documents and on doc-
uments meant for printing are different. Sometimes, the style of a paragraph of
the same document, published by different publishers, is different. As a business
strategy, online publishers display some parts of the documents depending on
types of user. Based on the document, online publishers decide which parts of
the documents or which paragraphs are to be displayed for a user. The iden-
tification of the paragraphs will help to automatically apply the style once the
paragraphs are correctly identified. The automatic classification of paragraphs
M.K. Kundu et al. (eds.), Advanced Computing, Networking and Informatics - Volume 1, 179
Smart Innovation, Systems and Technologies 27,
DOI: 10.1007/978-3-319-07353-8_21, c Springer International Publishing Switzerland 2014
180 D. Bhandari and P.S. Ghosh
will reduce the manual interaction and effort. Along with the several business
benefits it will also help to build up the repository of references.
At present, the identification of paragraphs is carried out using rule based
heuristic algorithms. The heuristics are developed based on individuals’ expe-
rience and some trivial rules. These systems are document specific and require
significant amount of human interactions in their implementation.
There are very few articles, available in literature that focuses on the para-
graph classification. In [1], Crossley et. al. have tried to classify the paragraphs
taking into consideration the linguistic features such as textual cohesion and
lexical sophistication. They correlate these features and proposed a model us-
ing statistical techniques. They have also tried to demonstrate the importance
of the position of the paragraph in a document while classifying a paragraph.
Sporleder & Lapta have used BoosTexter as machine learning tool to identify
the paragraphs automatically in different languages and domains based on lan-
guage modeling features, syntactic features and positional features [2]. They
also investigated the relation between paragraph boundaries and discourse cues,
pronominalization and information structure on German data for paragraph seg-
mentation. Filippova et. al. proposed a machine-learning approach to identify
paragraph boundary utilizing linguistically motivated features [3]. Taboada et.
al. distinguished different type of paragraphs based on lexicon and semantic
analysis within the paragraph and used naive bayes classifier, SVM and linear
regression as machine learning tool [4].
The primary challenge in designing an automatic paragraph classifier is to
represent the paragraphs in such a way so that computer can easily process
them. At the same time, the representation should maintain the identity of the
paragraphs. These motivate us to propose a new framework that represents a
paragraph using its characteristics keeping in mind the objective of classification.
In this article, an attempt has been made to represent a paragraph using its
properties or features. Once the paragraphs are represented or characterized by
some parameters, one can always use a tool to design a supervised classifier. One
such tool being used here is multi layer perceptron (MLP) as it is a sophisticated
modeling technique used in different application areas viz., aerospace, medical
science, image processing, natural language processing and many more [5,6,7,8].
A brief discussion on paragraph classification is presented in Section 2. Sec-
tion 3 discusses the proposed representation technique of a paragraph using its
properties and MLP based classifier. Experiments and computer simulation re-
sults are presented in Section 4. The final section contains concluding remarks
and a short discussion on future scope of work.
2 Classification of Paragraph
3 Parameterizations of a Paragraph
In the rule based classification technique, various rules are generated based on
human experience and the paragraphs are categorized in several classes. Gener-
ally, the rules are perceived by human beings depending on several characteristics
of a paragraph. There are some obvious protocol that is maintained in arranging
the paragraphs in a document. The authors take into consideration the readers’
mind to arrange the paragraphs in preparing a document. People gain the expe-
rience of presenting a document by reading numerous documents and analyzing
the style of presentation. In general, the paragraphs are classified depending
on its content. However, the position of the paragraph also plays an important
role in deciding its class. Moreover, different publishing houses follow different
processes in arranging the paragraphs.
The objective of this work is to develop a methodology that would automat-
ically classify paragraphs emulating the process that human beings adapt by
reading numerous documents and analyzing readers’ perception. In this context,
the primary challenge is to present a paragraph to a machine. More specifically,
paragraphs have to be fed into the machine so that machine can learn the pat-
terns of the paragraphs in order to identify an unknown paragraph into a specific
class. In this work, the paragraphs are represented using number of properties or
features that would be utilized to categorize the paragraphs in different classes
correctly. One can extract three different types of features of a paragraph,viz.,
Positional, Visual and Syntactic properties
Positional properties: The position of a paragraph provides significant in-
formation that would be helpful in identifying its class in the document. One can
define number of positional properties of a paragraph depending on its position
in the document. Some of the properties, used in this work are paraCount(total
paragraph present in a document) and paraPos(position of the current para-
graph).
182 D. Bhandari and P.S. Ghosh
Visual properties: Visual properties signifies the look and feel of the para-
graph of a document. They are the textual information of a para and are defined
based on font, color, alignment, indentation, spacing and many more. This prop-
erties are very effective in discriminating a paragraph. Some of the features, in
this regard, are: font element such as font-size, bold, italic, underline; formatting
element like indentation, spaces.
Syntactic properties: Syntactic properties are the contextual information
of a paragraph. A simple set of properties are extracted from the string con-
tent, using simple text analysis techniques. Some of the syntactic properties are
phraseCount (number of total phrases present into the paragraph), sentence-
Count (number of sentences present into the paragraph), articleCount (number
of articles present into the paragraph).
Once the parameters are defined to characterize a paragraph, designing the
MLP based classifier is straight forward. The basic steps of the MLP-based
methodology are:
In this process, the properties or features are identified that characterize a para-
graph in order to build the required classifier. It requires sufficient knowledge
about the current process of paragraph identification. There are numerous prop-
erties or features can be found of a paragraph as discussed earlier. The pri-
mary task here is to identify the properties that would explicitly characterize
a paragraph. Initially, 50 properties were identified. In our experiment, only 27
properties are used to design the classifier. Few such parameters, with detail
description, are provided in Table 1.
The feature values obtained using XSLT are of numeric, boolean or text types.
For easier processing and to provide numeric inputs to the artificial neural net-
work (ANN), the values are quantified. The boolean values are mapped to 0
and 1. To quantify the text properties, we used some of the predefined look
up tables. As an example, an XSLT element style:text-underline-width, the
possible outcomes are auto, normal, bold, thin, dash, medium, or thick. In such
case, one can assign 1 for ’auto’, 2 for ’normal’, 3 for ’bold’ and so on.
(identically and independently distributed) with zero mean and unity standard
deviation.
dj − μ (dj )
dj = , ∀j (1)
σ (dj )
Often in large data sets, there exist samples (called outliers) that do not comply
with the general behavior of the data. In our paragraph identification problem,
outliers can be caused by multiple heterogeneous sources of articles received from
numerous authors and due to the error introduced during quantification. Out-
liers affect the learning process of the MLP and thereby reduces the classification
performance in terms of accuracy and convergence. It is customary, in any clas-
sification algorithm, to identify the outliers and eliminate them from the data
set. However, it is also to be noted that the number of eliminated data points
should be as minimum as possible. In this work, the z-score values are used to
identify the outliers. The standardized samples (dt ) into a compact range, nor-
mally between [α, -α] (where α is a small quantity) could minimize the effect
from random noise. In our experiment, it has been observed that considering α
value as 3, less than 3% of total data points are identified as outliers.
Back propagation technique is used here to train the MLP. Training is continued
until the Mean Square Error (MSE) is less than some predefined small quantity
() (in our experiment, is assumed as 0.001). The initial weights are selected
randomly. The learning rate (η) is considered to be varying from 0.1 to 0.001
(decreasing with iteration number) and the moment is set to 0.8.
4 Experimental Result
The experiment is carried out using Java on Microsoft platform (Windows 98)
and 2GB CPU RAM. In our experiment, 27 input features are considered for
the paragraphs belonging to 9 classes which are widely used and very difficult
to classify. The features were extracted from about 20,000 paragraphs of 1000
documents taken from various domains. The paragraphs belong to 30 different
classes including the trivial classes. 20% of randomly selected data points were
used for training the network and rest of the data points were considered for
testing. To validate the effectiveness of the methodology, the experiment was
performed 5 times by randomly selecting different training data sets. It has been
observed that each run produces consistently similar results. Table 3 shows the
performance of the rule based approach and the proposed framework using MLP.
The results clearly shows that the proposed parameter based approach is
superior than the conventional rule based approach. The proposed approach
performs consistently well for all types of paragraphs. The rule based approach
Parametric Representation of Paragraphs and Their Classification 185
provides better classification for classes like articleTitle, author, affiliation and
correspondingAuthor, due to their distinct visual and syntactic features. The
performance of the proposed approach may be improved by defining some ap-
propriate properties or features that would enhance the discrimination ability of
the MLP. The performance of the parameter based approach with and without
outliers is also been provided in Table 3. As expected, that removal of outliers
results in improved performance.
References
1. Crossley, S.A., Dempsey, K., McNamara, D.S.: Classifying paragraph types using
linguistic features: Is paragraph positioning important? Journal of Writing Re-
search 3(2), 119–143 (2011)
2. Sporleder, C.: Automatic paragraph identification: A study across languages and
domains. In: Proceedings of the Conference on Empirical Methods in Natural Lan-
guage Processing, pp. 72–79 (2004)
3. Filippova, K., Strube, M.: Using linguistically motivated features for paragraph
boundary identification. In: EMNLP, pp. 267–274 (2006)
4. Taboada, M., Brooke, J., Stede, M.: Genre-based paragraph classification for senti-
ment analysis. In: Proceedings of SIGDIAL 2009, pp. 62–70 (2009)
5. Lieske, S.P., Thoby-Brisson, M., Telgkamp, P., Ramirez, J.M.: Reconfiguration of
the neural network controlling multiple breathing patterns: eupnea, sighs and gasps.
Nature Neuroscience 3, 600–607 (2000)
6. Collobert, R., Wetson, J.: Fast semantic extraction using a novel neural network
architecture. In: Proceedings of the 45th Annual Meeting of the Association of Com-
putational Linguistics, pp. 560–567 (2007)
7. Sebastiani, F.: Machine learning in automated text categorization. ACM Computing
Surveys 34, 1–47 (2002)
8. Pomerleau, D.A.: Neural network simulation at warp speed: how we got 17 million
connections per second. IEEE 2, 143–150 (1988)
LDA Based Emotion Recognition from Lyrics
Abstract. Music is one way to express emotion. Music can be felt/heard either
using an instrument or as a song which is a combination of instrument and
lyrics. Emotion Recognition in a song can be done either using musical features
or lyrical features. But at times musical features may be misinterpreting, when
the music dominates the lyrics. So a system is proposed to recognize emotion of
the song using Latent Dirichlet Allocation (LDA) modelling technique. LDA is
a probabilistic, statistical approach to document modelling that discovers latent
semantic topics in large corpus. Since there is a chance of more than one
emotion occurring in a song, LDA is used to determine the probability of each
emotion in a given song. The sequences of N-gram words along with their
probabilities are used as features to construct the LDA. The system is evaluated
by conducting a manual survey and found to be 72% accurate.
1 Introduction
The recognition of emotion has become in a multi-disciplinary research area that has
received great interest. It plays an important role in the improvement of Music
information retrieval (MIR), content based searching, human machine interaction,
emotion detection and other music related application. The word emotion covers a
wide range of behaviours, feelings, and changes in the mental state [1].
Emotion/mood is a feeling that is private and subjective. There are eight basic
emotions such as happy, sadness, acceptance, disgust, anger, fear, surprise and
anticipation [2]. The recognition of emotion/mood in a song can be done either using
the lyrical features or musical features. But, sometimes the musical features like tone
may mislead the emotion of the song. Hence, in this work, the emotion of the song is
identified by considering only the lyrics. In this paper, Section 2 focuses on literature
survey that discusses on the various proposed methods to classify emotion. Section 3
on system design, Section 4 focuses on the implementation details along with the
parameters set and used with the justification details and finally Section 5 concludes
the paper and future extension for this current work.
2 Literature Survey
Hu, Xiao, Mert Bay, and J. Stephen Downie [3] used social tags on last.fm to derive a
set of three primitive mood categories. Social tags of single adjective words on a
M.K. Kundu et al. (eds.), Advanced Computing, Networking and Informatics - Volume 1, 187
Smart Innovation, Systems and Technologies 27,
DOI: 10.1007/978-3-319-07353-8_22, © Springer International Publishing Switzerland 2014
188 K. Dakshina and R. Sridhar
publicly available audio dataset, USPOP, were collected and 19 mood related terms
of the highest popularity were selected manually which was later reduced to three
latent mood categories using multi-dimensional scaling. This finally scaled mood set
was not used by others because three categories were seen as a domain
oversimplification.
Yang and Lee [4] performed early work on supplementing audio mood
classification with lyric text analysis. A lyric bag-of-words (BOW) approach was
combined with 182 psychological features proposed in the General Inquirer to
disambiguate categories which was found confusing by the audio-based classifiers
and the overall classification accuracy was improved by 2.1%. However, no reliable
conclusions were drawn as the dataset was too small.
Dang, Trung-Thanh, and Kiyoaki Shirai [5] constructed a system that
automatically classifies the mood of the songs based on lyrics and metadata, and
many methods were employed to train the classifier by the means of supervised
learning. The classifier was trained using the data gathered from a Live Journal blog
where song and mood were tagged appropriately for each entry of the blog. The
classifier was trained with the help of various machine learning algorithms like
Support Vector Machines, Naive Bayes and Graph-based methods. The accuracy of
mood classification methods was not good enough to apply for a real music search
engine system. There are two main reasons: mood is a subjective metadata; lyric is
short and contains many metaphors which only human can understand. Hence, the
authors plan in future to integrate audio information and musical features with lyrical
features for improving the accuracy and efficiency of the mood identification system.
Chen, Xu, Zhang and Luo [6] applied the machine learning approach to classify
song sentiment using lyric text, in which the VSM is adopted as text representation
method. The authors performed error analysis and have identified that the term based
VSM is ineffective in representing song lyric. Hence the authors concluded that a new
lyric representation model is crucial for song sentiment analysis.
Han, Byeong-jun, Seungmin Ho, Roger B. Dannenberg, and Eenjun Hwang [7]
designed music emotion recognition using Support Vector Regression. Several
machine learning classification algorithms such as SVM, SVR and GMM are
considered for evaluating the system that recognizes the music automatically. There
was a tremendous increase in accuracy using SVR when compared to GMM. Many
other features can be considered with other classification algorithms such as fuzzy
and kNN (k-Nearest Neighbor).
Yang, Dan, and Won-Sook Lee [8] proposed the novel approach of identifying
discrete music emotions from lyrics. Statistical text processing tools were applied to
the novel application area of identifying emotion in music using lyrics. A
psychological model of emotion was developed that covers 23 specific emotions. A
feature vector of 182 psychological features was constructed from the song lyrics
using a content analysis package. Classification decision trees were used to
understand the classification models. Results for each specific emotion were
promising.
Latent Dirichlet allocation (LDA) model has been proposed by David Blei [9].
This is an unsupervised model for topic modelling. LDA is a probabilistic, statistical
LDA Based Emotion Recognition from Lyrics 189
3 System Design
In this section, the system architecture and the steps involved in the design process are
considered.
The block diagram of the proposed work for recognizing the emotion is given in Fig.
1. The emotion recognition process is carried out by initially collecting the Tamil
lyric texts and pre-processing it. The frequency and probability for the pre-processed
Tamil lyric document is calculated and N-gram model is constructed. The constructed
N-gram model is used as features in LDA. The designed LDA model is tested with a
new Tamil lyric document.
Data Collection and Preprocessing. Collecting data is the first part of the training
phase of emotion recognition process. Sufficient data must be collected in both the
190 K. Dakshina and R. Sridhar
training phase and testing phase for effective measurements. The accuracy of the
recognition process largely depends on the training samples. The system must be
trained efficiently with large number of training samples. A large number of lyric
samples are collected for different emotions of the songs. A subset of the collected
samples is used for training. For testing purpose, any sample can be used. The pre-
processing and handling of the Tamil lyrics is initially performed to remove stop
words for recognising emotion. Emotion of the song is very sensitive, so unnecessary
words which do not contribute to the emotion recognition should be removed during
the pre-processing of the lyrics and the POS tagger is used for collecting relevant
parts of speech and avoids word disambiguation problems arising due to polysemy.
LDA Construction and Emotion Recognition. LDA strongly asserts that each word
has some semantic information, and documents having similar topics will use same
set of words. Latent topics are thus identified by finding groups of words in the
corpus that frequently occur together within documents. In this way, LDA depicts
each document as a random mixture containing several latent topics, where each and
every topic has its own specific distribution of words [14]. In this paper, techniques
and results are shown and it is suggested that LDA aids not only in the text domain,
but also in lyric domain.
Initially a training set is considered and supervised learning is given for the
training set. The LDA dirichlet parameters are derived from the training set and the
same is used for the test set based on which the mood/emotion has to be determined.
Distribution Over Emotion and Features. Fig. 2 shows the influence of the dirichlet
parameters of LDA over emotion recognition. If a feature appears in a emotion, it is
indicated that the lyric document in which the feature is present, contains the emotion.
The LDA model is constructed using the alpha (α) and beta (β) parameters derived from the
training set. The calculation of alpha and beta are given below. Alpha (α) is the dirichlet prior
on the per-document topic distributions. It is given by
The N gram features are used in identifying the emotion of the Tamil song lyrics. The
probability of these features are calculated and appended in a document. The probability
found is used in the calculation of the LDA parameters stated in section 4.1.
4.3 Evaluation
Evaluation of the system carried out with the help of 5 randomly selected volunteer to
validate the data in the database. In manual evaluation volunteers were asked to read
the lyrics of the songs in a peaceful room and recognize emotion manually. Each
volunteer was asked to find emotion of as many songs as possible. Between each
songs, the listener had as much time as he/she wanted to rate the emotion.
Extracting information on emotions from song is difficult for many reasons,
because both music and lyric is a subjective quantity and also related to culture. So,
the feel of emotion to a particular song may differ from person to person. While
evaluation, same song evoked different emotion to different volunteer.
The parameter Emotion Recognition rate M, is used for evaluation of the proposed
system. It is given by
50 Correctly
recognised
0
Mood Recognition
identifies the mood/emotion for a given song. The different possible emotions for a
particular song with probability are given. Since a probability model is used, not
always the emotion with correct probability is recognized by the system. Hence, our
system identified the emotions correctly for more than 100 songs, upon a total of 160
songs and is given in Fig. 3 and Fig. 4.
A system where the emotion of the song is identified based on the lyrics of the Tamil
song is being proposed and implemented. The lyrical features considered in this work
is the probability of one word occurring after another. The emotion is identified by
constructing an LDA model using the probability of word sequence as features. The
use of LDA technique improves the accuracy of recognition process greatly and
identifies the various emotions that are present in a song with their respective
probability. However, although the proposed system recognizes emotion of a song
with good level of accuracy, consideration of more features will improve the accuracy
of the emotion recognition system.
In future, emotion can be used as features for designing a song Search system. This
search system could be designed to retrieve similar songs that match the input lyric’s
emotion.
References
1. http://en.wikiversity.org/wiki/Motivation_and_emotion/
Book/2013/Pet_ownership_and_emotion
2. http://www.translate.com/english/lyrics-in-singular-
form-lyric-are-a-set-of-words-that-make-up-a-song-usually-
consisting-of-verses/
194 K. Dakshina and R. Sridhar
3. Xiao, H., Bay, M., Downie, J.S.: Creating a simplified music mood classification
groundtruth set. In: Proceedings of the 8th International Conference on Music Information
Retrieval, pp. 309–310 (2007)
4. Yang, D., Lee, W.: Disambiguating music emotion using software agents. In: Proceedings
of the 5th International Conference on Music Information Retrieval, pp. 218–223 (2004)
5. Trung-Thanh, D., Shirai, K.: Machine learning approaches for mood classification of
songs toward music search engine. In: International Conference on Knowledge and
Systems Engineering, pp. 144–149 (2009)
6. Chen, R.H., Xu, Z.L., Zhang, Z.X., Luo, F.Z.: Content based music emotion analysis and
recognition. In: Proceedings of 2006 International Workshop on Computer Music and
Audio Technology, pp. 68–75 (2006)
7. Byeong-Jun, H., Ho, S., Dannenberg, R.B., Hwang, E.: SMERS: Music emotion
recognition using support vector regression. In: Proceedings of the 10th International
Society for Music Information Conference, pp. 651–656 (2009)
8. Yang, D., Lee, W.-S.: Music emotion identification from lyrics. In: 11th IEEE
International Symposium In Multimedia, pp. 624–629 (2009)
9. Blei, D.M., Ng, A.Y., Jordon, M.I.: Latent Dirichlet Allocation. Journal of Machine
Learning Research 3, 993–1022 (2003)
10. Qingqiang, W., Zhang, C., Deng, X., Jiang, C.: LDA-based model for topic evolution
mining on text. In: 6th International Conference in Computer Science & Education, pp.
946–949 (2011)
11. Wu, M., Dong, Z.C., Weiyao, L., Qiang, W.Q.: Text topic mining based on LDA and co-
occurrence theory. In: 7th International Conference on Computer Science & Education, pp.
525–528 (2012)
12. Sridhar, R., Geetha, T.V., Subramanian, M., Lavanya, B.M., Malinidevi, B.: Latent
Dirichlet Allocation Model for raga identification of carnatic music. Journal of Computer
Science, 1711–1716 (2011)
13. Arulheethayadharthani, S., Sridhar, R.: Latent Dirichlet Allocation Model for Recognizing
Emotion from Music. In: Kumar M., A., R., S., Kumar, T.V.S. (eds.) Proceedings of
ICAdC. AISC, vol. 174, pp. 475–481. Springer, Heidelberg (2013)
14. Hu, D., Saul, L.: Latent Dirichlet Allocation for text, images, and music. University of
California (2009)
Efficient Approach for Near Duplicate Document
Detection Using Textual and Conceptual
Based Techniques
Abstract. With the rapid development and usage of World Wide Web, there are
a huge number of duplicate web pages. To help the search engine for providing
results free from duplicates, detection and elimination of duplicates is required.
The proposed approach combines the strength of some "state of the art"
duplicate detection algorithms like Shingling and Simhash to efficiently detect
and eliminate near duplicate web pages while considering some important
factors like word order. In addition, it employs Latent Semantic Indexing (LSI)
to detect conceptually similar documents which are often not detected by
textual based duplicate detection techniques like Shingling and Simhash. The
approach utilizes hamming distance and cosine similarity (for textual and
conceptual duplicate detection respectively) between two documents as their
similarity measure. For performance measurement, the F-measure of the
proposed approach is compared with the traditional Simhash technique.
Experimental results show that our approach can outperform the traditional
Simhash.
1 Introduction
The near duplicates not only appear in web search but also in other contexts, such as
news articles. The presence of near duplicates has a negative impact on both
efficiency and effectiveness of search engines. Efficiency is adversely affected
because they increase the space needed to store indexes, ultimately slowing down the
access time. Effectiveness is hindered due to the retrieval of redundant documents.
For designing robust and efficient information retrieval system, it is necessary to
identify and eliminate duplicates.
Two documents are said to be near duplicates if they are highly similar to each
other [3]. Here the notion of syntactic similarity and semantic similarity between two
documents has to be carefully considered. A set of syntactically similar documents
may not necessarily give positive result when tested for semantic similarity and vice
versa. So two independent strategies have to be employed to detect and eliminate
syntactically and semantically similar documents. Most of the traditional duplicate
M.K. Kundu et al. (eds.), Advanced Computing, Networking and Informatics - Volume 1, 195
Smart Innovation, Systems and Technologies 27,
DOI: 10.1007/978-3-319-07353-8_23, © Springer International Publishing Switzerland 2014
196 R.K. Roul, S. Mittal, and P. Joshi
detection algorithms [1],[2] have considered only the aspect of syntactic similarity.
Here, apart from performing textual near duplicate detection, an approach to detect
and eliminate semantically (conceptually) similar documents has been proposed.
Another aspect of duplicate detection algorithms is the precision-recall trade off.
Often an algorithm based on mere presence or absence of tokens in documents to be
compared performs low on precision but yields a high recall considering only
textually similar documents. For example if shingling [1] is implemented with one as
the shingle size, it yields high recall but low precision. On the other hand, when
techniques based on co-occurrence of words in document are used, taking into
account even the order of words, its precision increases by a reasonable amount but its
recall decreases. Thus, it is essential to give a right amount of importance to co-
occurrence of words in documents but at the same time taking care of the recall. In the
proposed approach, F-measure [9] has been used as a performance measure to give
optimum precision-recall combination.
The outline of the paper is as follows: Section 2reviews previous research work on
detection of duplicate documents.Section 3 proposesthe approach to effectively detect
near duplicate documents. Section 4demonstrates the experimental results which are
followed by conclusion and future work covered in Section 5.
2 Related Work
One of the major difficulties for developing approach to detect near duplicate
documents was the representation of documents. Broder [1] proposed an elegant
solution to this by representing a document as a set of Shingles. The notion of
similarity between two documents as the ratio of number of unique Shingles common
to both the documents to the total number of unique Shingles in both the documents
was defined as resemblance. The approach was brute force in nature and thus not
practical. Charikar [2] proposed another approach which required very low storage as
compared to the Shingling. The documentswere represented as a short string of
binaries (usually 64 bits) which is called fingerprint.Documents are then compared
using this fingerprint which is calculated using hash values. Both of the above
approaches became the "state of the art” approaches for detection of near duplicate
documents. Henzinger [3] found that none of these approaches worked well for
detecting near duplicates on the same site. Thus, they proposed a combined approach
which had a high precision and recall and also overcame the shortcomings of
Shingling and Simhash. Bingfeng Pi et al. [6] worked on impact of short size on
duplicate detection and devised an approach using Simhash to detect near duplicates
in a corpus of over 500 thousand short messages. Sun, Qin et al. [5] proposed a model
for near duplicate detection which took query time into consideration. They proposed
a document signature selection algorithm and claimed that it outperformed
Winnowing – which is one of the widely used signature selection algorithm.Zhang et
al. [7] proposed a novel approach to detect near duplicate web pages which was to
work with a web crawler. It used semi-structured contents of the web pages to crawl
and detect near duplicates. Figuerola et al. [8] suggested the use of fuzzy hashing to
generate the fingerprints of the web documents. These fingerprints were then used to
estimate the similarity between two documents. Manku et al. [4] demonstrated
practicality of using Simhash to identify near duplicates in a corpus containing multi-
billion documents. They also showed that fingerprints with length 64 bits are
Efficient Approach for Near Duplicate Document Detection 197
appropriate to detect near duplicates. Martin Theobald et al. [10] took the idea of
Shingling forward to include the stopwords to form short chains of adjacent content
terms to create robust document signatures of the web pages.
In this paper, Shingling along with Simhash is used to detect textually similar
documents. LSI [13] has also been used for detecting conceptually similar documents.
Gensim [11], a Python toolkit has been used for experimental purpose and, it has been
found out that the F-measure of the proposed approach is better than the traditional
Simhash technique.
3 Proposed Approach
Input: A preprocessed set of documents <D1, D2, D3,…,Dn>, where n is the total
number of documents in the corpus.
Output: A set of documents <D1, D2, D3,…, Dm> where, m<=n, which are free of
near duplicates.
The proposed approach is described in Fig. 1.
1. Stop words are removed from the documents by using preProcess()method.
2. TF-IDF [12] is calculated for each token of each document using getTFIDF()
method and stored in Tfidf array. TF-IDF weights are normalized by the length of the
document so that length of the document does not have any impact on the process.
3. Each document is then represented by a set of shingles of size three(experimentally
determined). These shingles are stored in Shingles array.
4. Each shingle hashed value is stored in HashValues array. Hashing the shingle itself
(and not the token) takes into account effect of word order and not mere presence or
absence of words as in case of traditional Simhash. Thus two documents, with same
set of words used in different contexts, will not be considered near duplicates unless
their word order matches to a large extent. If only single words were selected as
features, two documents with largely same words, in different contexts and with
different meanings are considered duplicates.
5. A 'textual fingerprint' (64 bit) textFing is generated for every document.
6. 'Conceptual fingerprints' lsiFing are computed for each document in the corpus.
This is done using SVD (Singular Value Decomposition) matrices.
7. The list of textual fingerprints is sorted so that similar fingerprints (fingerprints
which differ from each other in very less number of bits) come closer to each other.
8. The thresholds used for filtering the near duplicates are as follows:
i. th1–This is a threshold that must be exceeded by textual similarity of a pair
of documents to be considered for further detection. Here the idea that a pair
of conceptually similar documents will also invariably exhibit some amount
of textual similarity as well is used by keeping the threshold reasonably low.
th2-This is the textual similarity threshold. If the textual similarity of the pair
of documents exceeds th2, they are considered as textual near duplicates.
This is set to a quite higher value as compared to th1 based on experimental
results.
198 R.K. Roul, S. Mittal, and P. Joshi
else
textFing[i][j]-=Tfidf[i][j]
for i in range(0, n)//converting to binary
for j in range(0, 64)
iftextFing[i][j] > 0
textFing[i][j] = 1
else
textFing[i][j] = 0
4. for each di in <D1, D2,…,Dn>
termDocMat = getTermDocMat(di)
U, s, V = getSVD(termDocMat)
lsiFing = s*(transpose(V))//lsi fingerprint
5. for i in range(0, 64)
//rotate all textual fingerprints by 1 bit
rotate(textFing)
sort(textFing)
for j in range(0, n-1)
textSim = 64 - HamDist(textFing[j],textFing[j+1])
concSim = cosine(jth doc, j+1th doc)
if(textSim>th1)
if(textSim<th2)
//cosine similarity calculated using lsiFing
if(concSim>th3)
//add edge to Conceptual Graph
grConc.add(jth doc, j+1th doc)
elif(textSim>th2 &&textSim<th3)
grText.add(jth doc, j+1th doc)
else
grExact.add(jth doc, j+1th doc)
6. removeDups(grText, grConc, grExact)
Fig. 1. The proposed approach
Efficient Approach for Near Duplicate Document Detection 199
ii. th3 - This threshold is used to detect exact duplicates. Thus it is set to a very
high value.
iii. th4 – This is the conceptual similarity threshold. If the cosine similarity [9]
of two documents being checked for conceptual similarity exceeds th4, they
are considered conceptual near duplicates.
9. Each pair of adjacent documents in the sorted list is checked for near duplicates as
follows:
i. If the documents under consideration are textual near duplicates ((64 -
hamming distance) exceeds th2), or exact duplicates ((64 - hamming
distance) exceeds th3), they are added to graph of Textual and Exact
duplicates respectively.
ii. If none of the conditions mentioned above are satisfied and if the (64 -
hamming distance) of documents under consideration exceed th1, the
documents are checked for conceptual similarity. If their cosine similarity
exceeds a threshold, th4, they are added to the graph of conceptual
duplicates.
10. Step (7) and (8) are repeated 64 times, each time rotating each fingerprint by
one bit so as to cover all the possible near duplicate pairs. Here, the approach takes
advantage of the property of the fingerprints that similar documents differ by very low
number of bits in their respective fingerprints. Thus, every document need not be
checked with every other document in the corpus for textual similarity.
11. Select a document from each set of duplicate documents (unique document,
kept in the corpus) and remove others from input list. Remove Dups method returns
the set of documents which are free of near duplicates.From each set of near duplicate
documents, it keeps the longest document with a view to lose minimum information.
Here, other techniques can be applied like computing each document's relevance to
the query and outputting the one with the highest value.
4 Experimental Results
reasonable value. As th1 is the minimum similarity threshold which two documents
must pass to be considered for duplicate detection, it can be set to an intuitively low
value. Similar argument holds true forth3 and th4. So, only the Shingle size and th2
need to be determined experimentally. Fig. 2 described the precision, recall and F-
measure with respect to Shingle size (where threshold is set to optimum). As the
graph clearly shows, the Shingle size = 3 gives the best results with highest F-measure
of 0.88. When Shingle size is very low, the algorithm tends to neglect the effect of
word order while testing two documents for duplicates. On the other hand, when
Shingle size is very high, algorithm can only find duplicates which are exact copies of
each other and lacks ability to find near duplicates. A balance between these two
extremes is obtained using a sequence of tokens of length 3 each as suggested by the
results. Fig. 3 shows the graphs of precision, recall and F-measure with respect the
value of th2. Here, it can be seen that as the value of th2 increases, precision increases
and recall decreases. When th2 value is increased, two documents are considered to
be near duplicates only when they have a large chunk of text in common and thus this
does not cover all near duplicate documents, leading to low recall. Whereas, when th2
value is very low, almost every document is considered as a near duplicate covering
all the actual near duplicate documents but performing poorly on precision. F-
measure is used to determine the optimum th2 value which turns out to be 46.
1.2
0.8 F-measure
Values
0.6 Precision
0.4 Recall
0.2
0
1 2 3 4
Shingle Size
1.2
0.8
Values
F-measure
0.6
Precision
0.4
Recall
0.2
0
40 42 44 46 48 50
Threshold
The results are shown inn Fig. 4. Simhash could only identify 7 documents as nnear
duplicates whereas the proposed
p approach could identify 40 near dupliccate
documents. Simhash perforrms excellently on precision but poorly for recall. Thiis is
because, Simhash, when comparing two documents, totally relies on syntaactic
occurrences of words in thee documents and does not take into account the contexxt of
occurrence of words. The use of Shingling along with Simhash and LSI, in the
proposed approach, takes into account not only the syntactic occurrence of words in
the documents, but also thee context or the co-occurrence of terms. This leads to not
only high precision but high
h recall as well, because both – syntactic and semantic nnear
duplicates are identified.
40
No of near duplicates
35 Total duplicates
30 claimed
25
True Positive
20
15
10 False Positive
5
0
P
Proposed
Approach
A Simhash
Fig. 4. Comparison of Proposeed Approach and Simhash based on the No. of duplicates deteccted
In this paper, the proposedd approach uses a combination of Shingling and Simhhash
techniques to detect textuall near duplicates. For detecting conceptual near duplicaates,
LSI is used to generate fingerprints
fi of the documents. Optimal Shingle size and
202 R.K. Roul, S. Mittal, and P. Joshi
References
1. Broder, A.Z.: Identifying and filtering near-duplicate documents. In: Giancarlo, R.,
Sankoff, D. (eds.) CPM 2000. LNCS, vol. 1848, pp. 1–10. Springer, Heidelberg (2000)
2. Charikar, M.S.: Similarity estimation techniques from rounding algorithms. In: STOC
2002: Proceedings of the 34th Annual ACM Symposium on Theory of Computing, pp.
380–388. ACM, New York (2002)
3. Henzinger, M.: Finding near-duplicate web pages: a large-scale evaluation of algorithms.
In: SIGIR 2006: Proceedings of the 29th Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval, pp. 284–291. ACM, New York
(2006)
4. Manku, G.S., Jain, A., Sharma, A.D.: Detecting Near-duplicates for web crawling. In:
WWW / Track: Data Mining (2007)
5. Sun, Y., Qin, J., Wang, W.: Near Duplicate Text Detection Using Frequency-Biased
Signatures. In: Lin, X., Manolopoulos, Y., Srivastava, D., Huang, G. (eds.) WISE 2013,
Part I. LNCS, vol. 8180, pp. 277–291. Springer, Heidelberg (2013)
6. Pi, B., Fu, S., Wang, W., Han, S.: SimHash-based Effective and Efficient Detecting of
Near-Duplicate Short Messages. In: Proceedings of the 2nd Symposium International
Computer Science and Computational Technology
7. Zhang, Y.H., Zhang, F.: Research on New Algorithm of Topic-Oriented Crawler and
Duplicated Web Pages Detection. In: Intelligent Computing Theories and Applications 8th
International Conference, ICIC, Huangshan, China, pp. 25–29 (2012)
8. Figuerola, C.G., Díaz, R.G., Berrocal, J.L.A., Rodríguez, A.F.Z.: Web Document
Duplicate Detection using Fuzzy Hashing. In: Trends in Practical Applications of Agents
and Multiagent Systems, 9th International Conference on Practical Applications of Agents
and Multiagent Systems, vol. 90, pp. 117–125 (2011)
9. Tan, P.N., Kumar, V., Steinbach, M.: Introduction to Data Mining. Pearson
10. Theobald, M., Siddharth, J., Paepcke, A.: SpotSigs: Robust and Efficient Near Duplicate
Detection. In: Large Web Collections in (SIGIR 2008), pp. 20–24 (2008)
11. Rehurek, R., Sojka, P.: Software Framework for Topic Modeling with Large Corpora. In:
Proceedings of LREC workshop New Challenges for NLP Frameworks, pp. 46–50.
University of Malta, Valleta (2010)
Efficient Approach for Near Duplicate Document Detection 203
12. Robertson, S.: Understanding Inverse Document Frequency: On theoretical arguments for
IDF. Journal of Documentation 60(5), 503–520
13. Golub, G.H., Reinsch, C.: Singular value decomposition and least square solutions.
Numerische Mathematik 10. IV 5(14), 403–420 (1970)
14. Celikik, M., Bast, H.: Fast error-tolerant search on very large texts. In: SAC 2009
Proceedings of the ACM Symposium on Applied Computing, pp. 1724–1731 (2009)
Decision Tree Techniques Applied on NSL-KDD
Data and Its Comparison with Various Feature
Selection Techniques
Abstract. Intrusion detection system (IDS) is one of the important research area
in field of information and network security to protect information or data from
unauthorized access. IDS is a classifier that can classify the data as normal or
attack. In this paper, we have focused on many existing feature selection tech-
niques to remove irrelevant features from NSL-KDD data set to develop a ro-
bust classifier that will be computationally efficient and effective. Four different
feature selection techniques :Info Gain, Correlation, Relief and Symmetrical
Uncertainty are combined with C4.5 decision tree technique to develop IDS .
Experimental works are carried out using WEKA open source data mining tool-
and obtained results show that C4.5 with Info Gain feature selection technique
has produced highest accuracy of 99.68% with 17 features, however result ob-
tain in case of Symmetrical Uncertainty with C4.5 is also promising with
99.64% accuracy in case of only 11 features . Results are better as compare to
the work already done in this area.
1 Introduction
The ever increasing size of data in computer has made information security more
important. Information security means protecting information and information sys-
tems from unauthorized access. Information security becomes more important as data
are being accessed in a network environment and transferred over an insecure me-
dium. Many authors have worked on this issue and applied feature selection technique
on NSL-KDD data set for multiclass problem. Mukharjee, S. et al.[9] have proposed
new feature reduction method: Feature Validity Based Reduction Method (FVBRM)
applied on one of the efficient classifier Naive Bayes and achieved 97.78%accuracy
on reduced NSL-KDD data set with 24 features. This technique gives better perfor-
mance as compare to Case Based Feature Selection (CFS), Gain Ratio (GR) and Info
Gain Ratio (IGR) to design IDS. Panda, M.et al.[10] have suggested hybrid technique
with combination of random forest, dichotomies and ensemble of balanced nested
dichotomies (END) model and achieved detection rate 99.50% and low false alarm
M.K. Kundu et al. (eds.), Advanced Computing, Networking and Informatics - Volume 1, 205
Smart Innovation, Systems and Technologies 27,
DOI: 10.1007/978-3-319-07353-8_24, © Springer International Publishing Switzerland 2014
206 H.S. Hota and A.K. Shrivas
rate of 0.1%, which is quite encouraging in comparison to all other models. Imran, H.M.
et al. [11] proposed a hybrid technique of the Linear Discriminant Analysis (LDA)
algorithm and a Genetic Algorithm (GA) for feature selection. The proposed feature
selection technique was applied to a radial basis function (RBF) network with the NSL-KDD
data set to develop a robust IDS. The different feature subsets applied to the RBF model
produced the highest accuracy of 99.3% in the case of 11 features. Bhavsar, Y.B. et al. [12]
discussed different support vector machine (SVM) kernel functions, such as the Gaussian
radial basis function (RBF) kernel, the polynomial kernel and the sigmoid kernel, to
develop an IDS. They compared the accuracy and computation time of the different kernel
functions as classifiers and suggested the Gaussian RBF kernel as the best kernel function,
achieving the highest accuracy of 98.57% with 10-fold cross validation. Amira, S.A.S. et al.
[8] proposed a Minkowski distance technique based on a genetic algorithm to develop an
IDS that detects anomalies. The proposed Minkowski distance technique applied to the
NSL-KDD data produces a higher detection rate of 82.13% in the case of a higher threshold
value and a smaller population. They also compared their results with the Euclidean
distance. In a recent work by Hota, H.S. et al. [15], a binary class based IDS using the
random forest technique combined with rank based feature selection was investigated. The
accuracy achieved with this technique was 99.76% with 15 features.
It is observed from the literature that feature selection is an important issue due to the
high dimensional feature space of IDS data. This research work explores several existing
feature selection techniques applied to the NSL-KDD data set to remove irrelevant features,
so that the IDS becomes computationally fast and efficient. Experimental work is carried
out using the WEKA (Waikato Environment for Knowledge Analysis) [13] and Tanagra [14]
open source data mining tools, and the obtained results reveal that C4.5 with Info Gain
feature selection and C4.5 with Symmetrical Uncertainty feature selection produce the
highest accuracy with 17 and 11 features respectively, which is the highest among the
research outcomes reviewed so far.
Among the above decision tree techniques, C4.5 is more powerful and produces better
results for many classification problems. C4.5 [1] is an extension of ID3 that accounts for
unavailable values, continuous attribute value ranges, pruning of decision trees and rule
derivation. In building a decision tree, training records with unknown attribute values can
be handled by evaluating the gain, or the gain ratio, only over the records for which the
attribute value is available. Records with unknown attribute values can then be classified
by estimating the probability of the various possible results. Unlike CART, which generates
a binary decision tree, C4.5 produces trees with a variable number of branches per node.
When a discrete variable is chosen as the splitting attribute in C4.5, there is one branch
for each value of the attribute.
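As a rough illustration of the Info Gain ranking plus decision tree pipeline used in this
work, the sketch below ranks features with a mutual-information score and trains an
entropy-based tree. It is an assumption-laden stand-in rather than the authors' WEKA
workflow: scikit-learn's mutual_info_classif only approximates Info Gain, its
DecisionTreeClassifier is CART-like rather than a true C4.5, and the feature matrix X and
labels y are assumed to be numeric arrays built from NSL-KDD.

import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.tree import DecisionTreeClassifier

def rank_and_train(X, y, n_features=17):
    # Score each feature against the class label (an Info Gain analogue).
    scores = mutual_info_classif(X, y, random_state=0)
    top = np.argsort(scores)[::-1][:n_features]      # indices of the best-ranked features
    # The entropy criterion mimics the information-gain splitting used by C4.5.
    clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
    clf.fit(X[:, top], y)
    return clf, top

Choosing n_features=17 mirrors the best Info Gain subset reported later in Table 4.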
Table 1. Different attacks and Normal data along with sample size

Class    Samples
Normal   13449
DoS      9234
R2L      209
U2R      11
Probe    2289
Total    25192
From the above table it is clear that the data set is highly unbalanced; there is no
uniform distribution of samples. The number of samples of the DoS attack type is high
(9234 samples), while the U2R attack type has only 11 samples. This unbalanced
distribution of samples may create problems during the training of any data mining based
classification model.

In order to verify the efficiency and accuracy of the IDS model, the data set is divided
into various partitions of training and testing samples.
3 Experimental Work
The experimental work is carried out using the WEKA [13] and TANAGRA [14] open
source data mining software. The entire experimental work is divided into two parts:
first, building a multiclass classifier and, second, applying feature selection techniques.
A multiclass classifier for the IDS is developed based on various decision tree techniques,
and finally the best model (C4.5) with a reduced feature subset is selected.

As explained, the NSL-KDD data set is divided into three different partitions, 60-40%,
80-20% and 90-10%, as training-testing samples. The decision tree based models are first
trained and then tested on the various partitions of the data set. The accuracy of the
different models on the different partitions is shown in Table 2. Accuracy varies from one
partition to another; the highest accuracy of 99.56% was achieved by C4.5 on the 90-10%
partition with all available features.
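The partition-wise evaluation described above can be sketched as follows. The split ratios
mirror the 60-40%, 80-20% and 90-10% partitions, while the stratified splitting, the helper
name and the classifier factory are illustrative assumptions rather than the WEKA/TANAGRA
procedure actually used.

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def evaluate_partitions(X, y, clf_factory, test_sizes=(0.4, 0.2, 0.1)):
    results = {}
    for test_size in test_sizes:
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=0)
        clf = clf_factory()                      # e.g. a fresh decision tree per partition
        clf.fit(X_tr, y_tr)
        pct_test = int(round(test_size * 100))
        results[f"{100 - pct_test}-{pct_test}%"] = accuracy_score(y_te, clf.predict(X_te))
    return results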
Both models are competitive and can be accepted for the development of an IDS. Fig. 1
shows a pictorial view of the accuracy obtained with the different feature selection
techniques combined with the C4.5 model.
4 Conclusion
Information security is a crucial issue due to the huge amount of data and information
exchanged in day to day life. An intrusion detection system (IDS) is a way to secure our
host computers as well as networked computers from intruders. An efficient and
computationally fast IDS can be developed using decision tree based techniques. On the
other hand, irrelevant features available in the data set must be removed. In this research
work, an attempt has been made to explore various decision tree techniques with several
existing feature selection techniques. Four different feature selection techniques are tested
with CART, ID3, REP, Decision Table and C4.5. Two rank based feature selection
techniques, Info Gain and Symmetrical Uncertainty, combined with C4.5 produce 99.68%
and 99.64% accuracy with 17 and 11 features respectively. The results obtained in these
two cases are quite satisfactory compared to other research work already done in this field.
In future, a feature selection technique based on statistical measures will be developed
and tested on more than one benchmark data set related to intrusion; the proposed feature
selection technique will also be compared with all the other existing feature selection
techniques.
Table 4. Selected features in case of C4.5 model using various feature selection techniques

Feature Selection Technique    No. of features  Accuracy (%)  Selected Features
Info Gain-C4.5                 17               99.68         {5,3,6,4,30,29,33,34,35,38,12,39,25,23,26,37,32}
Correlation-C4.5               37               99.56         {29,39,38,25,26,33,34,4,12,23,32,3,35,40,27,41,28,36,31,37,2,30,8,1,22,19,10,24,14,15,6,17,16,13,18,11,5}
ReliefF-C4.5                   37               99.56         {3,29,4,36,32,38,12,33,2,34,39,23,26,35,40,30,31,8,24,25,37,27,41,28,10,22,1,6,14,11,13,15,19,18,16,5,17}
Symmetrical Uncertainty-C4.5   11               99.64         {4,30,5,25,26,39,38,6,29,12,3}
References
1. Pujari, A.K.: Data mining techniques, 4th edn. Universities Press (India), Private Limited
(2001)
2. Tang, Z.H., MacLennan, J.: Data mining with SQL Server 2005. Wiley Publishing, Inc.,
USA (2005)
3. Web sources, http://www.iscx.info/NSL-KDD/ (last accessed on October 2013)
4. Han, J., Kamber, M.: Data Mining Concepts and Techniques, 2nd edn. Morgan Kaufmann,
San Francisco (2006)
5. Cios, K.J., Pedrycz, W., Swiniarski, R.W.: Data mining methods for knowledge discovery,
3rd edn. Kluwer Academic Publishers (2000)
6. Zewdie, M.: Optimal feature selection for Network Intrusion Detection System: A data
mining approach. Thesis, Master of Science in Information Science, Addis Ababa Univer-
sity, Ethiopia (2011)
7. Parimala, R., Nallaswamy, R.: A study of a spam E-mail classification using feature selec-
tion package. Global Journals Inc. (USA) 11, 44–54 (2011)
8. Aziz, A.S.A., Salama, M.A., Hassanien, A., Hanafi, S.E.-O.: Artificial Immune System In-
spired Intrusion Detection System Using Genetic Algorithm. Informatica 36, 347–357
(2012)
9. Mukherjee, S., Sharma, N.: Intrusion detection using Bayes classifier with feature reduc-
tion. Procedia Technology 4, 119–128 (2012)
10. Panda, M., Abraham, A., Patra, M.R.: A hybrid intelligent approach for network intru-
sion detection. Procedia Engineering 30, 1–9 (2012)
11. Imran, H.M., Abdullah, A.B., Hussain, M., Palaniappan, S., Ahmad, I.: Intrusion Detection
based on Optimum Features Subset and Efficient Dataset Selection. International Journal
of Engineering and Innovative Technology (IJEIT) 2, 265–270 (2012)
12. Bhavsar, Y.B., Waghmare, K.C.: Intrusion Detection System Using Data Mining Tech-
nique: Support Vector Machine. International Journal of Emerging Technology and Ad-
vanced Engineering 3, 581–586 (2013)
13. Web sources, http://www.cs.waikato.ac.nz/~ml/weka/ (last accessed on
October 2013)
14. Web sources, http://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra
(last accessed on October 2013)
15. Hota, H.S., Shrivas, A.K.: Data Mining Approach for Developing Various Models Based
on Types of Attack and Feature Selection as Intrusion Detection Systems (IDS). In: Intel-
ligent Computing, Networking, and Informatics. AISC, vol. 243, pp. 845–851. Springer,
Heidelberg (2014)
Matra and Tempo Detection for INDIC Tala-s
1 Introduction
tala is the term used in Indian classical music for the rhythmic pattern of any
composition and for the entire subject of rhythm; it roughly corresponds to metre
in Western music. tala-s are rhythmic cycles grouped into measures (anga-s).
In theory, there are 360 tala-s which range from 3 to 108 matra-s, although only
30 to 40 are in use today [1].
Musical metadata, or information about the music, is of growing importance in a rapidly
expanding world of digital music analysis and consumption. To make the desired music
easily accessible to the consumer, it is important to have meaningful and robust
descriptions of music that are amenable to search.

An efficient music classification system can serve as the foundation for various
applications like music indexing, content based music information retrieval, music content
description and music genre classification. With Electronic Music Distribution, music
catalogues are becoming huge, and it is becoming almost impossible to check each item
manually in order to choose a song of the desired genre. One crude option is to label the
data set manually, but this comes at a huge cost. Thus, automatic genre discrimination is
becoming a more and more popular area of research. In this context, Music Information
Retrieval for INDIC music based on matra, tala and Tempo can act as an elementary step
for a content based music retrieval system and for organizing a digital library of INDIC
music.
In this work, we have dealt with the signals of tablaa, an Indian drum instru-
ment. matra and Tempo are automatically detected from the signal to enable
organized archiving of such audio data and fast retrieval based on those fun-
damental rhythmic parameters. The rest of the paper is organized as follows.
Definition and details of matra and tala are presented in Section 2. A survey of
past work is placed in Section 3. Section 4 elaborates the proposed methodol-
ogy. Experimental results are presented in Section 5. The paper is concluded in
Section 6.
tali  +   -    -    -   2   -    -    -   0  -   -   -  3    -    -    -
matra 1   2    3    4   5   6    7    8   9  10  11  12 13   14   15   16
bol   dha dhin dhin dha dha dhin dhin dha na tin tin na tete dhin dhin dha
anga  1   -    -    -   2   -    -    -   3  -   -   -  4    -    -    -

tali  +   -  -  -  0  -   -   -
matra 1   2  3  4  5  6   7   8
bol   dha ge na ti na ka  dhi na
anga  1   -  -  -  2  -   -   -

tali  +   -   -  0  -  -
matra 1   2   3  4  5  6
bol   dha dhi na na ti na
anga  1   -   -  2  -  -
3 Past Work
Previous work in this context includes areas like recognition of tablaa bol-s and tablaa
strokes as well as tablaa acoustics.

The tonal quality of the tablaa was first discussed by C. V. Raman [2]. The importance
of the first three to five harmonics, which are derived from the drumhead's vibration
modes, was highlighted. Bhat [3] developed a mathematical model of the membrane's
vibration modes that could well be applied to the tablaa.

Malu and Siddharthan [4] confirmed C. V. Raman's observations on the harmonic
properties of Indian drums, and the tablaa in particular. They attributed the presence of
harmonic overtones to the "central loading" (black patch) in the center of the dayan (the
gaab). Chatwani [5] developed a computer program based on linear predictive coding
(LPC) analysis to recognize spoken bol-s. An acoustic and perceptual comparison of
tablaa bol-s (both spoken and played) was done by Patel et al. [6]. It was found that
spoken bol-s do indeed have significant correlations in terms of acoustical features (e.g.
spectral centroid, flux) with played bol-s. This enables even untrained listeners to match
syllables to the corresponding drum sound, providing strong support for the symbolic
value of tablaa bol-s in the North Indian drumming tradition.
Gillet et al. [7] worked on tablaa stroke recognition. Their approach follows
three steps: stroke segmentation, computation of relative durations (using beat
detection techniques), and stroke recognition.

tali  +   -  2   -   -  0  -  3   -   -
matra 1   2  3   4   5  6  7  8   9   10
bol   dhi na dhi dhi na ti na dhi dhi na
anga  1   -  2   -   -  3  -  4   -   -

Transcription is performed with a
Hidden Markov Model (HMM). The advantage of their model is its ability to take into
account the context during the transcription phase. A bol recognizer with cepstrum based
features and an HMM has been presented in [8]. Essl et al. [9] applied the theory of
banded wave-guides to highly inharmonic vibrating structures. Software implementations
of simple percussion instruments like the musical saw, glasses and bowls, along with a
model of the tablaa, were presented. The work of Gillet et al. [7] was extended in the work
of Chordia [10] by incorporating different classifiers like neural networks, decision trees
and a multivariate Gaussian model. A system that segments and recognizes tablaa strokes
has been implemented.
Conventional work in rhythm detection relies on frame (short duration audio segment)
based autocorrelation techniques, which are relatively costly. In this work, we present a
simple scheme that relies on perceptual features like the amplitude envelope of recorded
clips of an electronic tablaa. The proposed system detects the number of matra-s and the
Tempo (in Beats per Minute, i.e. BPM) of a tala. Detection is based on the distribution of
amplitude peaks and matching of the repetitive pattern present in the clip with the
standard patterns for tala-s of INDIC music.
4 Proposed Methodology
tabla is a two-part instrument. The left part is called the bayan and the right part the
dayan. In the tablaa there is a central loading (black patch) at the center of the dayan.
When played, it gives rise to various harmonic overtones, and these form the
characteristics of a tablaa signal. Depending on the bol-s used in the theka of a tala, the
number of harmonics and their amplitude envelope vary. This results in a different
repetitive pattern for different tala-s. Based on this understanding, the proposed
methodology revolves around the detection of the onset pattern in the signal. Thus, onset
detection is the fundamental step. As suggested in [11], [12], it follows the principle that
an onset detection algorithm should mimic the human auditory system by treating
frequency bands separately and finally combining the results. The major steps to identify
the matra and Tempo are as follows.
– Extraction of peaks from amplitude envelope
• Decompose the signal into different frequency bands.
• Create the envelope for each band and sum them up.
• Extract the amplitude peaks
– Detect matra and Tempo.
MIRtoolbox [13] is used to create the amplitude envelope from the audio clip and for the
subsequent extraction of peaks. As the peaks correspond to the beats, the matra-s and
Tempo are detected by analysing the inherent patterns in the peaks.
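A minimal sketch of the peak extraction step is given below; it is not the MIRtoolbox
implementation used by the authors. The band edges, the 20 ms smoothing window, the
peak spacing and the assumption of a 44.1 kHz sampling rate are all illustrative choices.

import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks

def beat_peaks(x, sr, band_edges=(40, 160, 640, 2560, 10000)):
    onset = np.zeros(len(x))
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=sr, output="sos")
        band = sosfiltfilt(sos, x)                       # one frequency band of the signal
        win = int(0.02 * sr)                             # ~20 ms smoothing window
        env = np.convolve(np.abs(band), np.ones(win) / win, mode="same")
        diff = np.maximum(np.diff(env, prepend=env[0]), 0.0)  # half-wave rectified slope
        onset += diff                                    # sum of per-band envelopes
    peaks, _ = find_peaks(onset, distance=int(0.1 * sr),
                          height=0.3 * onset.max())      # keep only prominent local maxima
    return peaks / sr                                    # peak times in seconds

The Tempo can then be read off the inter-peak intervals, and the matra count obtained by
matching the repeating peak pattern against stored tala templates, as described above.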
Fig. 1. Peak extraction from amplitude envelope: (a) Time varying signal of tablaa, (b)
Decomposed signal (c) Differentiated envelope for the bands (d) Overall differentiated
signal, (e) Detected peaks
For each band, the differential envelope is then obtained by discarding the negative part
through half wave rectification. Such a differential envelope is shown in Fig. 1(c). All 20
such envelopes are summed up to reconstruct the overall differential signal, which reflects
the global outer shape of the original signal (see Fig. 1(d)). Peaks are detected from this
summed envelope by considering only the local maxima. The detected peaks are circled in
Fig. 1(e).
5 Experimental Results
In our experiment we have worked with four different tala-s, namely dadra, kaharba,
tintal and jhaptal. They are of different matra. Audio clips of the different tala-s are
obtained by recording the signals of an electronic tablaa instrument. The audio clips are
of duration 10 to 20 seconds. Different signals for each tala are generated at various
Tempos. A detailed description of the data used is shown in Table 5.
Thus, the data reflect variation in terms of duration, Tempo and tala to establish the
applicability of the proposed methodology in identifying matra and Tempo.

Tables 6 and 7 show the accuracy in detecting matra and Tempo respectively. In Tempo
detection, a tolerance of ±2% has been considered for matching, as Tempo is more affected
by the presence of noise. In general, the experimental results indicate the effectiveness of
the proposed methodology.
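The Tempo estimate and the ±2% matching tolerance mentioned above can be expressed as
in the following sketch; the helper names and the use of the median inter-peak interval are
assumptions for illustration, not the authors' exact procedure.

import numpy as np

def tempo_bpm(peak_times):
    intervals = np.diff(peak_times)          # inter-beat intervals in seconds
    return 60.0 / np.median(intervals)       # beats per minute

def tempo_matches(estimated_bpm, reference_bpm, tol=0.02):
    # True when the estimate lies within the ±2% tolerance used in the evaluation.
    return abs(estimated_bpm - reference_bpm) <= tol * reference_bpm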
6 Conclusion
In this work, a simple methodology to detect two important rhythmic parameters, matra
and Tempo, of a music signal is proposed. The methodology works with perceptual
features based on the amplitude envelope of recorded audio clips of an electronic tablaa
for different tala-s and Tempos. From the envelope, the peaks are extracted and
subsequently refined to get the beat signal. The Tempo can be readily obtained from it.
From the beat signal, the basic repetitive beat pattern is identified, which enables the
detection of the matra. Experiments with a number of tala-s at various Tempos indicate
that the performance of the proposed methodology is satisfactory. The dataset may be
enhanced to include other tala-s, and the methodology can be extended to develop a
framework for bol driven identification of the theka for Indian Classical Music.
References
1. Sarkar, M.: A real-time online musical collaboration system for indian percussion,
33–36 (2007)
2. Raman, C.V., Kumar, S.: Musical drums with harmonic overtones. Nature, 453–454
(1920)
3. Bhat, R.: Acoustics of a cavity-backed membrane: The indian musical drum. Jour-
nal of the Acoustical Society of America 90, 1469–1474 (1991)
4. Malu, S.S., Siddharthan, A.: Acoustics of the indian drum. Technical report, Cor-
nell University (2000)
5. Chatwani, A.: Real-time recognition of tabla bols. Technical report, Senior Thesis,
Princeton University (2003)
6. Patel, A., Iversen, J.: Acoustic and perceptual comparison of speech and drum
sounds in the north indian tabla tradition: An empirical study of sound symbolism.
In: Proc. 15th International Congress of Phonetic Sciences, ICPhS (2003)
7. Gillet, O., Richard, G.: Automatic labelling of tabla signals. In: Proceedings of the
4th International Conference on Music Information Retrieval, ISMIR 2003 (2003)
8. Samudravijaya, K., Shah, S., Pandya, P.: Computer recognition of tabla bols. Tech-
nical report, Tata Institute of Fundamental Research (2004)
9. Essl, G., Serafin, S., Cook, P., Smith, J.: Musical applications of banded waveg-
uides. Computer Music Journal 28, 51–63 (2004)
10. Chordia, P.: Segmentation and recognition of tabla strokes. In: Proceedings of 6th
International Conference on Music Information Retrieval, ISMIR 2005 (2005)
11. Scheirer, E.D.: Tempo and beat analysis of acoustic musical signals. Journal of the Acous-
tical Society of America 103, 588–601 (1996)
12. Klapuri, A.: Sound onset detection by applying psychoacoustic knowledge. In: Proc.
Int. Conf. on Acoustics, Speech, and Signal Processing, pp. 3089–3092 (1999)
13. Lartillot, O., Toiviainen, P.: A matlab toolbox for musical feature extraction from
audio. In: Proc. Int. Conference on Digital Audio Effects (2007)
Modified Majority Voting Algorithm towards Creating
Reference Image for Binarization
Ayan Dey1, Soharab Hossain Shaikh1, Khalid Saeed2, and Nabendu Chaki3
1 A. K. Choudhury School of Information Technology, University of Calcutta, India
2 Faculty of Physics and Applied Computer Science,
AGH University of Science and Technology, Poland
3 Department of Computer Science & Engineering, University of Calcutta, India
[email protected], {soharab.h.shaikh,nabendu}@ieee.org, [email protected]
1 Introduction
The proposed method is a modified version of the majority voting approach found in
[1-2] for creating a reference image for the quantitative evaluation of different
binarization methods. In the present study, graphic images from the USC-SIPI database
[18] have been considered for performance evaluation.

The basic concept behind this approach is to find an odd number of binarization methods
among all the global binarization methods. Majority voting [2] is used subsequently on the
selected subset of methods. The selection of global methods is done by calculating the
percentage deviation of each method's threshold from the average global threshold. The
percentage deviation of each method from an initial average over all the methods is
computed as described in the proposed algorithm. The method having the maximum
deviation is discarded to eliminate any undesired bias. In fact, more than one method may
be discarded at this stage if each of these methods deviates significantly from the initial
average. A deviation parameter (DP) has been computed to implement this. If the
remaining methods for a particular image are odd in number, then these methods are used
for creating a reference image using majority voting. Otherwise, DP is decremented (Step 5
of the algorithm below) until an odd number of global methods is selected or the value of
DP becomes 0. With an odd number of methods left, majority voting is applied as
explained above. If DP becomes zero, then the result of the binarization method whose
threshold is closest to the average threshold is taken as the reference image. In this paper,
a total of seven binarization methods have been taken into consideration for the
experimental evaluation.
Let there be n global binarization methods {M1, M2, M3, ..., Mn} and let ti denote the
threshold for method Mi.

Step 1: The average threshold is calculated as
        tavg = (∑i=1..n ti) / n.
Step 2: The percentage deviation is calculated as
        devi = (|ti − tavg| / tavg) × 100%.
Step 3: A deviation parameter (DP) is computed as
        DP = max{dev1, dev2, ..., devn} − 1.
Step 4: Select the m methods Mi from the initial n methods for which devi ≤ DP
        (so that m ≤ n).
Step 5: If mod(m, 2) equals 0, then
            If DP > 1 then
                DP = DP − 1;
                Go to Step 4;
            Else
                Select Mp as the reference image where
                devp = min{dev1, dev2, ..., devn};
            Endif
        Else
            Compute the reference image using majority voting over the m methods
            selected in Step 4;
        Endif
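A minimal sketch of the selection-plus-voting idea described above is given below. The
thresholds and the per-method binary images are assumed to be given; the function name
and the fallback handling are illustrative assumptions, not the authors' exact
implementation.

import numpy as np

def reference_image(binary_images, thresholds):
    # binary_images: list of 0/1 arrays produced by the n global methods.
    t = np.asarray(thresholds, dtype=float)
    t_avg = t.mean()
    dev = np.abs(t - t_avg) / t_avg * 100.0     # percentage deviations (Step 2)
    dp = dev.max() - 1.0                        # deviation parameter (Step 3)
    while dp > 0:
        keep = np.where(dev <= dp)[0]           # discard strongly deviating methods (Step 4)
        if len(keep) % 2 == 1:                  # odd number left: majority vote (Step 5)
            stack = np.stack([binary_images[i] for i in keep])
            return (stack.sum(axis=0) > len(keep) // 2).astype(np.uint8)
        dp -= 1.0
    return binary_images[int(dev.argmin())]     # fall back to the method closest to the average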
3 Experimental Verification
Fig. 1 shows a set of test images and the results of the proposed method. The first column
represents the original gray-scale images. The manually generated reference images are
presented in the second column. The third column shows the results of majority voting,
and the results of the proposed method are presented in the fourth column. A number of
performance evaluation techniques have been presented in [9], [11]. In the context of the
present paper, the authors have taken Misclassification Error (ME), Relative Foreground
Area Error (RAE) and Peak Signal to Noise Ratio (PSNR), as in [1], [11], as the
performance evaluation metrics.

A reference image has been created manually for the sake of performance evaluation.
Performance has been evaluated using images from the USC-SIPI [18] image database.
For each image, two reference images have been created: one using the majority voting
scheme and another using the proposed method. These two reference images have been
compared with the manually generated reference image to calculate the values of the
different metrics. The results of the experiments are presented in Table 1.
The results presented in Table 1 show that for some of the images the results of majority
voting and of the proposed method are comparable, while in some cases the proposed
method generates a reference image that is closer to the manually generated reference
image.
4 Conclusions
No single binarization technique has so far been found to produce consistent results for all
types of textual and graphic images. A method for creating a reference image for the
quantitative evaluation of binarization methods has been proposed in this paper. Besides
considering seven distinct binarization techniques for majority voting, a deviation
parameter has been calculated. The selection of the binarization methods participating in
the majority voting scheme is based on how much a method's threshold deviates with
respect to the deviation parameter. A strong bias due to the poor computation of the
threshold by one or two methods for a particular image often has an adverse effect on the
threshold computed through majority voting. The introduction of the deviation parameter
helps remove this bias. Experimental verification over images from a standard dataset
establishes the effectiveness of the proposed method.
References
1. Shaikh, H.S., Maiti, K.A., Chaki, N.: A New Image Binarization Method using Iterative
Partitioning. Machine Vision and Application 24(2), 337–350 (2013)
2. Shaikh, H.S., Maiti, K.A., Chaki, N.: On Creation of Reference Image for Quantitative
Evaluation of ImageThresholding Methods. In: Proceedings of the 10th International
Conference on Computer Information Systems and Industrial Management Applications
(CISIM), pp. 161–169 (2011)
3. Waluś, M., Kosmala, J., Saeed, K.: Finger Vein Pattern Extraction Algorithm. In:
International Conference on Hybrid Intelligent Systems, pp. 404– 411 (2011)
4. Le, T.H.N., Bui, T.D., Suen, C.Y.: Ternary Entropy-Based Binarization of Degraded
Document Images Using Morphological Operators. In: International Conference on
Document Analysis and Recognition, pp. 114–118 (2011)
5. Messaoud, B.I., Amiri, H., El. Abed, H., Margner, V.: New Binarization Approach Based
on Text Block Extraction. In: International Conference on Document Analysis and
Recognition, pp.1205 – 1209 (2011)
6. Neves, R.F.P., Mello, C.A.B.: A local thresholding algorithm for images of handwritten
historical documents. In: Proceedings of IEEE International Conference on Systems, Man,
and Cybernetics, pp. 2934 – 2939 (2011)
7. Sanparith., M., Sarin., W., Wasin, S.: A binarization technique using local edge
information. In: International Conference on Electrical Engineering/Electronics Computer
Telecommunications and Information Technology, pp. 698–702 (2010)
8. Tabatabaei, S.A., Bohlool, M.: A novel method for binarization of badly illuminated
document images. In: 17th IEEE International Conference on Image Processing, pp. 3573–
3576 (2010)
9. Stathis, P., Kavallieratou, E., Papamarkos, N.: An evaluation technique for binarization
algorithms. Journal of Universal Computer Science 14(18), 3011–3030 (2008)
10. Anjos, A., Shahbazkia, H.: Bi-Level Image Thresholding - A Fast Method. Biosignals 2,
70–76 (2008)
11. Sezgin, M., Sankur, B.: Survey over image thresholding techniques and quantitative
performance evaluation. Journal of Electronic Imaging 13(1), 146–165 (2004)
12. Gonzalez, R., Woods, R.: Digital Image Processing. Addison-Wesley (1992)
13. Kittler, J., Illingworth, J.: Minimum error thresholding. Pattern Recognition 19, 41–47
(1986)
14. Kapur, N.J., Sahoo, K.P., Wong, C.K.A.: A new method for gray-level picture
thresholding using the entropy of the histogram. Computer Vision, Graphics, and Image
Processing 29(3), 273–285 (1985)
15. Johannsen, G., Bille, J.: A threshold selection method using information measures. In: 6th
International Conference on Pattern Recognition, pp. 140–143 (1982)
Modified Majority Voting Algorithm towards Creating Reference Image for Binarization 227
16. Ridler, T., Calvard, S.: Picture thresholding using an iterative selection method. IEEE
Transaction on Systems Man Cybernetics 8, 629–632 (1978)
17. Otsu, N.: A threshold selection method from gray-level histogram. IEEE Transactions on
Systems, Man and Cybernetics 9, 62–66 (1979)
18. USC-SIPI Image Database, University of Southern California, Signal and Image
Processing Institute, http://sipi.usc.edu/database/
19. Roy, S., Saha, S., Dey, A., Shaikh, H.S., Chaki, N.: Performance Evaluation of Multiple
Image Binarization Algorithms Using Multiple Metrics on Standard Image Databases, ICT
and Critical Infrastructure. In: 48th Annual Convention of Computer Society of India, pp.
349–360 (2013)
Multiple People Tracking Using Moment
Based Approach
Sachin Kansal
Abstract. This paper presents a system capable of detecting multiple people in indoor
and outdoor environments using a single camera. We propose a technique that first
performs multiple face detection, then extracts each person's torso region and stores
the HSV range of each person. When a person's face is no longer in front of the camera,
the system tracks all the people using a moment based approach, i.e. it computes the
area of the exposed torso region and the centre of gravity of the segmented torso region
of each person. We consider the HSV range of the torso region as the key feature, from
which we calculate the tracking parameters for each person. A speech thread module is
implemented to allow interaction with the system. Experimental results validate the
robust performance of the proposed approach.
1 Introduction
Tracking multiple people in dynamic indoor and outdoor environments, i.e. under
uncontrollable lighting conditions, is an important issue to be addressed nowadays. Many
methods exist to track people: some rely on visual methods [1-3], some on laser range
finders [4][5], and information can also be obtained from sonar sensors [6]. Some use
localization by microphones from a sound source [7]. Single person tracking has been
implemented using a face detection technique [8]. None of the above techniques is capable
of tracking multiple people efficiently in a real time environment. In this technique we use
a single camera (Logitech C510). In this paper, multiple people are tracked by extracting
motion parameters: (Right, Left) from the centre of gravity and (Away, Toward) from the
area of the segmented torso region. All the parameters are computed for every sampled
frame.

This paper is organized as follows. Section 1 briefly introduces the work that has been
implemented. Section 2 depicts the framework of the method. Section 3 introduces the
detailed approach for tracking multiple people. Section 4 presents the performance of
multiple people tracking at different distances from the camera, under different ambient
lighting conditions and times. Finally, Section 5 presents the conclusion and future work.
2 Methodology
For multiple people tracking we first have to identify each person's face, i.e. the target
person(s) for tracking. Each target person is then characterized by detecting the torso
region and extracting the HSV range of each person separately. After this we calculate
moments (zero and first order, for the area and centre of gravity respectively). Finally we
calculate all the tracking parameters. We have also carried out camera calibration in order
to have better accuracy while handling the threshold values. The framework of multiple
people tracking is shown in Fig. 1.
Fig. 1. Framework of multiple people tracking (Start → Video frame → Face detected? →
Compute HSV range → Threshold torso region → Moment calculation → Calculate centre
coordinates (x, y))
Multiple people tracking needs a very high real time processing speed, and the background
changes dynamically throughout. In order to extract the torso region of the person(s) in
this dynamic environment, we use the HSV range as the feature.
In each frame we detect the faces of multiple people to obtain torso based information as
features for further processing. This provides the information exchanged with the next
module. As stated above, we use Haar-like features. Training has been done over hundreds
of samples; a sample can be a train, a toy or a face, termed positive samples.
Simultaneously we also take negative samples, which are of the same size, say 50×50. We
train the classifier and apply it to the ROI (region of interest). It returns "1" if it is likely
to detect the object, else it returns "0".

This module works by taking input from the previous module (the ROI from the face
detection module). We extract the torso based region of multiple people, calculate the
HSV range over a region of interest and then set an HSV range from it in order to provide
the feature to the next module for tracking the person. For multiple people, we compute
the HSV range for each person, and the system then tracks the multiple people in a real
time environment.

As we have varying lighting conditions, we use the concept of the HSI color space. After
extracting the HSV range we threshold the torso based region in the image captured by
the camera in the real time scenario. We also apply a morphological closing operator,
which removes noise. The segmentation of the torso based region of a single person is
shown in Fig. 4.
Fig. 4. Image Segmentation: (a) Original image, (b) Binary image of torso color, (c) After
closing operation, (d) Tracking module detection
In our approach, a facial image is segmented (after the region of interest is located) into a
small grid. If the input image is of size M×N then a single module is of about size
M/2×N/2. The main idea behind this segmentation is to extract the main contributing
facial features and to reduce the search space. For example, one person may express a
smile by stretching both lip corners while another stretches only one lip corner.
At the time of classification, all face models are compared and the maximum similarity is
detected. If the face image is divided into very small regions, the global information of the
face may be lost and the accuracy deteriorates. Initially we extract the HSV range, and
then, even if a face is not detected, the system keeps tracking multiple people. We compute
the moments for each frame. The steps for calculating the moments are given below:

Step 1: The camera grabs a frame in the real time environment.
Step 2: In each frame we extract the HSV range.
Step 3: We extract the torso region.
Step 4: We threshold the torso region.
Step 5: Moments are calculated from the thresholded image.
Step 6: From the moments we compute the centre coordinates (Xc, Yc) of multiple people
for each captured frame.
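Steps 2-6 can be sketched with OpenCV as shown below. The per-person HSV bounds are
assumptions passed in from outside (in the paper they are learned from the detected torso
region), and the closing kernel size is illustrative.

import cv2
import numpy as np

def torso_centroid(frame_bgr, hsv_low, hsv_high):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)        # Step 2: HSV representation
    mask = cv2.inRange(hsv, hsv_low, hsv_high)              # Steps 3-4: threshold torso range
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE,
                            np.ones((5, 5), np.uint8))      # closing removes small noise
    m = cv2.moments(mask)                                   # Step 5: zero/first order moments
    if m["m00"] == 0:
        return None                                         # torso not visible in this frame
    return m["m10"] / m["m00"], m["m01"] / m["m00"], m["m00"]  # Step 6: (Xc, Yc) and area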
3 Experiment Results
4 Discussion
The image resolution is 640×480. We varied the distance between the people and the
camera from 1.3 m to 3.4 m. The frames are streamed at a speed of 5 fps (frames per
second). We obtain about a 98% average correct rate and a 2% average false positive rate.
The system also tracks multiple people even when their faces are not detected all the time.
5 Conclusion
Calculating the zero and first order moments yields all the motion parameters, so that
multiple people can be tracked effectively in a real time scenario.
In future work we will implement multiple people tracking using Open MPI, which
provides parallel computing when the number of persons to be tracked increases.
References
1. Song, K.T., Chen, W.J.: Face recognition and tracking for human-robot interaction. IEEE
International Conferences on Systems, Man and Cybernetics, The Hague, Netherlands 3,
2877–2882 (2004)
2. Kwon, H., Yoon, Y., Park, J.B., Kak, A.C.: Person tracking with a mobile robot using two
uncalibrated independently moving cameras. In: IEEE International Conferences on
Robotics and Automation, Barcelona, Spain, pp. 2877–2883 (2005)
3. Fod, A., Howard, A., Mataric, M.J.: Laser based people tracking. In: Proceedings of the
IEEE International Conferences on Robotics & Automation (ICRA), Washington, DC,
United States, pp. 3024–3029 (2002)
4. Montemerlo, M., Thrun, S., Whittaker, W.: Conditional particle filters for simultaneous
mobile robot localization and people tracking. In: Proceedings of the IEEE International
Conference on Robotics & Automation (ICRA), Washington, DC, USA, pp. 695–701
(2002)
5. Shin, J.-H., Kim, W., Lee, J.-J.: Real-time object tracking and segmentation using adaptive
color snake model. International Journal of Control, Automation, and Systems 4(2), 236–
246 (2006)
6. Scheutz, M., McRaven, J., Cserey, G.: Fast, reliable, adaptive, bimodal people tracking for
indoor environments. In: Proc. of the 2004 IEEE/RSJ Int. Conf. on Intelligent Robots and
Systems (IROS 2004), Sendai, Japan, vol. 2, pp. 1347–1352 (2004)
7. Fritsch, J., Kleinehagenbrock, M., Lang, S., Fink, G.A., Sagerer, G.: Audiovisual person
tracking with a mobile robot. In: Proceedings of International Conference on Intelligent
Autonomous Systems, pp. 898–906 (2004)
8. Kansal, S., Chakraborty, P.: Tracking of Person Using monocular Vision by Autonomous
Navigation Test bed (ANT). International Journal of Applied Information Systems
(IJAIS) 3(9) (2012)
Wavelets-Based Clustering Techniques for Efficient
Color Image Segmentation
Abstract. This paper introduces efficient and fast algorithms for unsupervised image
segmentation using low-level features such as color and texture. The proposed approach
is based on the clustering technique, using 1. the Lab color space, and 2. the wavelet
transformation technique. The input image is decomposed using two-dimensional Haar
wavelets. A feature vector containing the information about the color and texture
content of each pixel is extracted. These vectors are used as inputs to the k-means or
fuzzy c-means clustering methods, yielding a segmented image whose regions are
distinct from each other according to color and texture characteristics. Experimental
results show that the proposed method is efficient and achieves high computational
speed.
1 Introduction
Image segmentation has been a focused research area in image processing for the last few
decades. Many papers have been published, mainly focused on gray scale images, with less
attention on color image segmentation, although color images convey much more
information about the objects in a scene. Image segmentation is typically used to locate
objects and boundaries in images. More precisely, image segmentation is the process of
assigning a label to every pixel in an image such that pixels with the same label share
certain visual characteristics. Color image segmentation plays a crucial role in image
analysis, computer vision, image interpretation and pattern recognition systems.

The CIELAB color space is an international standard in which the Euclidean distance
between two color points corresponds to the difference between the two colors as
perceived by the human vision system [1]. This property makes the CIELAB color space
attractive and useful for color analysis, and it has shown better performance in many
color image applications [2,4]. Because of this, the CIELAB color space has been chosen
for color clustering.

Clustering is an unsupervised technique that has been successfully applied to feature
analysis, target recognition, geology, medical imaging and image segmentation [5,7]. This
paper considers the segmentation of image regions based on two clustering methods:
1. color features using fuzzy c-means or k-means, and 2. color and texture features using
fuzzy c-means or k-means through the Haar wavelet transformation.
Wavelets are useful for hierarchically decomposing functions in ways that are both
efficient and theoretically sound [8]. Through a wavelet transformation of the color image,
four sub-images are produced: a low resolution copy of the original image and three
band-pass filtered images in specific directions: horizontal, vertical and diagonal. In this
paper, we use Haar wavelets to compute the feature signatures because they are the
fastest and simplest to compute and have been found to perform well in practice [9]. The
Haar wavelet is a certain sequence of rescaled 'square-shaped' functions which together
form a wavelet family or basis. The Haar wavelet transformation technique is explained in
[13].
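A single level of the two-dimensional Haar decomposition described above can be sketched
in a few lines of numpy; the scaling factors and the assumption of even image dimensions
are illustrative choices, and library routines such as PyWavelets provide the same four
sub-bands.

import numpy as np

def haar2d(img):
    # img: 2-D array with even height and width (one channel of the input image).
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    ll = (a + b + c + d) / 4.0      # low resolution approximation
    lh = (a + b - c - d) / 4.0      # horizontal detail band
    hl = (a - b + c - d) / 4.0      # vertical detail band
    hh = (a - b - c + d) / 4.0      # diagonal detail band
    return ll, lh, hl, hh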
K-Means Algorithm
The K-means algorithm was originally introduced by MacQueen in 1967 [11]. The K-means
algorithm is an iterative technique that is used to partition an image into K clusters.

Let X = {x1, x2, ..., xn} represent the set of pixels of the given image, where n is the
number of pixels, and let V = {v1, v2, ..., vk} be the corresponding set of cluster centres,
where k is the number of clusters. The aim of the K-means algorithm is to minimize the
objective function J(V), in this case a squared error function:

    J(V) = ∑i=1..k ∑j=1..ki ( ||xij − vi|| )²                                  (1)

where ||xij − vi|| is the Euclidean distance between xij and vi, and ki is the number of
pixels in cluster i.
The difference is typically based on pixel color, intensity, texture and location, or a
weighted combination of these factors. In our study, we have considered pixel intensity.
The ith cluster centre vi can be calculated as

    vi = (1/ki) ∑j=1..ki xij                                                    (2)

This algorithm is guaranteed to converge, but it may not return the optimal solution. The
quality of the solution depends on the initial set of clusters and the value of k.
where ||xi − vj|| is the Euclidean distance between xi and vj, and µij is the membership
degree of pixel xi with respect to the cluster centre vj (m > 1 is the fuzziness exponent).
The memberships µij have to satisfy the following condition:

    ∑j=1..k µij = 1,  ∀i = 1, ..., n                                            (5)

The memberships and the cluster centres are updated as

    µij = 1 / ∑l=1..k ( ||xi − vj|| / ||xi − vl|| )^(2/(m−1))                    (6)

    vj = ( ∑i=1..n µij^m xi ) / ( ∑i=1..n µij^m )                                (7)

iv) Repeat steps (ii) to (iii) until the minimum value of J is achieved.
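Putting the pieces together, a color-based k-means segmentation along the lines described
above can be sketched as follows; the use of OpenCV for the L*a*b* conversion, the choice
of k and the per-pixel feature layout are illustrative assumptions, not the authors' exact
pipeline (which may also append Haar texture features).

import cv2
import numpy as np
from sklearn.cluster import KMeans

def kmeans_segment(img_bgr, k=3):
    lab = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    h, w, _ = lab.shape
    features = lab.reshape(-1, 3)                      # one L*a*b* vector per pixel
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    return labels.reshape(h, w)                        # cluster index per pixel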
6 Experimental Results
The algorithms are implemented in Matlab. The four algorithms, viz. K-Means,
Wavelet-based Fuzzy C-Means, Fuzzy C-Means and Wavelet-based K-Means, are applied
to image databases. The K-means and FCM algorithms based on color segmentation,
described in subsections A and B of Section IV, have been tested in the L*a*b* color space
and their results are shown in Fig. 1 and Fig. 3. Color and texture based image
segmentation using Wavelet-based K-Means (WKM) and Wavelet-based Fuzzy C-Means
(WFCM) clustering is described in subsections C and D of Section V, and the
corresponding results are shown in Fig. 4 and Fig. 2.

In this paper we have presented algorithms for the segmentation of color images using
Fuzzy C-Means (FCM), K-Means, Wavelet-based K-Means (WKM) and Wavelet-based
Fuzzy C-Means (WFCM), described in Section 4 and Section 5. By observing the
experimental results in Fig. 1 to Fig. 4, it can be said that the proposed method can be
successfully applied to different types of images. Using the wavelet and clustering
techniques together for color image segmentation not only enhances the segmentation but
also makes the execution time for each image shorter than with a single clustering
technique.
References
1. Wyszecki, G., Stiles, W.S.: Color science: concepts and methods, quantitative data and
formulae, 2nd edn. John Wiley and Sons (2000)
2. Jin, L., Li, D.: A Switching vector median based on the CIELAB color space for color
image restoration. Signal Processing 87, 1345–1354 (2007)
3. Hartigan, J.A., Wong, M.A.: A K-means clustering algorithm. Appl. Stat. 28(1), 100–108
(1979)
4. Kwak, N.J., Kwon, D.J., Kim, Y.G., Ahn, J.H.: Color image segmentation using edge and
adaptive threshold value based on the image characteristics. In: Proceedings of
International Symposium on Intelligent Signal Processing and Communication Systems,
pp. 255–258 (2004)
5. Klir, G.J., Yuan, B.: Fuzzy Sets and Fuzzy Logic-Theory and Applications. PHI (2000)
6. Bezdek, J.C., Ehrlich, R., Full, W.: FCM: The Fuzzy c-Means clustering algorithm.
Computers and Geosciences 10, 191–203 (1984)
244 P. Bhattacharya, A. Biswas, and S.P. Maity
7. Chen, T.W., Chen, Y.L., Chien, S.Y.: Fast image segmentation based on K-Means
clustering with histograms in HSV color space. In: Proceedings of IEEE International
Workshop on Multimedia Signal Processing, pp. 322–325 (2008)
8. Antonini, M., Barlaud, M., Mathieu, P., Daubechies, I.: Image Coding using Wavelet
Transform. IEEE Transactions on Image Processing 1(2), 205–220 (1992)
9. Kherfi, M.L., Ziou, D., Bernardi, A.: Image Retrieval from the World Wide Web: Issues,
Techniques, and Systems. ACM Computing Surveys 36(1), 35–67 (2004)
10. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 2nd edn. Pearson Education
(2000)
11. MacQueen, J.B.: Some Methods for classification and Analysis of Multivariate
Observations. In: Proceedings of 5th Berkeley Symposium on Mathematical Statistics and
Probability, pp. 281–297. University of California Press (1967)
12. Bezdek, J.C.: Pattern Recognition and Fuzzy Objective Function Algorithms. Plenum
Press (1981)
13. Castillejos, H., Ponomaryov, V.: Fuzzy Image Segmentation Algorithms in Wavelet
Domain. In: 2011 8th International Conference on Electrical Engineering Computing
Science and Automatic Control (2011)
An Approach of Optimizing Singular Value of YCbCr
Color Space with q-Gaussian Function in Image
Processing
1 Department of Computer Science and Engineering,
National Institute of Technology, Agartala, India
2 Department of Information Technology,
Bengal Engineering and Science University, Shibpur, India
[email protected]
1 Introduction
Radial Basis Functions (RBFs) in Artificial Neural Networks (ANNs) are utilized in
various areas such as pattern recognition and optimization. In this paper we consider the
Gaussian, Multi Quadratic, Inverse Multi Quadratic, Cosine and q-Gaussian radial basis
functions. Singular values of colour images are calculated and compared using these RBFs.
A colour image is taken as the example, and its YCbCr colour space components are used
for the analysis. The Gaussian, Multi Quadratic, Inverse Multi Quadratic, Cosine and
q-Gaussian RBFs are computed and compared with the normal method. Section 2
describes the architecture of the RBF neural network. Section 3 presents the analysis.
Section 4 gives the simulations, and finally Section 5 gives the conclusion [1-4], [7].
Fig. 1 shows that the RBF neural network architecture consists of three layers: the input
layer, the hidden layer and the output layer. The inputs X = {x1, ..., xd} enter the input
layer. The radial centres and widths are C = {c1, ..., cn}T and σi respectively. In the
hidden layer, Φ = {Φ1, ..., Φn} are the radial basis functions. The centres are of dimension
n×1 when the number of inputs is n. The desired output y is obtained by a proper selection
of weights. Ω = {w11, ..., w1n, ..., wm1, ..., wmn} is the weight matrix, where wi is the
weight of the ith centre [2-3]. The output is

    y = ∑i=1..m wi φi                                                           (1)
Radial basis functions like the Linear, Cubic, Thin plate spline and Gaussian are given in
Eqn. 2, Eqn. 3, Eqn. 4 and Eqn. 5 respectively.

    Gaussian:     φ(x) = exp(−r² / (2σ²))                                        (3)

    Cosine:       φ(x) = σ / (r² + σ²)^(1/2)                                     (6)

    q-Gaussian:   φ(x) = exp(−r² / ((3 − q)σ²))                                  (7)
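A minimal numpy sketch of this setup is given below: the five basis functions and a
pseudo-inverse solution for the output weights, in line with the paper's use of the
pseudo-inverse technique. The centres, σ, the value of q and the function names are
assumptions for illustration, not the authors' exact implementation.

import numpy as np

def gaussian(r, sigma):         return np.exp(-r**2 / (2 * sigma**2))
def multiquadric(r, sigma):     return np.sqrt(r**2 + sigma**2)
def inv_multiquadric(r, sigma): return 1.0 / np.sqrt(r**2 + sigma**2)
def cosine(r, sigma):           return sigma / np.sqrt(r**2 + sigma**2)
def q_gaussian(r, sigma, q=1.5):return np.exp(-r**2 / ((3 - q) * sigma**2))

def rbf_fit(X, y, centres, sigma, phi=gaussian):
    # Distance of every input to every centre, then the hidden-layer outputs.
    r = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    H = phi(r, sigma)
    # Output weights via the Moore-Penrose pseudo-inverse.
    w = np.linalg.pinv(H) @ y
    return w

Calling rbf_fit with phi=q_gaussian gives the q-Gaussian approximation of, for example, a
vector of singular values.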
3 Analysis
We take one colour image and extract its red, green and blue colour components (Fig. 2).
We then calculate the Y, Cb and Cr colour components [1] from Eq. 8, Eq. 9 and Eq. 10.
The relation between the Y, Cb, Cr colour components and the R, G, B colour components
is given in Eq. 6. As we apply the Gaussian, Multi Quadratic, Inverse Multi Quadratic,
Cosine and q-Gaussian RBFs in the artificial neural network, the RBF needs an optimal
selection of weights and centres. We computed all the methods with the pseudo-inverse
technique [5]. As input we have chosen the House image. The matrix size of all input
images is 128×128 pixels for every analysis and experiment.
Fig. 3. House image (a) Original Color Image, (b) Y Component, (c) Cb Component, (d) Cr
Component
Table 1. Mean singular value of the House image with different RBF methods

RBF method   Y Component           Cb Component          Cr Component
GRBF         2.352351957037448     2.26704209518650      1.36048170627175
MQRBF        2.352351960698278     2.26704210092544      1.36048143482934
IMQRBF       2.352351957007302     2.26704209550038      1.36048174302319
CRBF         2.352351957008633     2.26704209548849      1.36048174199634
q-GRBF       2.352351957007182     2.26704209547479      1.36048174215049
Table 2. Mean Square Error (MSE) of the singular values of the House image with different
RBF methods

RBF method   Y Component               Cb Component              Cr Component
GRBF         5.587050298754×10^-17     2.390966601750×10^-16     3.759190991746×10^-15
MQRBF        7.903088842942×10^-13     1.199463110908×10^-11     3.053300779145×10^-9
IMQRBF       8.704202668210×10^-20     1.583915586350×10^-18     2.695149918667×10^-15
CRBF         8.234144691833×10^-20     1.807522296961×10^-18     2.064707816520×10^-15
q-GRBF       1.010905951693×10^-21     1.264231758406×10^-20     6.797597647388×10^-17
4 Simulation
Colour images with their YCbCr colour components are used for the simulation. In this
paper we have experimented with the House colour image, which is of size 128×128 pixels.
We have simulated the singular values of the matrices of these images with the normal
method and with the Gaussian, Multi Quadratic, Inverse Multi Quadratic, Cosine and
q-Gaussian methods. We used the MATLAB 7.6.0 software [6] for the analysis and
simulation. In Fig. 3, the House image and its corresponding YCbCr colour component
images are shown. The mean singular values of the matrices of all these images with the
Gaussian RBF, Multi Quadratic RBF, Inverse Multi Quadratic RBF, Cosine RBF and
q-Gaussian RBF are calculated and simulated. The comparative singular values are shown
in Table 1, and the corresponding Mean Square Errors of the singular values are shown in
Table 2. Fig. 4 graphically represents the mean square error for GRBF, MQRBF, IMQRBF,
CRBF and q-GRBF.

Fig. 4. Comparison of the error of the Eigen values with YCbCr colour components in the
House image
5 Conclusion
In this paper, the Gaussian, Multi Quadratic, Inverse Multi Quadratic, Cosine and
q-Gaussian radial basis functions are utilized for the approximation of the singular values
of an image and its corresponding YCbCr colour component matrices. The simulation
results give better results and a smaller Mean Square Error for the q-Gaussian RBF
method. So it can be concluded that the q-Gaussian RBF method could be used for this
approximation in an Artificial Neural Network, compared to the other related methods.
References
1. Noda, H., Niimi, M.: Colorization in YCbCr color space and its application to JPEG
images. Pattern Recognition 40(12), 3714–3720 (2007)
2. Scholkopf, B., Sung, K.-K., Burges, C.J.C., Girosi, F., Niyogi, P., Poggio, T., Vapnik, V.:
Comparing support vector machines with Gaussian kernels to radial basis function
classifiers. IEEE Transactions on Signal Processing 45, 2758–2765 (1997)
3. Mao, K.Z., Huang, G.-B.: Neuron selection for RBF neural network classifier based on data
structure preserving criterion. IEEE Transactions on Neural Networks 16(6), 1531–1540
(2005)
4. Luo, F.L., Li, Y.D.: Real-time computation of the eigenvector corresponding to the smallest
Eigen value of a positive definite matrix. IEEE Transactions on Circuits and Systems 141,
550–553 (1994)
5. Klein, C.A., Huang, C.H.: Review of pseudo-inverse control for use with kinematically
redundant manipulators. IEEE Transactions on System, Man, Cybernetics 13(3), 245–250
(1983)
6. Math Works. MATLAB 7.6.0, R2008a (2008)
7. Fernandez-Navarro, F., Hervas-Martinez, C., Gutierrez, P.A., Pena-Barragan, J.M.,
Lopez-Granados, F.: Parameter estimation of q-Gaussian Radial Basis Functions Neural
Networks with a Hybrid Algorithm for binary classification. Neurocomputing 75(1),
123–134 (2012)
Motion Tracking of Humans under Occlusion
Using Blobs
1 Introduction
Video surveillance is one of the major research fields in video analytics. The aim of video
surveillance is to collect information from videos by tracking the people involved in them.
For security purposes, video surveillance has been deployed in all public places. Over a
long period of time, a human guard would be unable to watch the behavior of all the
persons in the videos. To avoid this manual intervention, automatic video surveillance has
emerged, in which people/objects can be identified and tracked effectively.

Numerous tracking approaches have been proposed in the literature to track objects
[1,4,6,12]. Tracking an object/human can be complex due to the existence of noise in the
images, partial and total occlusion, distorted shapes and complex object motion. An object
tracking mechanism starts with the object detection stage, which identifies the moving
regions in each frame.
2 Related Works
Object tracking includes the following stages: background modelling, segmentation, object
representation, feature selection and tracking using filters. Major works in the field of
video analytics have been devoted to the object tracking phase, and they are addressed in
this section. The Mixture of Gaussians (MOG) background model [8] is considered to be
more effective in modelling multi-modal distributed backgrounds. Parisa Darvish et al.
[11] proposed a region based method for background subtraction based on color
histograms, texture information and successive division of candidate rectangular image
regions to model the background. This method provides the detail of the contour of the
moving shapes. Stauffer et al. [7] propose a Gaussian mixture background model based on
mixture modelling of pixel sequences, considering the time continuity of the pixels, but it
does not combine the information in the spatial neighbourhood of the pixels. Several
techniques for foreground segmentation [1,13,5] are available; among them, background
subtraction [4] seems to be the simplest method. In background subtraction, the current
frame is subtracted from the previous frame. This segmentation approach is suited to
static environments. Hajer Fradi et al. [12] proposed a segmentation approach, background
subtraction based on incorporating a uniform motion model into the GMM. Tracking can
be defined as the problem of estimating the trajectory of an object in the image plane as it
moves around a scene.
Motion Tracking of Humans under Occlusion Using Blobs 253
3 Object Tracking
In this research, to detect/track the objects, frames are extracted from input videos.
Temporal information which reveals the moving region details are obtained from
those sequence of frames. A suitable background model has been selected using
GMM mixture of Gaussians and the background information gets updated constantly.
Later, suitable features are selected to represent every object and by employing blob
and particle filter, trajectories are identified for those objects.
( )= ∑ , ( , , ,∑ , ) (1)
Where k denotes the number of distributions denotes each pixel value of tth image
frame. , denotes the weight factor for corresponding Gaussian distribution. ,
denotes the mean value of ith Gaussian. ∑ , represents the co-variance matrix of ith
Gaussian. η represents the Gaussian probability density function. Threshold is
predefined here through which binary motion detection mask has been calculated. For
each frame, the difference of pixel value from mean has been calculated to derive
pixel color match component. If the match component is zero, then the Gaussian
components for each pixel has been updated else if it is one, Updation of weight,
mean and standard deviation has been done. This enables the new background to be
get detected and updated subsequently.
254 M. Sivarathinabala and S. Abirami
3.2 Segmentation
Once background has been modelled, it needs to get subtracted from the frames to
obtain the foreground image. Background subtraction [3] has been performed by
taking the difference between the current image and the reference background image
in a pixel by pixel fashion. Here, threshold value has been fixed. When the pixel
difference is above threshold value, it is considered as a foreground image. Blobs or
pixels segmented as foreground are necessary objects which need to get represented
to track its motion. This object could be either a particular thing/vehicle/person etc.
as the same or existing object and the person has been moving in contiguous fram
mes.
Figure 1(a) shows the mo ovement of the human blob (Person 1) getting traccked
continuously with the inteerference of their circles. Area of the circle has bbeen
computed for further processsing.
Fig. 1. (a) Bolb creation of two persons Fig. 1. (b) Crossover of bolbs during motiion
moving in the same direction
n of two persons in opposite directions
Fig. 1(b) shows the existence of two different objects in the same frame. Both the
objects are tracked simultaaneously using the circles lying inside their contours. F
Fig.
1(b) shows the existence ofo two separate blobs being identified; hence two perssons
need to get tracked separaately. According to this figure, two persons are walkking
towards each other. Initially
y, Person 1 and Person 2 are considered as new objects and
they are tracked simultaneoously. At some point, crossover has been happened betwween
them. When one circle oveerlaps with two other circles of different directions, bblob
merging occurs. Blob merrging represents the first occurrence of occlusion. Ussing
particle filters, a new traccking hypothesis has been created to track the occluuded
persons (Person 1 and 2 in this
t case).
Equation (4) gives the distance measurement between two distributions p and q. ρ is
the Bhattacharya co-efficient, m is the histogram bin number, q is the target color
distribution, p(n) is the sample hypothesis color distribution.
E ( s ) = n π ( n ) s ( n) (5)
Sample set hypothesis distribution has been updated based on mean E(s). E(s)
produces the estimated new object locations using mean and mode. Using this, the
new position of the object/person has been identified accurately by tracking the path
of the particles. Hence, target person has been tracked accurately under occlusion
state also.
Fig. 2. a) An input image and b) its Fig. 3. a) An input frame and b) object
segmented frame representation
0.6 15
0.4
10
0.2
5
0
40
20 50
40 0
0 30
-20 20
10
-40 0 -5
0 5 10 15 20 25 30 35 40 45 50
Fig. 4. a). Evolution of State density using Particle Filter, b)Estimated State using Particle Filter
and c) Filtered Observation, all in sequence
Motion Tracking of Humans under Occlusion Using Blobs 257
Test case 2:
User generated videos has been considered in the test case 2. But in this scenario, a
person walks in the terrace and returns back to his initial position and at the same
time, another person walks on the same path as shown in figures 5(a). Our proposed
algorithm tracks the multiple person successfully even under occlusion. Figure 6(a),
6(b) shows the circle representation on human Figure 7(a), 7(b) and 7(c) depict the
movement of humans through particle paths of Estimated state and Filtered
observation using Particle Filter.
Fig. 5. a) Input image and its Segmented Fig. 5. b) The blob spliting and merging
Frame using the circles drawn
10
15
0
10
-10
5
-20 0
-30 -5
0 5 10 15 20 25 30 35 40 45 50 0 5 10 15 20 25 30 35 40 45 50
Fig. 6. a)Evolution of State Density using Particle Filter, b). Estimated State using Particle
Filter and c)Filtered observation using Particle Filter, all in sequence
Performance of this system has been measured using Occlusion detection rate. This
can be defined as the number of occluded objects tracked to the total number of
occluded objects. The Proposed algorithm detects the person when he/she is occluded
partially and the detection rate has been tabulated for two different datasets as in
Table 1.
7 Conclusion
In this research, an automated tracking system has been developed using blob and
particle filters to detect partial occlusion states. The proposed algorithm has been
258 M. Sivarathinabala and S. Abirami
tested over CAVIAR and user generated datasets and the results are promising. This
system has the ability to track the objects even if there is an object crossover also. In
future, this system could be extended along with the detection of full occlusion and
multiple objects tracking too.
References
1. Yilmaz, A., Javed, O., Shah, M.: Object Tracking: A Survey. ACM Computing
Surveys 38(4) (2006)
2. Jahandide, H., Pour, K.M., Moghaddam, H.A.: A Hybrid Motion and Appearance
prediction model for Robust Visual Object Tracking. Pattern Recognition Letter 33(16),
2192–2197 (2012)
3. Bhaskar, H., Maskell, L.M.S.: Articulated Human body parts detection based on cluster
background subtraction and foreground matching. Neurocomputing 100, 58–73 (2013)
4. Manjunath, G.D., Abirami, S.: Suspicious Human activity detection from Surveillance
videos. International Journal on Internet and Distributed Computing Systems 2(2), 141–
149 (2012)
5. Gowshikaa, D., Abirami, S., Baskaran, R.: Automated Human Behaviour Analysis from
Surveillance videos: a survey. Artificial Intelligence Review (April 2012), doi:10.1007/s
10462-012-9341-3
6. Stauffer, C., Grimson, E.E.L.: Learning patterns of activity using real-time tracking.
Proceedings of IEEE Transactions on Pattern Analysis and Machine Intelligence 22(8),
747–757 (2000)
7. Huwer, S., Niemann, H.: Adaptive Change Detection for Real-time Surveillance
applications. In: the Proceedings of 3rd IEEE Workshop on Visual Surveillance, pp. 37–45
(2000)
8. Haibo, H., Hong, Z.: Real-time Tracking in Image Sequences based-on Parameters
Updating with Temporal and Spatial Neighbourhoods Mixture Gaussian Model.
Proceedings of World Academy of Science, Engineering and Technology, 754–759 (2010)
9. Lu, J.-G., Cai, A.-N.: Tracking people through partial occlusions. The Journal of China
Universities of Post and Telecommunications 16(2), 117–121 (2009)
10. Cho, N.G., Yuille, A.L., Lee, S.W.: Adaptive Occlusion State estimation for human pose
tracking under self-occlusions. Pattern Recognition 46(3) (2013)
11. Varcheie, P.D.Z., Sills-Lavoie, M., Bilodeau, G.-A.: A Multiscale Region-Based Motion
Detection and Background Subtraction Algorithm. The Proceedings of Sensor
Journal 10(2), 1041–1061 (2010)
12. Fradi, H., Dugelay, J.-L.: Robust Foreground Segmentation using Improved Gaussian
Mixture Model and Optical flow. In: The Proceedings of International Conference on
Informatics, Electronics and Vision, pp. 248–253 (2012)
13. Wang, Y., Tang, X., Cui, Q.: Dynamic Appearance model for particle filter based visual
tracking. Pattern Recognition 45(12), 4510–4523 (2012)
Efficient Lifting Scheme Based Super Resolution Image
Reconstruction Using Low Resolution Images
Abstract. Super resolution (SR) images can improve the quality of the multiple
lower resolution images. it is constructed using raw images like noisy, blurred
and rotated. In this paper, Super Resolution Image Reconstruction (SRIR)
method is proposed for improving the resolution of lower resolution (LR)
images. Proposed method is based on wavelet lifting scheme with
Daubechies4 coefficients. Experimental results prove the effectiveness of the
proposed approach. It is observed from the experiments that the resultant
reconstructed image has better resolution factor, MSE and PSNR values.
1 Introduction
Images [1] are obtained for many areas like a remote sensing, medical images,
microscopy, astronomy, weather forecasting. In each case there is an underlying
object or science we wish to observe, the original or true image is the ideal
representation of the observed scene [1], [2]. There is uncertainty in the measurement
occurring as noise, bluer, rotation and other degradation in the recorded images. There
for remove these degradation using Super Resolution technology [5]. In super
resolution technology in image processing area to get a High Resolution (HR) image.
The central aim of Super Resolution (SR)[5] is to enhance the spatial resolution of
multiple lower resolution images. HR means pixel density within the image is high
and indicates more details about original scene. The super resolution technique is an
efficient lossy and low cost technology. In this paper we are using Wavelet Transform
(WT) technique to get an HR image from Low Resolution (LR) images by involving
image blurring, registration, deblurring, denoising and interpolation.
The mathematical modeling of SRIR is done from several low resolution images.
Low resolution means the lesser details of image. The CCD discretizes the images
M.K. Kundu et al. (eds.), Advanced Computing, Networking and Informatics - Volume 1, 259
Smart Innovation, Systems and Technologies 27,
DOI: 10.1007/978-3-319-07353-8_31, © Springer International Publishing Switzerland 2014
260 S.R. Dogiwal, Y.S. Shishodia, and A. Upadhyaya
and produces digitized noisy, rotated and blurred images. The images are not sampled
according to the Nyquist criterion by these imaging systems. As a result, image
resolution diminishes due to reduction in the high frequency components.
We give the model for super resolution reconstruction [6] for set of low resolution
images. The problem has been stated by casting a low resolution restoration frame
work are P observed images , each of size M1xM2 which are decimated,
blurred and noisy versions of a single high resolution image Z of size N1xN2 where
N1=qM1 andN2=qM2 After incorporating the blur matrix, and noise vector, the image
formation model is expressed as Ym = Hm DZ + ηm where m =1, …, P
Here D is the decimation matrix of size M1M2xq2M1M2, H is the PSF of size
M1M2xM1M2, ηm is M1M2x1 noise vectors and P is the number of low resolution
observations. Stacking P vector equations from different low resolution images into a
single matrix vector gives
η
. . .
. = . Z + .
. . .
η
3 Lifting Schemes
The wave lifting scheme [10] is a method for decomposing wavelet transform into a
set of stages. The forward lifting wavelet transform divides the data being processed
into an even half and odd half. The forward wavelet transform expressed in the lifting
scheme is shown in Fig. 1.
evenj-1
Sj-1
+mb
split P U
-bbmb
oddj-1 dj-1
The predict step calculates the wavelet function in the wavelet transform. In split
step sort the entries into the even and odd entries. Generally prediction procedure p and
then compute dj-1 = oddj-1 – p(evenj-1). Update U for given entry, the prediction is
made for the next entry has the small values and difference is stored. Updates are Sj-
1[n] = Sj[2n] + dj-1[n]/2 and Sj-1 = evenj-1+ U(dj-1).This is high pass filter. The update
step calculates the scaling function, which results in a smoother version of the data.
This is low pass filter.
The input data in the split step is separated into even and odd elements. The even
elements are kept in to , the first half of an N element array section. The odd
elements in , to ,, the second half of an N element array section. The term LL
refers to low frequency components and LH, HL, HH represents the high frequency
components in the horizontal, vertical and diagonal directions respectively.
The forward step are Update1 (UI): for n=0 to half-1
√ √
S[n]=S[n]+√3S[half+n], Predict (PI): S[half]=S[half]- S[0]- S[halh-1}
For n=1 to half-1
√ √
S[half]=S[half]- S[n]- S [n-1], Update2 (U2): For n=0 to half-2
S[n]=S[n]-S[half+n+1], S[half-1]=S[half-1]-S[half]
Normalize (N): for n=0 to half-1
√ √
S[n]= S[n] and S[n+half]= S[n+half]
√ √
The backward transform, the subtraction and addition operation are interchanged.
transform coefficient first, and transmits the bits so that an increasing refined copy of
the original image can be obtained progressively.
In the decomposition, SPIHT allows three lists of coordinates of coefficients. In the
order, they are the List of Insignificant Pixels (LIP), the List of Significant Pixels
(LSP), and the list of Insignificant Sets (LIS). At a certain threshold, a coefficient is
considered significant if its magnitude is greater than or equal to threshold. The LIP,
LIS, and LSP concept can be defined based on the idea of significance.
This working diagram fig. 3 is the way of doing our work. This is the step by step
procedure that how we implement the images and how we evaluate the working
parameters. In this approach three input low resolution blurred, noise, rotated images
are considered. These images are registered using FFT based algorithms [7]. The
registered low resolution images are decomposed using lifting schemes to specified
levels. At each level we will have one approximation i.e. LL sub band and three detail
sub band, i.e. LH, HL, HH coefficients. Each low resolution image is encoded using
SPHIT scheme. The decomposed images are fused are using the fusion rule and
inverse lifting scheme is applied to obtained fused image. The fused image is decoded
using DSPHIT [8]. Restoration is performing in order to remove the bluer and noise
from the image. We obtained the super resolution image using wavelet based
interpolation.
5 Measurement of Performance
Mean Square Error (MSE) and Peak Signal to Noise Ratio (PSNR) are the two
objective image quality measures used here to measure the performance of super
resolution algorithm. MSE gives the cumulative squared error between the original
image and reconstructed image and PSNR gives the peak error.TheMSE andPSNRof
the reconstructed image are [11]
∑ (, ) (, )
MSE =
Where f (i, j) is the source image and F(I, J) is the reconstructed image, containing
N xN pixels, and for 8bits representations.
PSNR=20 ( )
6 Results
In simulation, the proposed method was tested on several images. Results on two such
images are shown in Fig. 4, Fig. 5 and Table 1. In Fig. 4 and Fig. 5a high resolution
source image (512, 512) is a considered, from source image three noisy, blurred and
rotated mis-registered LR images (256, 256) are created. We have tested using
rotated version of source image by an angle of 15 degrees. It was noted that rotation
was checked for different angles and results coalesce, the noisy image derived from
source image by adding an additive white Gaussian noise with variance 0.05, the
blurred version of source image developed by convolution with an impulse response
of [1 2 2 2 2 2 2 2 2 2 1]. The results were verified for many impulse responses. The
images were Pre-processed in image registration with arguments such that it applies
to only rotational, translational and scaled images. By overlaying with reference
images provided, the orientation of rotated image aligns similar to that of reference
image. In figure (b) super resolution reconstructed images after reducing blur by blind
deconvolution and iterative blind deconvolution respectively and are interpolated to
twice the samples with the increase in the image size(512, 512). Experimental results
show that proposed method produces better results as compared to simple DWT
264 S.R. Dogiwal, Y.S. Shishodia, and A. Upadhyaya
method. For example, MSE and PSNR values are improved from 36.682 and 32.246
to 36.166 and 32.548 respectively forLand Image. Table 1 shows the comparison
between lifting scheme (proposed) based super resolution reconstruction with DWT
based super resolution reconstruction Daubechies (D4) wavelets.
(a) (b)
Fig. 4. (a) Three LR images (noisy, blurred and rotated) and three decomposed images (noisy,
blurred and rotated) and are fused using LWT (b) is the super resolution reconstructed image
(a) (b)
Fig. 5. (a) Three LR images (noisy, blurred and rotated) and three decomposed images (noisy,
blurred and rotated) and are fused using LWT (b) is the super resolution reconstructed image
7 Conclusion
Super resolution images enhance the quality of the multiple lower resolution images
like noisy, blurred and rotated. Super resolution images increase the recognition rate
in various applications like health domain, security applications etc. This paper
proposed a technique for improving quality of lower resolution images namely Super
Resolution Image Reconstruction (SRIR). Proposed method is based on wavelet
lifting scheme. Experiments are performed with various input images like noisy,
blurred and rotated images. Experimental results show that proposed method for
construction of super resolution image performs well for super resolution image
construction. In future, work can be performed feature extraction for LR imagesusing
super resolution images with the implementation of Gabor transform.
References
1. Gonzalez, R.C., Woods, R.C.: Digital Image Processing. Prentice Hall (2002)
2. Jayaraman, S., Esakkirajan, S., Veerakumar, T.: Digital Image Processing.Tata McGraw
Hill Education Pvt. Ltd. (2009)
3. Morales, A., Agili, S.: Implementing the SPIHT Algorithm in MATLAB. In: Proceedings
of ASEE/WFEO International Colloquium (2003)
4. Hu, Y., et al.: Low quality fingerprint image enhancement based on Gabor filter. In: 2nd
International Conference on Advanced Computer Control (2010)
5. Chaudhuri, S.: Super-resolution imaging. The Springer International Series in Engineering
and Computer Science, vol. 632 (2001)
6. Kumar, C.N.R., Ananthashayana, V.K.: Super resolution reconstruction of compressed low
resolution images using wavelet lifting schemes. In: Second International Conference on
Computer and Electrical Engineering, pp. 629–633 (2009)
7. Castro, E.D., Morandi, C.: Registration of translated and rotated images using finite
Fourier transform. IEEE Transactions on Pattern Analysis and Machine Intelligenc 9(5),
700–703 (1987)
8. Malỳ, J., Rajmic, P.: DWT-SPIHT Image Codec Implementation.Department of
telecommunications, Brno University of Technology, Brno, CzechRepublic
9. Solomon, C., Breckon, T.: Fundamentals of Digital Image Processing: A practical
approach with examples in Matlab. Wiley (2011)
10. Jiji, C.V., Joshi, M.V., Chaudhuri, S.: Single-frame image super-resolutionusing learned
wavelet coefficients. International Journal of Imaging Systems andTechnology 14(3), 105–
112 (2004)
11. Jensen, A., la Cour-Harbo, A.: Ripples in mathematics: The discrete wavelets transform.
Springer (2001)
12. Ananth, A.G.: Comparison of SPIHT and Lifting Scheme Image CompressionTechniques
for Satellite Imageries. International Journal of Computer Applications 25(3), 7–12 (2011)
13. Wang, K., Chen, B., Wu, G.: Edge detection from high-resolution remotely sensed
imagery based on Gabor filter in frequency domain. In: 18th International Conference on
Geoinformatics (2010)
266 S.R. Dogiwal, Y.S. Shishodia, and A. Upadhyaya
14. Reddy, B.S., Chatterji, B.N.: An FFT-based technique for translation,rotation, and
scaleinvariant image registration. IEEE Transactionson Image Processing 5(8), 1266–1271
(1996)
15. Zhu, Z., Lu, H., Zhao, Y.: Multi-scale analysis of odd Gabor transform for edge detection.
In: First International Conference on Innovative Computing, Information and Control
(2006)
16. Zhang, D., Jiazhonghe: Face super-resolution reconstruction and recognition from low –
resolution image sequences. In: 2nd International Conference on Computer Engineering
and Technology, pp. 620–624 (2010)
Improved Chan-Vese Image Segmentation Model
Using Delta-Bar-Delta Algorithm
Abstract. The level set based Chan-Vese algorithm primarily uses re-
gion information for successive evolutions of active contours of concern
towards the object of interest and, in the process, aims to minimize
the fitness energy functional associated with. Orthodox gradient descent
methods have been popular in solving such optimization problems but
they suffer from the lacuna of getting stuck in local minima and of-
ten demand a prohibited time to converge. This work presents a Chan-
Vese model with a modified gradient descent search procedure, called
the Delta-Bar-Delta learning algorithm, which helps to achieve reduced
sensitivity for local minima and can achieve increased convergence rate.
Simulation results show that the proposed search algorithm in conjunc-
tion with the Chan-Vese model outperforms traditional gradient descent
and recently proposed other adaptation algorithms in this context.
1 Introduction
M.K. Kundu et al. (eds.), Advanced Computing, Networking and Informatics - Volume 1, 267
Smart Innovation, Systems and Technologies 27,
DOI: 10.1007/978-3-319-07353-8_32, c Springer International Publishing Switzerland 2014
268 D. Mandal, A. Chatterjee, and M. Maitra
local minima. Thus to alleviate the problem of getting stuck to local minima,
this work proposes a modified gradient search technique that guarantees better
and faster convergence of the C-V algorithm towards its global minimum to
get accurate segmentation results compared to the well established heuristic
searches. The present work mainly rests on the efficacy of a variant of GDS,
namely, the Delta-Bar-Delta rule (DBR), proposed by Jacobs [5], which utilizes
a learning parameter update rule, in each iteration, in addition to the weight
update rule. This method bears similarity with the RPROP [3] method, although
the update rules are quite distinct in nature. In this work, we propose a modified
version of DBR algorithm, namely MDBR algorithm, which utilizes a modified
version of DBR algorithm to update learning rate parameters and momentum
method to update weights, to achieve even faster convergence. The proposed
Chan-Vese-MDBR algorithm has been utilized to segment both scalar and vector
valued images and its superiority has been firmly established in comparison
with other popular search methods used for level set based image segmentation
algorithms e.g. the well established basic GDS model and recently proposed
momentum (MOMENTUM), resilient backpropagation (RPROP) and conjugate
gradient (CONJUGATE) based learning methods.
The outline of this paper is as follows: In Section 2 we describe the basic C-V
model. In Section 3 we describe the fundamental search procedure of the GDS
method. In Section 4 our proposed modified GDS algorithm has been presented.
In Section 5, segmentation results of various gray-scale and colour images em-
ploying MDBR have been illustrated. Performance comparisons of MDBR with
some other tools have also been highlighted in this section. Section 6 concludes
the present work.
where, φ is the level set function, H and δ0 [1] are the Heaviside and one-
dimensional Dirac functions and the constants c1 and c2 [1] are, respectively,
the mean intensity values within the regions Ω1 and Ω2 . Hε and δε are the reg-
ularized versions of H and δ0 respectively. The other terms in Eq. (1) denote
Improved C-V Image Segmentation Model Using Delta-Bar-Delta Algorithm 269
the length and area of the curve C [1]. The FCV (c1 , c2 , φ) can be minimized with
respect to φ by solving the gradient flow equation [9].
∂φ ∂FCV (c1 , c2 , φ)
=− (2)
∂t ∂φ
The PDE that needs to be solved to evolve the level set function [1] is given
by:
∂φ 2 2 ∇φ
= δε (φ) −λ1 (u0 − c1 ) + λ2 (u0 − c2 ) + μ.div −ν (3)
∂t |∇φ|
Then the new level set can be obtained by Eq. (4), where step is the length
of the step in the GDS method.
∂FCV (c1 c2 , φ)
φnew = φold − step ∗
∂φ
old
∂φ
= φold + step ∗ (4)
∂t
The C-V model can also be extended to segment vector-valued images [2] and
here also, the level set is evolved by using the basic GDS method.
xk+1 = xk + sk (5)
sk = αk p̂k (6)
The modified GDS method highlighted in our work shows another way in
which the descent direction and length can be optimally calculated for faster
and better convergence than the basic GDS model. Some new methods such
as momentum (MOMENTUM), Resilient Backpropagation (RPROP) [3] and
conjugate gradient (CONJUGATE) [4] have already been proposed which show
significant improvement over the basic GDS algorithm.
270 D. Mandal, A. Chatterjee, and M. Maitra
∂φ
δ̄(n) = (1 − θ) ∗ + θ ∗ δ̄(n − 1) (7)
∂t
where, θ represents the base. The weight update rule employs the conventional
steepest-descent algorithm where each weight is associated with its own learning
rate parameter.
7. Update the level set φ by Eq. (9,10) using the new obtained learning rate
ηij and the momentum ω of the current solution:
∂φ(n)
sn = (1 − ω).η(n) ∗ + ω.sn−1 (9)
∂t
φ(n + 1) = φ(n) + sn (10)
Improved C-V Image Segmentation Model Using Delta-Bar-Delta Algorithm 271
8. The level set may have to be reinitialized locally. This step is optional and,
if employed, is generally repeated after few iterations of the curve evolution.
9. Compare the level set functions (φ(n), φ(n + 1)) . If the solution is not sta-
tionary, then repeat from Step 3, otherwise stop contour evolution and report
the segmentation result.
Fig. 2. The plot of the fitness function with number of iterations n for the image
segmented in Fig. 1
Fig. 4. The plot of the fitness function with number of iterations n for the image
segmented in Fig. 3
Fig. 5. (a)-(e) Grayscale image segmentation and their segmentation contours marked
in red. The DC values, number of iterations and computation time (in seconds) for
each image are: (a) 0.9610, 148, 0.94, (b) 0.9979, 117, 0.61, (c) 0.9995, 144, 0.74, (d)
0.9607, 154, 0.95, and (e) 0.9996, 113, 0.58
Fig. 6. (a)-(e) Colour image segmentation and their segmentation contours marked in
red. The DC values, number of iterations and computation time (in seconds) for each
image are: (a) 0.9995, 168, 1.68, (b) 0.9970, 113, 1.16, (c) 0.9004, 126, 1.19, (d) 0.9652,
225, 2.52, and (e) 0.9644, 239, 2.51
274 D. Mandal, A. Chatterjee, and M. Maitra
Table 1. Performance comparison for the different segmentation methods for the sam-
ple image of Fig. 1
6 Conclusions
The C-V model usually uses the standard GDS method to evolve the active
contour to achieve proper segmentation result. In this work, we have proposed
an advanced adaptation algorithm associated with level sets, named as MDBR
algorithm, a modified version of the Delta-Bar-Delta algorithm, so as to achieve
reduced sensitivity for local optima and higher convergence speed in segmenting
an image. Extensive segmentations of both scalar and vector-valued images show
that our method can consistently achieve lower fitness function value and uses
less adaptation time, compared to other learning algorithms like basic GDS,
RPROP, MOMENTUM and CONJUGATE methods.
References
1. Chan, T.F., Vese, L.A.: Active contours without edges. IEEE Transactions on Image
Processing 10, 266–277 (2001)
2. Chan, T.F., Sandberg, B.Y., Vese, L.A.: Active contours without edges for Vector-
Valued Images. Journal of Visual Communication and Image Representation 11,
130–141 (2000)
3. Andersson, T., Läthén, G., Lenz, R., Borga, M.: Modified Gradient Search for Level
Set Based Image Segmentation. IEEE Transactions on Image Processing 22, 621–630
(2013)
4. Jian-jian, Q., Shi-hui, Y., Ya-Xin, P.: Conjugate gradient algorithm for Chan-Vese
model. Communication on Applied Mathematics and Computation 27, 469–477 (2013)
5. Jacobs, R.A.: Increased rates of convergence through learning rate adaptation. Neu-
ral Networks 1, 295–307 (1988)
6. Dice, L.R.: Measures of the Amount of Ecologic Association Between Species. Ecol-
ogy 26, 297–302 (1945)
7. Osher, S., Sethian, J.A.: Fronts propagating with curvature-dependent speed:
Algorithms based on Hamilton-Jacobi formulations. Journal of Computational
Physics 79, 12–49 (1988)
8. Kass, M., Witkin, A., Terzopoulos, D.: Snakes: Active contour models. International
Journal of Computer Vision 1, 321–331 (1988)
9. Li, C., Huang, R., Ding, Z., Gatenby, J.C., Metaxas, D.N.: A Level Set Method for
Image Segmentation in the Presence of Intensity Inhomogeneities With Application
to MRI. IEEE Transactions on Image Processing 20, 2007–2016 (2011)
Online Template Matching Using Fuzzy Moment
Descriptor
Arup Kumar Sadhu1, Pratyusha Das1, Amit Konar1, and Ramadoss Janarthanan2
1
Electronics &Telecommunication Engineering Deptment,
Jadavpur University, Kolkata, India
2
Computer Science & Engineering Deptment,
TJS Engineering College,
Chennai, India
{arup.kaajal,pratyushargj}@gmail.com,
[email protected], [email protected]
Abstract. In this paper a real-time template matching algorithm has been de-
veloped using Fuzzy (Type-1 Fuzzy Logic) approach. The Fuzzy membership-
distance products, called Fuzzy moment descriptors are estimated using three
common image features, namely edge, shade and mixed range. Fuzzy moment
description matching is used instead of existing matching algorithms to reduce
real-time template matching time. In the proposed matching technique template
matching is done invariant to size, rotation and color of the image. For real time
application the same algorithm is applied on an Arduino based mobile robot
having wireless camera. Camera fetches frames online and sends them to a re-
mote computer for template matching with already stored template in the data-
base using MATLAB. The remote computer sends computed steering and mo-
tor signals to the mobile robot wirelessly, to maintain mobility of the robot. As
a result, the mobile robot follows a particular object using proposed template
matching algorithm in real time.
1 Introduction
Image matching belongs to computer vision and it is the fundamental requirement for
all intelligent vision process [1]. Examples of image matching systems are vision-
based autonomous navigation [2], automated image registration [3], object recogni-
tion [4], image database retrieval [2], 3D scene reconstruction [2]. Technically, the
image matching process generally consists of three components Feature detection,
Feature description, feature selection and optimal matching.
Feature detection [6] helps to detect an object by making local decisions at every
image point. The local decision is about whether there exists a given image feature of a
given type at that point or not. Lately, local features (edges, shades and mixed ranges)
have been extensively employed due to their uniqueness and capability to handle noisy
M.K. Kundu et al. (eds.), Advanced Computing, Networking and Informatics - Volume 1, 275
Smart Innovation, Systems and Technologies 27,
DOI: 10.1007/978-3-319-07353-8_33, © Springer International Publishing Switzerland 2014
276 A.K. Sadhu et al.
images [6]. Feature description [6] represents the detected features into a compact,
robust and stable structure for image matching. Feature selection [6] selects the best set
of features for the particular problem from the pool of features [6].In Optimal matching
[6] similarity between the features detected in the sensed image and those detected in
the reference image is established. Local feature based approaches typically produces a
large number of features needed to match [2], [7]. Such methods have very high time
complexity and are not suitable for real-time application.
To implement the algorithm on real robots, an Arduino [8] development board is
used to develop a mobile robot. Arduino assists to execute steering and motor control
signals generated on the remote computer by Matlab, after locating the target tem-
plate. Control signals are communicated wirelessly (using XBeemodule [8]) from a
remote computer to the mobile robot. To locate the target template a dedicated wire-
less camera is mounted on the roof of the mobile robot. The Camera sends video in-
formation to the remote computer using a Bluetooth connection. Now, on the remote
computer using Matlab template position is found using template matching algorithm
and then steering and motor control signals are sent to the mobile robot using XBee.
In this paper a real-time application of a template matching algorithm has been
shown on a real mobile robot is shown. Template matching, steering and motor con-
trol signals are generated as explained in the previous paragraph. Initially the grabbed
RGB image from wireless camera is converted to gray image in a remote computer.
The gray image has been partitioned into several non-overlapped blocks of equal
dimensions as per our requirement. Each block containing three regions of possible
characteristics, namely, ‘edge’, ‘shade’ and ‘mixed range’ [1] are identified. The sub-
classes of edges based on their slopes in a given block are also estimated. The degree
of fuzzy membership of a given block to contain the edges of typical sub-classes,
shades and mixed range is measured subsequently with the help of a few pre-
estimated image parameters like average gradient, variance and the difference of the
maximum and the minimum of gradients. Fuzzy moment [1], which is formally
known as the membership-distance product of a block b[i,w], with respect to a block
b[j,k], is computed for all 1≤i, w, j, k≤n. A feature called ‘sum of fuzzy moments [1]’
that keeps track of the image characteristics and their relative distances is used as an
image descriptor. The descriptors of an image are compared subsequently with the
same ones of the test image. Euclidean distance is used as a measure to determine the
distance between the image descriptors of two images. For checking that the test im-
age is matched or not we set a threshold and check the Euclidean distance of the im-
age description of the reference image with all test images is greater or less than the
threshold value. If matched, according to the position of the test image the algorithm
sends the steering signal to the motors.
The rest of the paper is separated into the following sections. Section 2 describes
the required tools. Section 3 gives a brief explanation of the fuzzy template matching
algorithm. Section 4 discusses about the experimental results and Conclusion and
future work are proposed in Section 5.
Online Template Matching Using Fuzzy Moment Descriptor 277
2 Required Tools
The proposed methodology needs two basic tools. One is the mobile robot equipped
with a wireless camera. Another one is the Fuzzy Moment Descriptor (FMD)[2]. Both
of them are discussed in details below.
In this section a brief introduction about the mobile robot developed on the Arduino
platform [8] is given below. This mobile robot is developed at Artificial Intelligence
Laboratory, Jadavpur University by our research group. The mobility of the robot is
maintained by a DC motor. Steering system (Ackerman steering [5]) is controlled by
a DC servomotor. Also, it comprises of two ultrasonic sensors [8] to detect obstacle
position, an encoder [8] to measure speed and distance covered by the robot and mag-
netic compass [8] to measure the magnetic field intensity at a point. It can detect sur-
face texture and material using an infrared sensor. A wireless camera is mounted on
the roof of the mobile robot. The camera records video and sends the video to the
remote computer to process using MATLAB for template matching.
In this section template matching using fuzzy moment descriptor [2] is shown. The
definition of ‘edge’, ‘shade’, ‘mixed range’, ‘edge angle’, ‘fuzzy membership distri-
bution’, ‘gradient’, ‘gradient difference’, ‘gradient average’, ‘variance’ are taken from
[6].The definition of Image features such as edge [6], shade [6] and mixed-range [6]
and their membership distribution are used to calculate fuzzy moment distributions.
The variance [6] ( ) of gradient [6] is defined as the arithmetic mean of square of
deviation from mean. It is expressed formally as
=∑ ( ) (1)
Where, G denotes the gradient values of the pixels, and P(G)[1] represents the proba-
bility of the particular gradient G in that block.
Table 1summarizes the list of membership functions used in this paper. The member-
ship values of a block b[j,k] containing edge, shade and mixed-range can be easily
estimated if the parameters and the membership curves are known. The constant
η , ρ , θ , φ , α , β , λ , δ , c, d , e, f , a and b are estimated using Artificial Bee Colony
Optimization Algorithm [9].
278 A.K. Sadhu et al.
σ2 c x2
1 − e −bx e − ax
2
(d + e x 2 + f x 3 )
b>0 a >> 0
c, d , e, f > 0
The evaluated values of the constants to calculate membership of edge, shade and
mixed-range with respect to Variance (σ 2 ) are a=3, b=0.618, c=0.9649, d=0.1576,
e=0.9706, f=0.9572.
The fuzzy production rules, described below, are subsequently used to estimate the
degree of membership of a block b[j,k] to contain edge (shade or mixed range) by
taking into account the effect of all the three parameters together.
3 Tracking Algorithm
In this section the real-time tracking algorithm is shown. The average time to track the
test object is 5.36secin Intel(R) Core(TM) i7-3770 CPU with clock speed of 3.40GHz
using Matlab R2012b. The time complexity of the image matching algorithm is calcu-
lated as per [1]. However, for online object tracking purpose, propagation delays are
added to the time complexity of fuzzy image matching algorithm. Following this al-
gorithm mobile robot tracks the target image. The real-time template matching algo-
rithm is shown below.
Algorithm
Input: Image of the object to be tracked, select a
threshold ε → 0.
Output: Steering and motor signals.
Initialization:
1.Convert the reference image from rgb to gray
2.Calculate the size (MXN) of the reference image.
3.Calculate the gradient magnitude and gradient
Direction of the reference image.
4. Calculate three parameters ‘Gavg’,‘Gdiff’,‘variance’
for the blocks using equations given in Table 1.
5. Use fuzzy production rule for a block containing
edge.
6. Calculate the membership of ‘edge’, ‘shade’ with
respect to Gavg’, ‘Gdiff’, ‘variance’
7. Calculate the membership for edge angles for all
blocks.
8. Calculate the fuzzy sum moment for
Gavg’, ‘Gdiff’, ‘variance’using(7) for all blocks.
Begin:
1. Take the test frame from the video signal.
2.Calculate the fuzzy moment descriptor for the
test image usingstep1tostep8 in the initialization.
3. Calculate the Euclidian distance, Dry between
Reference image,r and test image, y using (8).
4.If the Dry < ε
Then reference image is present in the test
image.
Do
Begin
a. Calculate the x-coordinate reference image on
the current frame.
Online Template Matching Using Fuzzy Moment Descriptor 281
In this paper, we have proposed a real time template matching algorithm for a mobile
robot. The image matching algorithm involves the fuzzy moment descriptors which
makes the algorithm much faster and accurate as compared to the existing algorithms.
The algorithm was also implemented in hardware system (here in Arduino base mo-
bile robot).Though the real-time response is not satisfactory but it is efficient in terms
of accuracy.To improve response time the fuzzy moment descriptors along with ker-
nel projection and it can be implemented.
References
1. Biswas, B., Konar, A., Mukherjee, A.K.: Image matching with fuzzy moment descriptors.
Engineering Applications of Artificial Intelligenc 14(1), 43–49 (2001)
2. Konar, A.: Computational Intellingence: Principles, Techniques, and Applications. Springer
(2005)
3. Hel-Or, Y., Hel-Or, H.: Real-time pattern matching using projection kernels. IEEE Trans-
actions onPattern Analysis and Machine Intelligenc 27(9), 1430–1445 (2005)
4. Baudat, G., Anouar, F.: Feature vector selection and projection using kernels. Neurocom-
puting 55(1), 21–38 (2003)
5. Everett, H.R.: Sensors for mobile robots: theory and application. AK Peters, Ltd. (1995)
6. Gonzalez, R.C., Woods, R.E.: Digital Image Processing. Addison-Wesley (1992)
7. Wang, Q., You, S.: Real-time image matching based on multiple view kernel projection. In:
IEEE Conference on Computer Visio and Pattern Recognition (CVPR 2007). IEEE (2007)
8. [Online] http://www.arduino.cc
9. Basturk. B., Karaboga, D.: An artificial bee colony (ABC) algorithm for numeric function
optimization. In: Proceedings of the IEEE Swarm Intelligence Symposium (2006)
Classification of High Resolution Satellite Images Using
Equivariant Robust Independent Component Analysis
1 Introduction
Satellite Images play a vital role in daily life applications such as classified objects,
road network generation and in area of research such as information extraction.
Images are often contaminated by factors such as mixed classes problem from
multispectral sensors i.e. those having similar spectral characteristics. In absence of
M.K. Kundu et al. (eds.), Advanced Computing, Networking and Informatics - Volume 1, 283
Smart Innovation, Systems and Technologies 27,
DOI: 10.1007/978-3-319-07353-8_34, © Springer International Publishing Switzerland 2014
284 P.P. Singh and R.D. Garg
edge information is become a quite difficult to segregate the objects in images and the
noise occurrence in form of mixed classes also mislead for correct classified region.
Since, the attenuation of noise and preservation of information are usually two con-
tradictory aspects of image processing. Thus, the noise reduction for image enhance-
ment is a necessary step in image processing. Depending upon the spectral resolution
of satellite imagery and the areas of restoration, various noise reduction methods
make several assumptions. Therefore, a single method is not useful for all applica-
tions. Zhang, Cichocki and Amari has been developed the state-space approach to-
wards the problems related to blind source separation and also the recovery of the
original classes by incorporating deconvolution algorithm [2], [6], [14-15]. Nonnega-
tive constraints were imposed to solve the problem of mixed signal. These constraints
have utilized nonnegative factors in sparse matrix form to separate correlated signals
from mixed signals [7]. Least mean square (LMS) type traditional adaptive algorithm
utilized activation function during the learning process. Blind least mean square
(BLMS) is the other version of the LMS algorithm with the calculated blind error,
which has to be tolerating on improving the performance of BLMS [5].
To extract the roof of buildings automatic or semi-automatic, an image processing
algorithm with artificial intelligence methods were proposed [3], [11]. Li et al. (2010)
proposed an approach for automatic building extraction on considering a snake model
with the improved region growing and mutual information [10]. All smaller details
(e.g. chimneys) which hinder proper segmentation were removed from each separate
multispectral band by applying a morphological opening with reconstruction, fol-
lowed by a morphological closing with reconstruction [13].In 2013, a hybrid approach
is proposed for classification of high resolution satellite imagery. It is comprised of
improved marker-controlled watershed transforms and a nonlinear derivative method
to derive an impervious surface from emerging urban areas [12].
While, the above equation depends only on the outputs and its solution will go
ahead to an equivalent approximate of matrix G [1].
Where ψis defined as an estimation of the true Hessian matrix in the region of the
partition. This Hessian approximation is proposed as follows,
This shows the difference only in the diagonal terms of the true Hessian matrix.
Furthermore, on the separation of the classes, the aforesaid difference will become
negligible, due to the Hessian approximation. It keeps true hessian for the Eigen val-
ues with a single module. The true hessian plays an important role to avoid the pos-
sibility of converging towards non-separating solutions. On Substituting (4) into the
iteration (3) and then the following expression in the following algorithm:
Gr ( m +1) = Gr ( m ) − μ ( m ) (C 1,x ,βx S xβ − I )Gr ( m ) , (4)
the separating system is obtained as in the following Eq. (6),
Bs ( m +1) = Bs ( m ) − μ ( m ) (C 1,x ,βx S xβ − I ) Bs ( m ) , (5)
This expression denoted as the CII (Cumulant-based iterative inversion) term,
which is significantly also known as Quasi-Newton method, due to the recursion
property. The saddle point can be found by the results of the comprehensive and cu-
mulant-based iterative inversion (GCII) algorithm,
Bs( m+1) = Bs( m ) − μ ( m ) wβ Cx1+, xβ S xβ − I Bs( m) , (6)
β ∈Ω
Where, ( ) signifies the optimal step-size value to achieve the separation of
classes in the manner of local convergence and ∑ ∈Ω =1
Therefore, the advanced version of the algorithm utilizes many cumulants matrices
to make it extra robust in the sense of deducting the probability, which emphasizes
mainly on a specific cumulant. These deductions come due to the occurrence of a few
bad choices of cumulant sequence whose outcomes in near zero values of the
weighted sum of cumulants for few sources. Additionally, the statistical information
286 P.P. Singh and R.D. Garg
utilizes in best way by using many cumulants matrices in the proposed algorithm.
However, the variance of the cumulant estimation is inversely proportional to the
selected weighting factor.
The proposed ERICA approach is used to classify the existing objects from the HRSI
on resolving the mixed class problem with preserving mutual exclusion among classes.
The mandatory input argument is a HRSI image as converted in Matrix form ‘x’, which
is observed as in the dimension (sources x, samples s). Each column and row of this
matrix corresponds to a different spectral resolution and pixel respectively. The other
optional input arguments are the number of independent components (e), which has to
extract (default e=1) simultaneously vector with the weighting coefficients (w). This
iterative process involvesa sensitivity parameter to stop the convergence.
Fig. 1. A Proposed Framework for satellite image classification using ERICA method
Fig.1 shows a proposed Framework for satellite image classification using ERICA
method. Initially a HRSI is taken as input to complete the scanning process in hori-
zontal and vertical direction with regular sampling to scan one by one pixel. Now, the
proposed ERICA approach is asymptotically in nature to converge the desired saddle
points. These saddle points help to approach for separation of classes with the existing
cumulants based method in the proposed approach.Finally,ERICA approach resolves
the mixed class problem onsegregatingthe different objects in satellite image.
Fig. 2. ERICA based classified results of the emerging suburban area (a-b): (I) Input image; (II)
Result of our proposed method
288 P.P. Singh and R.D. Garg
High Resolution
Iteration Convergence Time elapsed (Sec)Performance Index(PI)
Satellite Images (HRSI)
Emerging ES1 1 - 25 0.57563 - 0.00008 15.14 0.0325
suburban areas ES3 1 - 41 0.20248 - 0.00011 23.10 0.0513
Table 2. Kappa coefficient and overall accuracy of classification of high resolution satellite
images
Overall Accu-
Kappa Coefficient (K)
racy (OA)
High Resolution Satellite
Images (HRSI) Biogeography Based
Proposed Proposed Maximum Likelihood
Optimization
ERICA ERICA Classifier (MLC)
(BBO)
Emerging suburban ES1 95.25% 0.8756 0.7746 0.6726
areas ES3 96.27% 0.8856 0.7836 0.6816
Table 2 shows the comparison of the calculated kappa coefficient (KC) and overall
accuracy (OA) of classification of high resolution satellite images among MLC, BBO
and the Proposed ERICA approaches.
Classification of High Resolution Satellite Images Using ERICA 289
The proposed ERICA method shows the better KC value in comparison to MLC
and BBO methods. It shows the number of iteration to converge the algorithm in the
calculated time and finally shows the performance index in calculating time. Table 3
and Table 4 show the producer's accuracy (PA) and user's accuracy (UA) of classifi-
cation of high resolution satellite images respectively.
References
1. Almeida, L.B., Silva, F.M.: Adaptive Decorrelation. In: Aleksander, I., Taylor, J. (eds.)
Artificial Neural Networks, vol. 2, pp. 149–156. Elsevier Science Publishers (1992)
2. Amari, S.: Natural gradient work efficiently in learning. Neural Comput. 10(2), 251–276
(1998)
3. Benediktsson, J.A., Pesaresi, M., Arnason, K.: Classification and feature extraction for re-
mote sensing images from urban areas based on morphological transformations. IEEE
Transactions on Geoscience and Remote Sensing 41(9), 1940–1949 (2003)
4. Cardoso, J.F., Laheld, B.: Equivariant adaptive source separation. IEEE Trans. Signal
Process. 44(12), 3017–3030 (1996)
5. Choi, S., Cichocki, A., Amari, S.: Flexible independent component analysis. Journal of
VLSI Signal Processing 26(1/2), 25–38 (2000)
290 P.P. Singh and R.D. Garg
6. Cichocki, A., Unbehauen, R., Rummert, E.: Robust learning algorithm for blind separation
of signals. Electronics Letters 30(17), 1386–1387 (1994)
7. Cichocki, A., Georgiev, P.: Blind Source Separation Algorithms with Matrix Constraints.
IEICE Trans. on information and Systems, Special Section on Special issue on Blind Sig-
nal Processing E86-A(1), 522–531 (2003)
8. Cruces, S., Castedo, L., Cichocki, A.: Novel Blind Source Separation Algorithms Using
Cumulants. In: IEEE International Conference on Acoustics, Speech, and Signal
Processing, V, Istanbul, Turkey, pp. 3152–3155 (2000)
9. Kelley, C.T.: Iterative methods for linear and nonlinear equations. In: Frontiers in Applied
Mathematics, vol. 16, pp. 71–78. SIAM, Philadelphia (1995)
10. Li, G., Wan, Y., Chen, C.: Automatic building extraction based on region growing, mutual
information match and snake model. In: Zhu, R., Zhang, Y., Liu, B., Liu, C. (eds.) ICICA
2010. CCIS, vol. 106, pp. 476–483. Springer, Heidelberg (2010)
11. Segl, K., Kaufmann, H.: Detection of small objects from high-resolution panchromatic sa-
tellite imagery based on supervised image segmentation. IEEE Transactions on Geoscience
and Remote Sensing 39(9), 2080–2083 (2001)
12. Singh, P.P., Garg, R.D.: A Hybrid approach for Information Extraction from High Resolu-
tion Satellite Imagery. International Journal of Image and Graphics 13(2), 340007(1-16)
(2013)
13. Yang, J.H., Liu, J., Zhong, J.C.: Anisotropic diffusion with morphological reconstruction
and automatic seeded region growing for colour image segmentation. In: Yu, F. (ed.) Pro-
ceedings of the International Symposium on Information Science and Engineering, Shan-
gai, China, pp. 591–595 (2008)
14. Zhang, L., Amari, S., Cichocki, A.: Natural Gradient Approach to Blind Separation of
Over- and Under-complete Mixtures. In: Proc. of Independent Component Analysis and
Signal Separation, Aussois, France, pp. 455–460 (1999)
15. Zhang, L., Amari, S., Cichocki, A.: Equi-convergence Algorithm for blind separation of
sources with arbitrary distributions. In: Mira, J., Prieto, A.G. (eds.) IWANN 2001. LNCS,
vol. 2085, pp. 826–833. Springer, Heidelberg (2001)
3D Face Recognitionacross Pose Extremities
1 Introduction
All registration algorithms attempt to solve the problem of finding a set of transforma-
tion matrices that will align all the data sets into a single coordinate system. In the
problem of 3D face recognition, three dimensional registrations is an important issue.
For 3D face which is oriented across any angle, for it to be correctly recognized, it has
to be correctly registered. The method which has been used in this paper is that, the
algorithm takes as input a 3D face and returns the registered images for poses from 0
to ± 90˚. We need to specify here that, the proposed method discussed is a rough reg-
istration methodology. We have used Hausdorff distance calculation as a metric, to
test the performance of our registration method which has been discussed in Section
2. Here, we are going to discuss about some of the related works based on 3D face
registration across pose, and discuss the advantages of our method over the existing
methods on 3D face registration.In [1], the authors used TPS plane wrapping and
Procrustes analysis for rough alignment with ICP algorithm for registration, but in
reality ICP is an approach which runs correctly only up to angles of 20 to 30 degrees
on different 3D face databases. In [2], the authors have found out the angles which
M.K. Kundu et al. (eds.), Advanced Computing, Networking and Informatics - Volume 1, 291
Smart Innovation, Systems and Technologies 27,
DOI: 10.1007/978-3-319-07353-8_35, © Springer International Publishing Switzerland 2014
292 P. Bagchi, D. Bhattacharjee, and M. Nasipuri
been used for registration of 3D face images, but the process can only detect pose
angles up to 40˚. In [3], the authors used Hausdorff’s distance as a measure for regis-
tration but no angles of pose corrections were mentioned. In [4], only a part of the
registration work was done i.e. identifying only the nose tip region across different
poses like yaw, roll etc. In [5], facial landmark localization was done using dense
SURF features and also angles ranging from -90˚ to +90˚ were considered. But, the
work composed only of 3 databases namely XM2VTS, the BioID and the Artificial
driver datasets out of which both the XM2VTS and the BioID face database compris-
es of very near frontal faces. In [6], the authors presented a 3D facial recognition al-
gorithm based on the Hausdorff distance metric. But, in addition the authors assumed
that there should be a good initial alignment between probe and template datasets.
The novelty of the present approach, stated in this paper is that, an attempt has been
made to register 3D face scans taken from the Bosphorus databases using Hausdorff’s
distance, for poses ranging from 0 to 90 degrees. The distinctiveness of the present
approach is that, a mathematical model to register a 3D face scan has been set up,
which uses Hausdorff’s distance metric and considers all pose extremities. The paper
has been organized as follows: In Section 2, a description of the proposed method has
been described. Our significant contribution and a comparative analysis been dis-
cussed in Section 3. Experimental results are discussed in Section 4, and finally in the
Section 5, conclusion and future scope have been enlisted.
Choose the basic landmark model against whom the images are to be registered
3D Face Recognition
“Select the 3D-Database Directory”, the user is prompted to select the directory where
the *.bnt files reside.
Choose the basic landmark model against whom the images are to be registered:-
In this example, when the user clicks on the second button, then he is prompted to
choose the basic landmark model as shown in Fig.2(b), against whom the 3D face
model as selected in Step 2.1 and shown in Fig.2(a), isto be registered. The figure
shown in Fig. 2(a) is called a 2.5D range because, it contains depth information at
each of it’s pixel position. The preprocessing stages which have been used to reduce
noise in the 3D face images are subdivided into two parts as follows:-
Fig. 2a. A 2.5D range image rotated Fig. 2b. Registered image corresponding
about 45° about y-axis to Fig. 2a)
2.1.1 Cropping
Fig.3 shows a 2.5D range image of a frontal pose taken from the Bosphorus database
after being cropped by fitting an ellipse to it.
(a) (b)
Fig. 4. Frontal range images from Bosphorus (a, b) after smoothing by weighted median filter
Fig. 5. (a) Frontal pose (b) Rotated about y-axis (c) Rotated about x-axisfrom the Bosphorus
database
The images that the user chooses are cropped and smoothed by a weighted median
filter, the same approach as used for the frontal images. After the images are smoothed,
the final mesh images look like those shown in Fig. 6.
A face recognition system is generally made up of two key parts: registration and
comparison. The accuracy of registration greatly impacts the result of the comparison.
The Hausdorff distance was originally designed to match binary edge images
[8][10]. Modifications to the Hausdorff distance permit it to handle noisy edge positions,
missing edges from registration, and spurious edges from clutter noise [8][9].
Given two sets of facial range data S = {s1, s2, …, sk} and M = {m1, m2, …, mn},
the task of 3D image registration is to find the transformation, i.e. translation, rotation
and scaling, which will optimally align the regions of S with those of M. For a transformation
group G, it can be formalized as an optimization problem:
Input:
________________________________________________________
Rot ← Rotation matrix, i.e. across yaw, pitch or roll
S ← 3D mesh of unregistered image
R ← 3D mesh of frontal image
counter ← 1
Output: S ← 3D mesh of registered image
Do
1.  if counter equals 1 then
2.      dist = Call_hausdorff(S, R);
3.  else
4.      S ← Rotate and translate S by the matrix Rot
5.      Update_dist = Call_hausdorff(S, R);
6.      if ((Update_dist - dist)/dist >= 0.2 || (Update_dist - dist)/dist >= 0.1)
7.          break;
8.  end if
9.  counter = counter + 1;
while(1)
10. Display the final registered 3D image S
11. End of Algorithm
_______________________________________________
Sub Function [dist] = Call_hausdorff(A, B)
    dist = max(calculate(A, B), calculate(B, A))
End Function
________________________________________________
Sub Function [dist] = calculate(A, B)
1.  for k = 1 : size(A)
2.      D = (A(k, :) - B) .* (A(k, :) - B);   % squared differences from A(k, :) to every point of B
3.      d(k) = min(sum(D, 2));                % distance to the nearest point of B
4.  end for
5.  dist = max(d);                            % directed Hausdorff distance from A to B
End Function
________________________________________________________
As is evident from Algorithm 1, the proposed method takes as input a 3D range image
oriented at any pose from 0° to 90°, and rotates and translates it repeatedly until the
relative difference between the original 3D mesh image and the registered mesh image
crosses a threshold of 0.1. We fixed the threshold at 0.1 or 0.2 by making a rough
estimate on the basis of the database used in the method. The function
Call_hausdorff() [11] calculates the maximum distance between the two point clouds.
To quantify the rotational error between the neutral image in frontal pose and the
registered image, we calculate the RMS (Root Mean Square) error between the frontal
and the registered images. The variable counter controls the initial and subsequent
iterations of the loop.
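For concreteness, the sketch below is a small NumPy illustration (not the authors' Matlab code) of the symmetric Hausdorff distance that Call_hausdorff computes between two 3D point clouds; the toy arrays S and R are stand-ins for the unregistered and frontal meshes of Algorithm 1.

```python
import numpy as np

def directed_hausdorff(A, B):
    """Directed Hausdorff distance: for each point of A take its nearest
    neighbour in B, then return the largest of those nearest distances."""
    # Pairwise Euclidean distances between the two point clouds (N x M).
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return d.min(axis=1).max()

def hausdorff(A, B):
    """Symmetric Hausdorff distance, as computed by Call_hausdorff."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S = rng.random((500, 3))              # unregistered 3D mesh vertices (toy data)
    R = S + 0.01 * rng.random((500, 3))   # roughly aligned frontal mesh (toy data)
    print("Hausdorff distance:", hausdorff(S, R))
```

For very large meshes the full pairwise distance matrix would be memory-hungry and the computation would typically be chunked or accelerated with a k-d tree, but the definition is unchanged.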
Input:
________________________________________________________
S ← 3D mesh of registered image
R ← 3D mesh of frontal image
________________________________________________________
1.  Find the coordinates of the nose-tip of R by depth-map analysis [2] and store them in xR and yR.
2.  Find the coordinates of the nose-tip of S by depth-map analysis [2] and store them in xS and yS.
3.  percen_errorx = abs(xS / xR)
4.  percen_errory = abs(yS / yR)
5.  if (percen_errorx < 0.01 && percen_errory < 0.01)
6.      The 3D image is perfectly registered
7.  else
8.      The 3D image is not perfectly registered
9.  end if
________________________________________________________
4. Finally, recognition of the registered images using face normals has also proved to
be very effective.
4 Experimental Results
The registration results of the proposed algorithm have been tested on the Bosphorus
database (refer to Table 2). The experimental setup consisted of a personal machine
with an Intel Core 2 Duo processor running at 2.20 GHz, with 1 GB RAM and a
250 GB hard disk. All experimentation was done in Matlab R2011b.
The registered 3D images are then fed to the recognition system. Classification was
performed on the normal features extracted from the reconstructed face images
using a Multilayer Perceptron.
As is evident from Table 4, the PCA-based approach works well. The rank-2
recognition rate of the proposed method was 93.3% when approach 2 was used, as
shown in Table 4. The rank-2 recognition rate is better than the rank-1 rate (shown in
Table 3) because face normals are less prone to pose variations.
pose, i.e. from 0° to ±90°. The paper addresses the problem of pose with respect
to 3D face images and attempts to find a solution for it. Much of the work based on
the Hausdorff distance has dealt only with feature detection and correction.
Most state-of-the-art works first register the 3D models and then correct the
registration using the Hausdorff distance. Our approach is quite different because we
base the registration technique itself on the Hausdorff distance metric. As future work,
a more robust model that would increase the accuracy of the present registration, and
subsequently of the recognition technique, is planned to be implemented across many
more 3D face databases. We also intend to draw up a more comprehensive analysis of
the registration and recognition results on other databases.
Acknowledgement. The work has been supported by the grant from Department of
Electronics and Information Technology (DEITY), Ministry of Communication &
Information Technology (MCIT), Government of India.
References
1. Mian, A., Bennamoun, M., Owens, R.: Automatic 3D Detection, Normalization and Recognition. Clarendon, Oxford (1892)
2. Pan, G., Wu, Z., Pan, Y.: Automatic 3D Face Verification From Range Data. In: Proceedings of IEEE Conference on Acoustics, Speech, and Signal Processing, pp. 193–196 (2003)
3. Mian, A.S., Bennamoun, M., Owens, R.: An Efficient Multimodal 2D-3D Hybrid Approach to Automatic Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(11), 1927–1943 (2007)
4. Anuar, L.H., Mashohor, S., Mokhtar, M., Adnan, W.A.W.: Nose Tip Region Detection in 3D Facial Model across Large Pose Variation and Facial Expression. International Journal of Computer Science Issues 7(4), 4 (2010)
5. Sangineto, E.: Pose and Expression Independent Facial Landmark Localization Using Dense-SURF and the Hausdorff Distance. IEEE Transactions on Pattern Analysis and Machine Intelligence 35(3), 624–638 (2013)
6. Russ, T.D., Koch, M.W., Little, C.Q.: A 2D Range Hausdorff Approach for 3D Face Recognition. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (2005)
7. Aspert, N., Cruz, D., Ebrahimi, T.: MESH: Measuring errors between surfaces using the Hausdorff distance. In: IEEE Conference on Multimedia and Expo, pp. 705–708 (2002)
8. Radvar-Esfahlan, H., Tahan, S.: Nonrigid geometric metrology using generalized numerical inspection fixtures. Precision Engineering 36(1), 1–9 (2012)
9. Alyuz, N., Gokberk, B., Akarun, L.: Adaptive Registration for Registration Based 3D Registration. In: BeFIT 2012 Workshop (2012)
10. Liu, P., Wang, Y., Huang, D., Zhang, Z., Chen, L.: Learning the Spherical Harmonic Features for 3D Face Recognition. IEEE Transactions on Image Processing 22(3), 914–925 (2013)
11. Dubuisson, M.P., Jain, A.K.: A Modified Hausdorff Distance for Object Matching. In: International Conference on Pattern Recognition, pp. 566–568 (1994)
Indexing Video Database for a CBVCD System
1 Introduction
With the development of different multimedia tools and devices there has been
an enormous growth in the volume of digital video data. Search and retrieval of a desired
piece of video data from a large database has become an important area of research. In
applications like content based video retrieval (CBVR) or content based video
copy detection (CBVCD), a query video is given by the user and is to be
matched with the video data stored in the database. In the case of a CBVR system, we are
mostly interested in finding a few top-order matches, but in the case of CBVCD
the task is to decide whether or not the query is a copied version of any reference
video stored in the database. A further challenge for a CBVCD system is that a
copied video may undergo different photometric and post-production
transformations, making it different from the corresponding reference video. In both
applications, exhaustive video sequence matching is prohibitive; as a result, an
indexing scheme becomes essential. In this work, we propose an indexing scheme and
also present its application for a CBVCD system.
The paper is organized as follows. The brief introduction is followed by the review
of past work on indexing in Section 2. Details of the proposed methodology are
presented in Section 3. Experimental results and concluding remarks are put in
Section 4 and Section 5 respectively.
2 Past Work
A video database can be indexed following different approaches outlined in the work
of Brunelli et al. [1]. Video data may be annotated manually at various levels of
abstraction, ranging from the video title to content details, and the database may be
organized according to the annotations to support indexing. Such manual annotation
is labour intensive and subjective. To overcome these difficulties, an alternative
approach was developed in the form of content-based automated indexing. A framework
for automated video indexing, as discussed in [1], is to segment the video data into
shots and to identify the representative frames of the shots. Subsequently, the
collection of those representatives is considered as an image database, and experience
with content based indexing of image databases can be applied. A similar trend in
video browsing and retrieval still persists, as indicated in [2].
Zhang et al. [3], in their approach have classified the video shots into groups
following a hierarchical partition clustering. Bertini et al. [4] have presented a
browsing system for news video database. A temporal video management system has
been presented in [5] that relies on tree based indexing. Spatio-temporal information
data has been widely used for video retrieval [6,7]. Ren and Singh [8] proposed an
R-string representation to encode spatio-temporal data as a binary string.
Non-parametric motion analysis has also been used for video indexing [9], but the
presence of noise and occlusion may lead to failure [10].
As discussed in [1], [4], the major trend for video indexing is to break the video into units
and to map the video database into an image database by taking the representative frames
of the segmented units. So it is worthwhile to review the techniques for indexing an image
database. A comprehensive study on high dimensional indexing for image retrieval
has been presented in [11]. Tree based schemes are widely used; Zhou et al. [12] have
classified them according to the indexing structure. The K-D tree [13], the M tree and its
variants [14], and the TV tree [15] are a few examples of such indexing structures. Hashing
based techniques [16], [17] are also quite common for indexing an image database,
and the concept of bag of words has been deployed in a number of works [18], [19].
Most video indexing systems have focused on a specific domain and thereby
concentrated on designing the descriptors accordingly. Details of the underlying
indexing structure have not been elaborated. It appears that the techniques used for
image databases are adopted to cater to the core need of indexing the video database.
3 Proposed Methodology
In our early work [20], a content-based video copy detection (CBVCD) system
has been proposed. The system is robust enough against various photometric and
post-production attacks. Each frame in the video data goes through preprocessing to
reduce the effect of attacks and then features are extracted. In order to decide whether
a shot in the query video is a copied version of any of the reference video shots or not,
an exhaustive video sequence matching is carried out following the multivariate Wald-
Wolfowitz hypothesis test. Such a linear search leads to a large number of tests, which is
prohibitive. In this effort, our target is to propose a methodology to reduce the number
of video sequence matchings.
As presented in Section 2, the common approach for video indexing is to map the
problem to image database indexing. The major steps are breaking the video data
into structural units called shots, extracting representative frames of the shots to
form an image database, and finally applying a well-established image database
indexing technique to the image database thus formed. The proposed methodology also
follows the same approach.
In this work, it is assumed that the video data is already segmented into shots
following the technique presented in [21]. A clustering based shot level representative
frame detection scheme is devised and subsequently indexing is done based on
triangle inequality property [22].
Each frame in the shot is first represented by an edge based visual descriptor. A
frame is divided into a fixed number of blocks. Normalized count of edge pixels in
the blocks arranged in raster scan order forms the multi-dimensional feature vector.
We have partitioned the frame into 16 blocks and same is the dimension of the vector.
The features thus computed are neither too local nor global.
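As an illustration of this descriptor, the sketch below computes the 16-dimensional block feature for a grayscale frame; the paper does not name the edge detector or its threshold, so the gradient-magnitude edge map and the value of edge_thresh used here are assumptions.

```python
import numpy as np

def edge_block_descriptor(gray, grid=(4, 4), edge_thresh=30.0):
    """16-dim feature: normalized count of edge pixels per block, raster-scan order.
    The edge detector and threshold are illustrative assumptions."""
    gy, gx = np.gradient(gray.astype(np.float64))
    edges = np.hypot(gx, gy) > edge_thresh          # binary edge map
    h, w = edges.shape
    bh, bw = h // grid[0], w // grid[1]
    feat = []
    for r in range(grid[0]):
        for c in range(grid[1]):
            block = edges[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            feat.append(block.mean())               # fraction of edge pixels in the block
    return np.asarray(feat)                         # length grid[0] * grid[1] = 16
```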
In order to verify the uniformity of the shot content, the similarity of each frame with
respect to the first one in the shot is computed. Let si be the similarity value for the
i-th frame in the shot. If min{si}/max{si} is smaller than a threshold, then the shot is
taken as a non-uniform one and it qualifies for multiple representatives.
Indexing scheme is applied on the image database obtained after collecting the
representative frames of all the shots in the video sequences. Based on the triangle
inequality approach, a value for each database image is assigned which corresponds to
the lower bound on the distance between database image and a query image. On that
value, a threshold is applied to discard the database images which are away from the
query image.
Let I = {i1, i2, . . . , in} and K = {k1, k2, . . . , km} denote the database of images and
the collection of key images chosen from I, respectively. By the triangle inequality,
dist(ip, Q) + dist(Q, kj) ≥ dist(ip, kj) holds, where ip ∈ I, kj ∈ K, Q is a query image and
dist(·,·) stands for a distance measure. The inequality can be rewritten as dist(ip, Q) ≥
|dist(ip, kj) − dist(Q, kj)|. Considering all kj ∈ K, the lower bound on dist(ip, Q) can be
obtained from dist(ip, Q) ≥ maxj(|dist(ip, kj) − dist(Q, kj)|), where j ∈ {1, 2, . . . , m}.
Thus, the major steps are as follows.
– Select the key images K from the image database I
– pre-compute the distance matrix to store dist(ip, kj) for all ip ∈ I w.r.t. all kj ∈ K
All these steps are offline. In order to select the key images, the images in the database
are partitioned into clusters following the k-means clustering algorithm, and the
optimal number of clusters is decided based on the Dunn index. For each cluster, the image
nearest to the cluster centre is taken as the key image. Preparing the distance matrix
requires n × m distance computations. In our experiment, dist(ip, kj) = 1 − bhatt_dist(ip, kj), where
bhatt_dist(ip, kj) provides the similarity between ip and kj.
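A possible sketch of this offline stage is given below. It assumes the frame descriptors are already available, uses scikit-learn's k-means for clustering (the Dunn-index selection of the number of clusters is omitted and k is taken as given), and approximates bhatt_dist by the Bhattacharyya coefficient of L1-normalized, non-negative descriptors; build_index and bhatt_distance are illustrative names, not functions from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def bhatt_distance(p, q):
    """1 - Bhattacharyya coefficient; the descriptors are assumed non-negative
    and are L1-normalized here, mirroring dist(ip, kj) = 1 - bhatt_dist(ip, kj)."""
    p = p / (p.sum() + 1e-12)
    q = q / (q.sum() + 1e-12)
    return 1.0 - float(np.sum(np.sqrt(p * q)))

def build_index(features, dist, k):
    """Offline stage: pick one key image per k-means cluster (the image nearest
    to its cluster centre) and pre-compute dist(i_p, k_j) for every database
    image against every key image."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(features)
    keys = []
    for c in range(k):
        members = np.where(km.labels_ == c)[0]
        to_centre = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
        keys.append(members[np.argmin(to_centre)])   # key image = member closest to centre
    keys = np.asarray(keys)
    # n x m matrix of pre-computed distances to the key images
    D = np.array([[dist(features[i], features[j]) for j in keys]
                  for i in range(len(features))])
    return keys, D
```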
For a CBVCD system, our point of interest is to find the images in the database for
which the lower bound does not exceed a threshold t. For searching against a query
image Q, the online steps are to compute dist(Q, kj) for every kj ∈ K, to obtain the
lower bound on dist(ip, Q) for every database image, and to retain only the images whose
lower bound does not exceed t; the exact cost depends on the actual implementation. In
our work, with respect to each kj a separate structure stores the values dist(ip, kj) in
sorted order. This enables a binary search to retrieve the ip's satisfying the condition
dist(Q, kj) − t ≤ dist(ip, kj) ≤ dist(Q, kj) + t. Thus, the overhead of accessing the index is
also reduced significantly. In our experiment the value of t is empirically determined
and taken as 0.01.
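The online filtering step can then be sketched as follows, reusing the distance matrix and key images produced by the offline sketch above. For brevity it applies the lower-bound test with a vectorized comparison instead of the per-key sorted structures and binary search described in the text; the set of retained images (those whose lower bound does not exceed t) is the same.

```python
import numpy as np

def online_candidates(D, key_feats, q_feat, dist, t=0.01):
    """Keep only database images whose triangle-inequality lower bound on
    dist(i_p, Q), i.e. max_j |dist(i_p, k_j) - dist(Q, k_j)|, does not exceed t."""
    keep = np.ones(D.shape[0], dtype=bool)
    for j in range(D.shape[1]):
        dq = dist(q_feat, key_feats[j])          # dist(Q, k_j), computed online
        keep &= np.abs(D[:, j] - dq) <= t        # must hold for every key image
    return np.where(keep)[0]

# keys, D = build_index(features, bhatt_distance, k)        # offline (previous sketch)
# candidates = online_candidates(D, features[keys], query_descriptor, bhatt_distance)
```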
Now, for video copy detection, representative frames are extracted from the shot to
be tested. Each such representative frame is used as the query image (Q) to retrieve
the desired images from the database. With the shots corresponding to all the
retrieved images, the final hypothesis test is carried out to decide the outcome. Thus
the final test is carried out with only a small subset of the shots in the reference
database making the CBVCD system faster.
4 Experimental Result
In order to carry out the experiment we have worked with a collection of video data
taken mostly from TRECVID 2001 dataset and a few other recordings of news and
sports program. The sequences are segmented into shots following the methodology
presented in [21]. The dataset contains 560 shots. Performances of the proposed
methodology to extract the representative frames of the shot and effectiveness of
indexing are studied through experiments.
All the shots are manually ground-truthed and marked as single or multiple
depending on whether a shot requires a single frame or more than one frame as its
representative. For shots of multiple types, different homogeneous sub-shots are also
noted. Performance of the proposed scheme for extracting the representative frames is
shown in Table 1. It shows that the homogeneous shots are correctly identified as
single, and it fails only for a few cases of shots of type multiple. There may be other
types of errors, namely over-splitting and under-splitting, when the number of extracted
frames is respectively more or less than the expected count. In our experiment, only
five shots are under-split (the number of extracted frames is one less than the desired count)
and over-splitting occurred for three shots.
To study the performance of the indexing system, we have considered the
application of CBVCD. Index based search has been incorporated in the CBVCD
system presented in [20]. In linear method, a query video shot is verified with all the
shots in the reference video database using hypothesis test based sequence matching
technique. But in case of indexed method, following the proposed scheme a subset of
shots whose representative frames match with those of query shot are retrieved. Only
with the retrieved shots final verification is made. Adoption of indexing scheme is
expected to make the process faster but it may affect the detection performance. The
performance of a CBVCD can be measured in terms of correct recognition rate (CR)
and false alarm rate (FR). These are measured as CR = (nc / na) × 100% and
FR = (nf / nq) × 100%, where nc, na, nf and nq are the number of correctly detected copies,
the number of actual copies, the number of false alarms and the number of queries, respectively.
Table 2 shows the performance of linear and indexed method for the CBVCD
system. In case of indexed method the sequence matching is carried out in a reduced
search space. As a result, exclusion of similar sequence is possible, particularly for
the transformed version. But, for a CBVCD system similarity is not synonymous with
copy. It is well reflected in Table 2 that reduction of search space has reduced both
CR and FR. But, CR is reduced marginally and FR (it is more sensitive and should
have low value) is reduced significantly. Even under different photometric (change in
brightness, contrast, corruption by noise) and post-production (insertion of logo,
change in display format – flat file, letter box, pillar) attacks, the indexed method
reduces FR substantially without compromising much on the correct recognition rate.
As the indexing scheme reduces the search space, the copy detection process as a
whole becomes faster. Experiment with 1760 query shots has revealed that on an
average the system becomes five times faster.
5 Conclusion
In this work a simple but novel video indexing scheme is presented. A clustering
based scheme is proposed which dynamically determines the number of
representative frames and extracts them. A triangle inequality based indexing scheme
is adopted for the image database formed by collecting the representative frames for
all the shots. For a shot given as the query, candidate shots are retrieved based on the
proposed methodology. On the retrieved candidate shots, video sequence matching
technique can be applied to fulfill the requirement of a CBVCD system. Experiment
indicates the effectiveness of the proposed methodology in extracting the
representative frames. Applicability of the indexing scheme in CBVCD system is also
well established as it reduces false alarm rate drastically without making much
compromise on correct recognition rate and it speeds up the process significantly.
References
1. Brunelli, R., Mich, O., Modena, C.M.: A survey on the automatic indexing of video data.
Journal of Visual Communication and Image Representation 10, 78–112 (1999)
2. Smeaton, A.F.: Techniques used and open challenges to the analysis, indexing and
retrieval of digital video. Information Systems 32, 545–559 (2007)
3. Zhang, H.J., Wu, J., Zhong, D., Smoliar, S.W.: An integrated system for content based
video retrieval and browsing. Pattern Recognition 30(4), 643–658 (1997)
4. Bertini, M., Bimbo, A.D., Pala, P.: Indexing for reuse of tv news shot. Pattern
Recognition 35, 581–591 (2002)
5. Li, J.Z., Ozsu, M.T., Szafron, D.: Modeling video temporal relationships in an object
database system. In: Proc. SPIE Multimedia Computing and Networking, pp. 80–91
(1997)
6. Pingali, G., Opalach, A., Jean, Y., Carlbom, I.: Instantly indexed multimedia databases of
real world events. IEEE Trans. on Multimedia 4(2), 269–282 (2002)
7. Ren, W., Singh, S., Singh, M., Zhu, Y.S.: State-of-the-art on spatio-temporal information-
based video retrieval. Pattern Recognition 42 (2009)
8. Ren, W., Singh, S.: Video sequence matching with spatio-temporal constraint. In: Intl.
Conf. Pattern Recog., pp. 834–837 (2004)
9. Fablet, R., Bouthemy, P.: Motion recognition using spatio-temporal random walks in
sequence of 2d motion-related measurements. In: Proc. Intl. Conf. on Image Processing,
pp. 652–655 (2001)
10. Fleuret, F., Berclaz, J., Fua, P.: Multicamera people tracking with a probabilistic
occupancy map. IEEE Trans. on PAMI 20(2), 267–282 (2008)
11. Fu Ai, L., Qing Yu, J., Feng He, Y., Guan, T.: High-dimensional indexing technologies for
large scale content-based image retrieval: A review. Journal of Zhejiang University-
SCIENCE C (Computers & Electronics) 14(7), 505–520 (2013)
12. Zhou, L.: Research on local features aggregating and indexing algorithm in large-scale
image retrieval. Master Thesis, Huazhong University of Science and Technology, China
10–15 (2011)
13. Robinson, T.J.: The k-d-b tree: A search structure for large multidimensional dynamic
indexes. In: Proc. ACM SIGMOD Intl. Conf. on Management of Data, pp. 10–18 (1981)
14. Skopal, T., Lokoc, J.: New dynamic construction techniques for m-tree. Journal of Discrete
Algorithm 7(1), 62–77 (2009)
15. Lin, K.I., Jagadish, H.V., Faloutsos, C.: The tv-tree: An index structure for high-
dimensional data. VLDB Journal 3(4), 517–542 (1994)
16. Zhuang, Y., Liu, Y., Wu, F., Zhang, Y., Shao, J.: Hypergraph spectral hashing for
similarity search of social image. In: Proc. ACM Int. Conf. on Multimedia, pp. 1457–1460
(2011)
17. Heo, J.P., Lee, Y., He, J., Chang, S.F., Yoon, S.E.: Spherical hashing. In: Proceedings of
IEEE Conference on Computer Vision and Pattern Recognition, pp. 2957–2964 (2012)
18. Avrithis, Y., Kalantidis, Y.: Approximate gaussian mixtures for large scale vocabularies.
In: Proc. European Conf. on Computer Vision, pp. 15–28 (2012)
19. Jegou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbor search. IEEE
Trans. PAMI 33(1), 117–128 (2011)
20. Dutta, D., Saha, S.K., Chanda, B.: An attack invariant scheme for content-based video
copy detection. Signal Image and Video Processing 7(4), 665–677 (2013)
21. Mohanta, P.P., Saha, S.K., Chanda, B.: A model-based shot boundary detection technique
using frame transition parameters. IEEE Trans. on Multimedia 14(1), 223–233 (2012)
22. Berman, A.P., Shapiro, L.G.: A flexible image database system for content-based retrieval.
Computer Vision and Image Understanding 75(1/2), 175–195 (1999)
23. Ciocca, G., Schettini, R.: An innovative algorithm for key frame extraction in video
summarization. Real Time IP 1, 69–98 (2006)
24. Mohanta, P.P., Saha, S.K., Chanda, B.: A novel technique for size constrained video
storyboard generation using statistical run test and spanning tree. Int. J. Image
Graphics 13(1) (2013)
25. Dunn, J.C.: Well separated clusters and optimal fuzzy partitions. Journal of Cybernetica 4,
95–104 (1974)
Data Dependencies and Normalization of Intuitionistic
Fuzzy Databases
1 Introduction
1.1 An Example
Table 1. Symptoms of a disease quantified using IF database [5] (MV, NMV, HESITATION)
Based on these values, a decision can be formed as to what extent the disease has
spread and what treatment should be given to the patient. The IF-DSS [5] works on
linguistic variables which are quantified and represented through IF databases.
Formally, an Intuitionistic Fuzzy set can be defined as follows. Let a set E be fixed.
An IFS A in E is an object of the form A = {(x, μA(x), νA(x)) | x ∈ E}; when
νA(x) = 1 − μA(x) for all x ∈ E, A is an ordinary fuzzy set. In addition, for each IFS A in E,
πA(x) = 1 − μA(x) − νA(x), where πA(x) is called the degree of indeterminacy of x to A, or
the degree of hesitancy of x to A. The following figure shows the gradual membership
curve of an Intuitionistic Fuzzy set compared with a discrete yes/no (crisp) logic.
Fig. 1. Crisp sets vs Intuitionistic Fuzzy sets: This figure represents a smooth transition from
membership to non-membership degrees
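As a small illustration of the definition above (not code from the paper), the fragment below stores one IFS element as a (membership, non-membership) pair and derives the hesitancy degree π = 1 − μ − ν; the numeric values are illustrative.

```python
from dataclasses import dataclass

@dataclass
class IFSElement:
    """One element of an Intuitionistic Fuzzy set: membership mu and
    non-membership nu; the hesitancy pi = 1 - mu - nu is derived."""
    x: float
    mu: float
    nu: float

    @property
    def pi(self) -> float:
        return 1.0 - self.mu - self.nu

# Illustrative values: the element 4 with membership 0.5 and non-membership 0.3.
e = IFSElement(4, 0.5, 0.3)
print(e.pi)   # ≈ 0.2, the degree of hesitancy of 4
```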
An Intuitionistic fuzzy database [7] is a set of relations where each such relation R is a
subset of the cross product 2^D1 × 2^D2 × · · · × 2^Dm, where 2^Di = P(Di) and P(Di) is the
power set of Di; here R is called the Intuitionistic Fuzzy database relation.
An Intuitionistic fuzzy set [4], [18] is often considered a generalization of a fuzzy set
[6]. Suppose we have an Intuitionistic Fuzzy set S1 = {(4, 0.5, 0.3), (5, 0.8, 0.1), (6, 0.7, 0.2)}
and a Fuzzy set S2 = {(4, 0.5), (5, 0.8), (6, 0.7)}. S1 and S2 represent the same
information, but differently. Set S1 denotes that element '4' belongs to S1 with a degree
of membership of 0.5 and a degree of non-membership of 0.3. In S2, a part of this
information is lost: it only states that the degree of membership is 0.5, and the fact
that the possibility of not belonging to S2 is 0.3 is lost. The degree of non-membership,
i.e. the possibility of non-occurrence of an event, is as important in our daily
decision-making process as the possibility of its occurrence.
While both fuzzy and Intuitionistic Fuzzy logic simulate human logic closely, there
is evidence that Intuitionistic Fuzzy logic is even closer to human decision-making
logic [12], [19].
3 Normalization
4 Related Work
Research [20] on Fuzzy and Intuitionistic Fuzzy sets has gained a lot of
importance over the past few years, and Fuzzy and Intuitionistic Fuzzy DBMSs have also
been in focus [7], [16], [21], [22]. Database schema design is a central consideration
for an RDBMS, and the same is true for Fuzzy/IF RDBMSs.
Work done on Fuzzy dependencies [10], [11] has proved to be helpful for
exploring Fuzzy normalization techniques. An alternative approach [17] to defining
functional dependencies has also been proposed for handling transitive dependencies.
The need for fuzzy dependency preservation led to the need for decomposing Fuzzy
databases [14]. Similarly, Intuitionistic Fuzzy relations and dependencies have been
studied in detail [13], [15], [16]. As a result, Intuitionistic Fuzzy normal forms were
defined [1], [2], [3] on the basis of dependencies among data. These dependencies are
also Intuitionistic Fuzzy in nature as they have a grade of membership and non-
membership associated with them.
We propose a model for the Intuitionistic Fuzzy Normalization, which takes as input a
dataset that may be a combination of both Intuitionistic fuzzy and crisp data. The
function of the (abstract) Intuitionistic Fuzzy interface is to convert the raw dataset
into an IF database - a collection of Intuitionistic Fuzzy and crisp data.
Fig. 2. Proposed Intuitionistic Fuzzy Normalization Model: This figure represents only a part of
our data model. There are other procedures apart from the Normalization procedure that work
on the knowledge / IF rule base.
The core process is the IF Normalization process, which may have more than one
sub-process. The data stores used by the normalization procedure are the
Knowledge base and the IF rule base. The knowledge base contains processed data and
is the actual data warehouse of our data model; e.g., the medical database acts as the
Knowledge base for storing complete information about the patients, including their
personal information as well as their symptoms and other medical history. A set
of constraints is applied on the Knowledge base in order to preserve the integrity
of our data. The Intuitionistic Fuzzy rule base contains every single rule that is applied
by the Normalization procedure on the Knowledge base to achieve the desired results.
We are working on a variant of the above proposed IF normalization model which
helps us improve the Knowledge base as well. Discussing the technicalities of the
same is beyond the scope of this paper. The idea is that the IF rule base is being
updated every time it learns a new pattern or a new query and we end up creating a
machine learning model, which normalizes the knowledge base and at the same time
learns and upgrades its IF rules using some machine learning tools.
One of the most effective techniques for comparing and determining Fuzzy
dependencies is the Equality measure [9]. Equality between any two or more sets can
always be expressed by comparing the elements of each set with another. The equality
measure plays an important role in defining dependencies among the attributes and
tuples.
The equality measure can be of two types:
a) Syntactic
b) Semantic
Syntactic equality is used for crisp comparisons, where the information is complete
and the sets to be compared are absolutely comparable (element by element).
The semantic equality measure, however, is the one relevant to us: it is used to compare
sets on the basis of approximation or similarity.
We define the Intuitionistic fuzzy Equality and use it to define Intuitionistic
Fuzzy dependencies. We denote it as IEQ(t1, t2), where t1, t2 ∈ U and A1 through An
belong to an Intuitionistic Fuzzy relation R.

IEQ(t1, t2) = {(t1, t2), μIEQ(t1, t2), νIEQ(t1, t2)}    (1)

where μIEQ(t1, t2) = min{μIEQ(t1[A1], t2[A1]), …, μIEQ(t1[An], t2[An])}    (2)

νIEQ(t1, t2) = max{νIEQ(t1[A1], t2[A1]), …, νIEQ(t1[An], t2[An])}    (3)
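Assuming the per-attribute equality degrees μIEQ(t1[Ai], t2[Ai]) and νIEQ(t1[Ai], t2[Ai]) have already been produced by some semantic equality measure, the combination in (1)-(3) reduces to a min over memberships and a max over non-memberships, as in this small sketch (the numeric values are illustrative, not data from the paper):

```python
def ieq(attr_mu, attr_nu):
    """Tuple-level IEQ(t1, t2): membership = min of the per-attribute equality
    memberships (eq. 2), non-membership = max of the per-attribute equality
    non-memberships (eq. 3)."""
    return min(attr_mu), max(attr_nu)

# Hypothetical per-attribute equality degrees over three attributes A1..A3.
mu_degrees = [0.8, 0.6, 0.9]
nu_degrees = [0.1, 0.3, 0.05]
print(ieq(mu_degrees, nu_degrees))   # (0.6, 0.3)
```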
1- NF (IF) check
1 NF (IF) states: All the tuples present in the IF database must have a unique
identifier. This condition holds in R, therefore 1NF (IF) holds.
2- NF (IF) check
2-NF (IF) states: The relation must be in 1NF (IF) and every non-key attribute must be
fully dependent on the IF key. Since there is a partial dependence of a non-key
attribute B on the part A of the key, i.e. A →p,q B, R is not in 2-NF (IF). Thus, we need
to decompose the relation R into two relations:
R (A, C, D, E, F) and R1 (A, B) using (I)
3- NF (IF) check
3-NF (IF) states: The relation must be in 2-NF (IF) and all the IF-determinants in the
relation must be keys; if they are not keys, the determined attributes must be part of a
key, i.e. non-transitive IF dependence.
The determinants in R and R1 are all keys, and where they are not keys, the determined
attributes are part of a key. This is the condition that rules out transitivity; hence there is
no need to carry out the transitivity check.
BCNF – IF check
BCNF-IF states: An Intuitionistic Fuzzy relational database is in BCNF-IF iff:
a) it is in 3NF-IF, and
b) for every determinant K ∈ U such that K →a,b X in R, where X ∈ U, K is a
pq-candidate key for relation R.
The dependency A →p,q B does not violate any BCNF (IF) rule, as A is a pq-key for
the relation R1. Similarly, AC →r,s DE is acceptable. But the dependency D →t,u C
creates a violation, as D is not a key attribute. So we need to decompose the relation
R again so as to achieve BCNF (IF). The final decomposition is as follows:
As with crisp databases, BCNF is always stronger than 3NF, and the same holds for
Intuitionistic Fuzzy relational databases. 3NF-IF and BCNF-IF behave the same as long
as there is only one candidate key; once there is more than one overlapping candidate
key, chances of redundancy arise again, and BCNF (IF) takes care of this situation.
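The BCNF-IF condition above lends itself to a simple mechanical check: flag every IF dependency whose determinant contains no pq-candidate key. The sketch below only illustrates that rule; the dependency grades and the candidate keys are assumptions mirroring the worked example, not values given in the paper.

```python
def bcnf_if_violations(dependencies, candidate_keys):
    """Return the IF dependencies X ->(p,q) Y whose determinant X is not a
    superkey, i.e. X contains no pq-candidate key (the BCNF-IF violation rule)."""
    keys = [frozenset(k) for k in candidate_keys]
    return [(X, Y, p, q) for (X, Y, p, q) in dependencies
            if not any(key <= frozenset(X) for key in keys)]

# Dependencies from the worked example; the grades are placeholders.
deps = [({"A"}, {"B"}, 0.8, 0.1),            # A ->(p,q) B
        ({"A", "C"}, {"D", "E"}, 0.7, 0.2),  # AC ->(r,s) DE
        ({"D"}, {"C"}, 0.9, 0.1)]            # D ->(t,u) C
keys = [{"A", "C"}, {"A"}]                   # assumed pq-candidate keys
print(bcnf_if_violations(deps, keys))        # only D ->(t,u) C is flagged
```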
7 Conclusion
Normalization is a complex task in the case of Intuitionistic Fuzzy databases. The data is
not in a standard or conventional form, and handling functional dependencies becomes
even more difficult. There are challenges that we overcome in this procedure, e.g.
isolating the Intuitionistic Fuzzy part from the crisp one, searching for a pq-candidate
key, identifying dependencies, transitivity, etc.
Further, violations of constraints in the IF relations need to be figured out. This
helps us optimize the knowledge base in use, so that the underlying interfaces and
procedures find it easy and quick to gather data and apply relevant inference rules on it.
Since knowledge bases are huge and complex entities, we need pre-defined constraints
and normalization techniques to keep our data maintained in a consistent form.
References
1. Alam, A., Ahmad, S., Biswas, R.: Normalization of Intuitionistic Fuzzy Relational
Database. NIFS 10(1), 1–6 (2004)
2. Hussain, S., Alam, A., Biswas, R.: Normalization of Intuitionistic Fuzzy Relational
Database into second normal form- 2NF (IF). International J. of Math. Sci. & Engg.
Appls. 3(III), 87–96 (2009)
3. Hussain, S., Alam, A.: Normalization of Intuitionistic Fuzzy Relational Database into third
normal form- 3NF (IF). International J. of Math. Sci. & Engg. Appls. 4(I), 151–157 (2010)
4. Atanassov, K.T.: Intuitionistic Fuzzy Sets. Physica-Verlag (1999)
5. Shora, A.R., Alam, M.A., Siddiqui, T.: Knowledge-driven Intuitionistic Fuzzy Decision
Support for finding out the causes of Obesity. International Journal on Computer Science
and Engineering 4(3) (2012)
6. Buckles, B.P., Petry, F.E.: A fuzzy representation of data for relational databases. Fuzzy
Sets and Systems 7(3), 213–226 (1982)
7. De, S.K., Biswas, R., Roy, A.R.: Intuitionistic Fuzzy Database. In: Second International
Conference on IFS, pp. 34–41 (1998)
8. Codd, E.F.: Recent Investigations into Relational Data Base Systems. IBM Research
Report RJ1385 (1974)
9. Raju, K.V.S.V.N., Majumdar, A.K.: Fuzzy Functional dependencies and lossless join
decomposition of Fuzzy Relational database systems. ACM Transactions on Database
Systems 13(2), 129–166 (1988)
10. Jyothi, S., Babu, M.S.: Multivalued dependencies in Fuzzy relational databases and loss
less join decomposition. Fuzzy Sets and Systems 88, 315–332 (1997)
11. Vucetic, M., Hudecb, M., Vujosevica, M.: A new method for computing fuzzy functional
dependencies in relational database systems. Expert Systems with Application 40(7),
2738–2745 (2013)
12. Shora, A.R., Alam, M.A., Biswas, R.: A comparative Study of Fuzzy and Intuitionistic
Fuzzy Techniques in a Knowledge based Decision Support. International Journal of
Computer Applications 53(7) (2012)
13. Kumar, D.A., Al-adhaileh, M.H., Biswas, R.: A Method of Intuitionistic Fuzzy Functional
Dependencies in Relational Databases. European Journal of Scientific Research 29(3),
415–425 (2009)
14. Shenoi, S., Melton, A., Fan, L.T.: Functional dependencies and normal forms in the fuzzy
relational database model. Information Sciences 60(1-2), 1–28 (1992)
15. Deschrijver, G., Kerre, E.E.: On the composition of Intuitionistic fuzzy relations. Fuzzy
Sets and Systems 136, 333–361 (2003)
16. Umano, M.: FREEDOM-O. In: Gupta, M.M., Sanchez, E. (eds.) A Fuzzy Database
System. Fuzzy Information and Decision Processes, pp. 339–349. North Holland,
Amsterdam (1982)
17. Hamouz, S.A., Biswas, R.: Fuzzy Functional Dependencies in Relational Databases.
International Journal of Computational Cognition 4(1) (2006)
18. Atanassov, K.T.: Intuitionistic Fuzzy Sets Past, Present and Future. In: 3rd Conference of
the European Society for Fuzzy Logic and Technology (2003)
19. Boran, F.E.: An integrated intuitionistic fuzzy multicriteria decision making method for
facility location selection. Mathematical and Computational Applications 16(2), 487–496
(2011)
20. Beaubouefa, T., Petry, E.F.: Uncertainty modeling for database design using Intuitionistic
and rough set theory. Journal of Intelligent & Fuzzy Systems 20 (2009)
21. Yana, L., Ma, Z.M.: Comparison of entity with fuzzy data types in fuzzy object-oriented
databases. Integrated Computer-Aided Engineering, 199–212 (2012)
22. Bosc, P., Pivert, O.: Fuzzy Queries Against Regular and Fuzzy Databases. Flexible Query
Answering Systems, 187–208 (1997)
Fuzzy Logic Based Implementation
for Forest Fire Detection
Using Wireless Sensor Network
1 Introduction
Forest fire is a major problem owing to the destruction of forests and, more generally,
of wooded nature reserves [2]. A WSN is a group of specialized transducers and
associated controller units which make up a system of wireless sensor nodes
deployed in some geographical area, that autonomously monitor physical or
environmental conditions and send the collected information to a main controller
so that appropriate action can be taken [1]. One major constraint of sensor nodes is
limited battery power. The proposed system is self-sufficient in maintaining a regular
power supply, which is provided by solar power systems.
This work proposes a real-time forest fire detection method using a fuzzy logic
based implementation for a Wireless Sensor Network (WSN). In this proposed
model, a number of sensor nodes are densely deployed in a forest. These sensor
nodes collect the variations of temperature, humidity, light intensity and CO2
density in their vicinity throughout their lifetime and send them to the nearby cluster
head, which forwards the aggregated data to the central sink node. The use of fuzzy logic
makes it possible to take real-time decisions without having specific information about the
event [10]. Since this technique deals with linguistic values of the controlling variables
in a natural way instead of logic variables, it is highly suitable for applications
with uncertainties. The sensed data is fed to the fuzzy inference engine to infer
the possibility of forest fire.
The rest of the paper is organised as follows. Section 2 discusses the related
works done so far in this area. A short discussion on the problem statement is
given in Section 3. In Section 4 our proposed solution for forest fire detection
is explained in detail. Section 5 gives a brief overview of network topology for
energy efficient WSN. Section 6 evaluates the performance of the proposed model
followed by a conclusion in Section 7.
2 Related Works
In [1], the authors discussed the general causes of the increase in frequency of forest
fires and described the architecture of a wireless sensor network along with a scheme
for data collection for real-time forest fire detection. The authors proposed two
algorithms in [2] for forest fire detection. The proposed algorithms are based
on information fusion techniques. The first algorithm uses a threshold method
and the other uses Dempster- Shafer theory. Both algorithms reported false
positives when the motes were exposed to direct sunlight. However, if the motes
are covered to avoid direct sunlight exposure, the number of false positives may
be reduced.
In [3], the authors have presented how to prevent forest fires using a wireless
sensor network through the design of a system for monitoring temperature and
humidity in the environment; the respective values of temperature and humidity in
the presence of fire are required. The authors in [4] estimated the total carbon release and
carbon dioxide, carbon monoxide, and methane emissions through the analysis
of fire statistics from North America and satellite data from Russia.
The authors in [5] proposed an improved approach to track forest fires and to
predict the spread direction with WSNs using mobile agents.
In [6], the authors have discussed the causes of environmental degradation
in the presence of forest and rural fires. The authors have developed a multi
sensor scheme which detects a fire, and sends a sensor alarm through the wire-
less network to a central server. In [7], Zhang et al. discussed a wireless sensor
network paradigm based on a ZigBee technique. Environmental parameters such
as temperature, humidity, light intensity in the forest region monitored in real
time. The authors in [9] have introduced a wireless sensor network paradigm for
real-time forest fire detection where neural network is applied for data process-
ing. Compared with the the traditional satellite-based detection approach the
wireless sensor network can detect and forecast forest fire more promptly. In [11],
3 Problem Statement
This paper aims to design a system for early detection of forest fires. It is very
important to predict the direction in which a forest fire is going to spread, as it can
spread quickly [4]. Availability of air, heat and fuel are the main parameters that
initiate a fire in a forest. The moisture content of the combustible material plays an
important role in the assessment and prediction of forest fire; it is related to the relative
humidity of the atmosphere, the wind, the air temperature and similar factors, while
relative humidity affects water evaporation. The physical properties of the combustible
materials are affected indirectly by the air temperature. Temperature, humidity and light
intensity also vary with time and weather. Hence, in this work we have used the
parameters temperature, humidity, light intensity, CO2 density and time for forest fire
detection using a fuzzy logic based implementation.
The proposed solution for forest fire detection uses the concept of Fuzzy Logic
System (FLS). The necessary steps used for detecting the probability of fire using
FLS are fuzzification, fuzzy rules, fuzzy inference system and defuzzification
process [8]. The steps are as follows:
Step 1: In the first step each crisp input is transformed into a fuzzy input; this is
called fuzzification. The inputs measured by the sensor nodes in our proposed solution
are the crisp inputs temperature, humidity, light intensity, CO2 density and time, and
for each of them a specific range is defined. The main reason for choosing these input
variables is that the actual heat, moisture (i.e. humidity), light intensity and smoke
(i.e. CO2 density) are the main parameters in a forecast of forest fire. Each crisp input
passed into the control system for processing has its own group of membership functions.
The group of membership functions for each FIS (Fuzzy Inference System)
variable, i.e. for the input fuzzy variables and the output fuzzy variables, is defined
in Table 1. Each of these linguistic values represents one fuzzy set and can be
defined by a membership function (MF).
Temperature, relative humidity and CO2 density are the most important
weather factors that can be used to predict and monitor fire behaviour. The danger
from firebrands is lower if the ambient air temperature is below 15 °C.
The corresponding temperature values in a forest in the presence of fire are 100 °C
(wood is dried), 230 °C (flammable gases are released), 380 °C (smouldering) and
590 °C (ignition) [12]. So we take the upper range of the temperature value as 590 °C. There
is a medium probability of a spot fire occurring when the relative humidity is
below 40 percent. Only if the amount of CO2 in the air of a forest is above 500 ppm
can we predict that a fire has ignited [4]. The minimum and extreme
light intensities in the forest when a fire occurs are 500 lux and 10000 lux [12]. The
variation in the values of these input variables at different times of day, as well
as seasonal variation, can affect the detection process. For example, the normal
temperature value at noon is much higher than at night, so if the sensors give the same
temperature value in the day and at night, the night reading provides stronger evidence
of the occurrence of fire. Similar situations arise for light intensity and CO2 density.
Likewise, the different values of temperature in different seasons such as summer,
winter and spring can affect the inference system, but seasons are not included in
this work.
The proposed Mamdani fuzzy inference system for forest fire detection, including
all input and output variables, is shown in Fig. 1. The proposed membership
functions for all the input and output linguistic variables are shown in Fig. 2 to Fig. 7.
From Fig. 2, it is noted that if the temperature is 110 °C then the membership
grade is calculated from both the high and very high membership functions, and
the rules associated with these two membership functions determine the output.
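To make the fuzzification concrete, the sketch below evaluates two hypothetical trapezoidal membership functions for the temperature variable at 110 °C; the breakpoints are illustrative assumptions (only the overall 0-590 °C range comes from the text), and the closing comment notes how a Mamdani system would use the resulting grades.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function with feet at a, d and shoulders at b, c."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# Hypothetical membership functions for temperature (range 0-590 deg C from the text);
# the breakpoints themselves are assumptions, not the authors' values.
def temp_high(t):
    return trapezoid(t, 80, 120, 200, 260)

def temp_very_high(t):
    return trapezoid(t, 100, 250, 589, 590)

t = 110.0
print("high:", temp_high(t), "very high:", round(temp_very_high(t), 3))
# Both grades are non-zero at 110 deg C, so rules referring to either fuzzy set fire;
# a Mamdani system would combine rule antecedents with min (AND), aggregate the
# rule consequents with max, and defuzzify the aggregated output.
```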
Step 2: In this step fuzzy reasoning is used to map the input space to the output
space. This process is called the fuzzy inference process. The fuzzy inference process
involves membership functions, logical operations and If-Then rules. Here
we use the Mamdani fuzzy inference method [8] for decision making. A set of fuzzy
rules is defined, which is a collection of linguistic statements that describe how
the fuzzy inference system should make a decision regarding classifying an input
or controlling an output. A series of IF-Operator-THEN rules evaluates the inputs.
For example, we may assume that temperature and light intensity are very high, humidity
is very low, CO2 density is high and time is before noon; the output obtained
for these inputs is very high, i.e. the probability of fire is very high. Consequently
new rules can be produced. In our proposed
network model, we adopt the topology control algorithm of [14], which can be used in an efficient way to send the different
parameters sensed by the sensors to the base station through cluster heads and
connector nodes. This communication model extends the life of the network by
saving the energy of the sensors at a particular instant by using only few sensors
that are active for communication.
7 Conclusion
This paper investigates the use of fuzzy logic in determining the probability
of forest fire using multiple sensors. Some of the vagueness related to the different
environmental conditions can easily be handled by the proposed method. It
gives accurate and robust results under variations of temperature, humidity, etc., as
all the input variables are defined from real-time data.
References
1. Abraham, A., Rushil, K.K., Ruchit, M.S., Ashwini, G., Naik, V.U., G., N.K.: Energy
Efficient Detection of Forest Fires Using Wireless Sensor Networks. In: Proceedings
of International Conference on Wireless Networks (ICWN 2012), vol. 49 (2012)
2. Diaz-Ramirez, A., Tafoya, L.A., Atempa, J.A., Mejia-Alvarez, P.: Wireless Sensor
Networks and Fusion Information Methods for Forest Fire Detection. In: Proceed-
ings of 2012 Iberoamerican Conference on Electronics Engineering and Computer
Science, pp. 69–79 (2012)
3. Lozano, C., Rodriguez, O.: Design of Forest Fire Early Detection System Using
Wireless Sensor Networks. The Online Journal on Electronics and Electrical Engi-
neering 3(2), 402–405
4. Kasischke, E.S., Bruhwiler, L.P.: Emissions of carbon dioxide, carbon monox-
ide, and methane from boreal forest fires in 1998. Journal of Geophysical Re-
search 108(D1) (2003)
5. Vukasinovic, I., Rakocevic, G.: An improved approach to track forest fires and
to predict the spread direction with WSNs using mobile agents. In: International
Convention MIPRO 2012, pp. 262–264 (2012)
6. Lloret, J., Garcia, M., Bri, D., Sendra, S.: A Wireless Sensor Network Deployment
for Rural and Forest Fire Detection and Verification. In: Proceedings of Interna-
tional Conference on IEEE Sensors, pp. 8722–8747 (2009)
7. Zhang, J., Li, W., Han, N., Kan, J.: Forest fire detection system based on a ZigBee
wireless sensor network. Journal of Frontiers in China 3(3), 359–374 (2008)
8. Jang, J.-S.R., Sun, C.-T., Mizutani, E.: Neuro-fuzzy and soft computing. PHI
Learning
9. Yu, L., Wang, N., Meng, X.: Real-time Forest Fire Detection with Wireless Sensor
Networks. In: Proceedings of International Conference on Wireless Communica-
tions, Networking and Mobile Computing, vol. 2, pp. 1214–1217 (2005)
10. Sridhar, P., Madni, A.M., Jamshidi, M.: Hierarchical Aggregation and Intelligent
Monitoring and Control in Fault-Tolerant Wireless Sensor Networks. International
Journal of IEEE Systems 1(1), 38–54 (2007)
11. Bolourchi, P., Uysal, S.: Forest Fire Detection in Wireless Sensor Network Using
Fuzzy Logic. In: Fifth International Conference on Computational Intelligence, pp.
83–87 (2013)
12. Bradstock, R.A., Hammill, K.A., Collins, L., Price, O.: Effects of weather, fuel
and terrain on fire severity in topographically diverse landscapes of south-eastern
Australia. Landscape Ecology 25, 607–619 (2010)
13. Bhowmik, S., Giri, C.: Energy Efficient Fuzzy Clustering in Wireless Sensor Net-
work. In: Proceedings of Ninth International Conference on Wireless Communica-
tion & Sensor Networks (2013)
14. Bhowmik, S., Mitra, D., Giri, C.: K-Fault Tolerant Topology Control in Wireless
Sensor Network. In: Proceedings of International Symposium on Intelligent Infor-
matics (2013)
Fuzzy Connectedness Based Segmentation
of Fetal Heart from Clinical Ultrasound Images
1 Introduction
there exists a significant need for a computerized algorithm to detect the fetal
heart structures from ultrasound image sequences.
Owing to the favorable merit of the non-invasive nature of the ultrasound modality,
it is most commonly used to infer the health status of the growing fetus in the
mother's womb. The inherent demerits of fetal heart ultrasound images which
complicate diagnostic interpretation are the low signal-to-noise ratio, anatomic
complexity and the dynamic nature of the fetal heart. Speckle noise is inherent in
clinical ultrasound images; it makes it very difficult to interpret fine diagnostic
facets and limits the detectability of low-contrast lesions by approximately a
factor of eight [10]. The biological structures in ultrasound images seem to appear
with missing boundaries because of poor contrast. Because of these demerits,
the application of computerized algorithms to ultrasound image analysis is difficult.
As object boundaries are the most important aspects for the interpretation of images,
delineation of the specific objects is utterly useful for realizing a computerized tool
that automatically recognizes specific ultrasound fetal heart structures.
The prominent objective of an ultrasound speckle suppression technique is to
preserve the image edge structures while suppressing the undesired visual effects
of the speckle pattern. Making suitable assumptions about the statistical behavior
of speckle noise in ultrasound images helps in contriving an effective speckle
suppression technique. Numerous statistical models have been reported in the
literature to model the speckle pattern; one such statistical noise model is the
Nakagami-Rayleigh joint probability density function. At this juncture, it is optimal
to use a despeckling algorithm which follows the assumption of modelling the
speckle pattern in terms of the Nakagami-Rayleigh joint probability density function,
namely the Probabilistic Patch Based weighted Maximum Likelihood Estimation
(PPBMLE) proposed by Charles et al. [4]. Maximum likelihood estimation is a
parameter estimation technique of substantial importance in statistical estimation.
This patch based filtering is based on the nonlocal means method, as pixel values
far apart can be averaged together with patch-similarity based weights to generate
the best estimate of the noise-free pixel.
Though a multitude of segmentation techniques has been reported in the literature,
detecting the cardiac structures from fetal ultrasound images remains difficult.
Lassige et al. [8] utilized the level set segmentation technique to delineate
the septal defects present in fetal heart chambers. An enhanced version of the level set
segmentation algorithm utilizing shape prior information was used by Dindoyal et
al. [9] to segment and extract the fetal heart chambers. The proposed method
involves the investigation of Fuzzy connectedness based image segmentation for
extracting the fetal heart structures. This method is basically a region growing
method of object segmentation [7] that considers the pixel properties of spatial
relationship and intensity similarity, and it requires the user's involvement to select
seed points. The segmentation result of this method is a fuzzy connectedness map
defined by the strength of hanging togetherness of pixels, from which the appropriate
object can be extracted with reference to the seed points. This paper is organized as
follows. Section 2 describes the methodology
used in the proposed work with PPBMLE based preprocessing and FC based
image segmentation scheme. Section 3 presents the quantitative validation for
FC based segmentation scheme with and without PPBMLE preprocessing tech-
nique. Section 4 describes the results of the proposed work implemented in both
phantom ultrasound image and clinical ultrasound images.
2 Methodology
The proposed work involves the extraction of Region of interest (ROI), use of
PPBMLE based despeckling to remove the speckle noise and Fuzzy connected-
ness based image segmentation to detect the structure of fetal heart from four
chamber view of ultrasound plane. The scheme of the proposed work is illustrated
in the flowchart shown in Fig.1.
The MLE framework given in equation (3), generalized with appropriate PPB weights,
defines the Weighted Maximum Likelihood Estimation (WMLE) method. It
intends to reduce the mean square error of the estimate and is given by

R̂s(WMLE) = argmax_Rs Σ_{t ∈ Ω} w(s, t) log p(vt | Rs)    (4)
The similarity measure applied for estimating the true intensity from noisy pixel
intensities is given by the Probabilistic Patch Based (PPB) weights. The weight is
expressed through two image patches Δs and Δt, assuming with high probability
equal intensity values for their centre pixels. The weight defined by patch based
similarity is denoted by

w(s, t)(PPB) ≅ p(R*_Δs = R*_Δt | v)^(1/h)    (5)

where R*_Δs and R*_Δt denote the sub-images extracted from the parameter image R*
in the corresponding windows Δs and Δt, and h is a scalar parameter that controls the
size of the weights.
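A heavily simplified sketch of patch-based weighted estimation in the spirit of (4)-(5) is shown below: the weight of each candidate pixel decays with the squared difference between its patch and the reference patch (a Gaussian surrogate for the Nakagami-Rayleigh likelihood of the actual PPBMLE filter), with the scalar h controlling the decay. It is an illustration, not the authors' implementation.

```python
import numpy as np

def patch_weighted_estimate(img, s, patch=3, search=7, h=0.1):
    """Estimate the noise-free value at pixel s as a weighted average of pixels
    in a search window, with weights driven by patch similarity (a simplified,
    Gaussian-kernel surrogate for the PPB weights of equation (5))."""
    r, c = s
    p = patch // 2
    pad = np.pad(img.astype(np.float64), p + search, mode="reflect")
    r0, c0 = r + p + search, c + p + search
    ref = pad[r0 - p:r0 + p + 1, c0 - p:c0 + p + 1]        # reference patch around s
    num, den = 0.0, 0.0
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            rt, ct = r0 + dr, c0 + dc
            cand = pad[rt - p:rt + p + 1, ct - p:ct + p + 1]
            w = np.exp(-np.sum((ref - cand) ** 2) / h)     # patch-similarity weight
            num += w * pad[rt, ct]
            den += w
    return num / den
```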
Fig. 2. (a,b,c,d) Fetal heart ROI extracted from clinical raw ultrasound images; (e,f,g,h)
Despeckled images using PPBMLE; (i,j,k,l) Selection of seed points to aid segmentation;
(m,n,o,p) Fuzzy connectedness based segmentation resulting in delineation of fetal
heart structures
an object of interest which needs to be delineated. Here, the user needs to select
the seed spel from the fetal heart structure.
Considering a sample image consisting of numerous fuzzy subsets, these subsets are
characterized by the strength assigned to pixel elements in the range of 0 to 1,
expressed as fuzzy membership values or degrees of membership μF(s, t), with the
maximum gray scale intensity value being k = 255. It can be denoted by

$$\mu_F(s,t) = \frac{C_F(s,t)}{k} \qquad(6)$$
The concept of fuzzy connectedness effectively captures the fuzzy hanging togetherness
of objects in the image [7]. It is realized by the property of hanging togetherness
among spels in a scene C, called local hanging togetherness, and it can also be
extended over the entire image to define global hanging togetherness.
Fig. 3. (a) Phantom ultrasound image with speckle noise; (b) PPBMLE despeckling
result; (c) Seed point selection; (d) Fuzzy connectedness based segmented image
Both are defined by spel properties. For instance, an object comprises fuzzy spels of
homogeneous nature that remain connected with adjacent fuzzy spels, whereas boundaries
and edges are characterized by fuzzy spels of heterogeneous nature that remain
unconnected with the adjacent fuzzy spels. The concept of fuzzy affinity relations is
utilized to define global hanging togetherness, called fuzzy connectedness.
Fig. 4. (a) Segmented Phantom ultrasound image without the use of PPBMLE based
despeckling; (b,c) Segmented clinical ultrasound fetal heart images without the use of
PPBMLE based despeckling
$$\mu_\alpha(s,t)=\begin{cases}\dfrac{1}{1+k_1\sqrt{\sum_{i=1}^{2}(s_i-t_i)^2}}, & \text{if }\sum_{i=1}^{2}(s_i-t_i)^2 \le 2\\[4pt] 0, & \text{otherwise}\end{cases}\qquad(8)$$

$$\mu_\phi(s,t)=e^{-\frac{1}{2}\left[\left(\frac{1}{2}(f(s)+f(t))-m_i\right)/v_i\right]^2}\qquad(9)$$

$$\mu_\psi(s,t)=e^{-\frac{1}{2}\left[\left(\frac{1}{2}(f(s)+f(t))-m_g\right)/v_g\right]^2}\qquad(10)$$
The value of affinity nears 0 for non-adjacent spels and ranges from 0 to 1
for adjacent spels. Fuzzy connectivity details among spels are represented as
connectivity map where the object of interest is extracted by thresholding the
image at specified threshold. The computation of fuzzy connectivity image is
carried out by initiating from the selected one or more seed points within the
object of interest. The statistical measures like mean and standard deviation of
the selected seed points are computed separately for extracting the features of
the specified object of interest. The fuzzy k-component based object extraction
with specified threshold value is defined as
$$\mu_{k_\theta}(s,t)=\begin{cases}1, & \text{if }\mu_k(s,t)\ge\theta,\ \theta\in[0,1]\\ 0, & \text{otherwise}\end{cases}\qquad(11)$$
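A minimal Python sketch of seed-driven fuzzy connectedness is given below for illustration. It propagates, from the seed spels, the strength of the best path (the maximum over paths of the minimum affinity along the path) and then the connectivity map can be thresholded as in equation (11). The affinity uses only the intensity (Gaussian) component of equations (8)-(10), and the 4-adjacency and helper names are assumptions rather than the authors' implementation.

```python
import heapq
import numpy as np

def affinity(img, s, t, mean, std):
    """Illustrative fuzzy affinity between adjacent spels s and t: only the
    intensity (Gaussian) component of equations (8)-(10) is used, with the
    adjacency factor taken as 1 for 4-neighbours."""
    avg = 0.5 * (float(img[s]) + float(img[t]))
    return float(np.exp(-0.5 * ((avg - mean) / (std + 1e-9)) ** 2))

def fuzzy_connectedness(img, seeds):
    """Connectivity map: for every pixel, the strength of the best path from
    any seed, a path's strength being the minimum affinity along it."""
    vals = np.array([img[s] for s in seeds], dtype=float)
    mean, std = vals.mean(), vals.std()
    conn = np.zeros(img.shape, dtype=float)
    heap = []
    for s in seeds:
        conn[s] = 1.0
        heapq.heappush(heap, (-1.0, s))
    while heap:
        neg_k, (r, c) = heapq.heappop(heap)
        k = -neg_k
        if k < conn[r, c]:
            continue  # stale entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            t = (r + dr, c + dc)
            if 0 <= t[0] < img.shape[0] and 0 <= t[1] < img.shape[1]:
                cand = min(k, affinity(img, (r, c), t, mean, std))
                if cand > conn[t]:
                    conn[t] = cand
                    heapq.heappush(heap, (-cand, t))
    return conn

# Object extraction as in equation (11): threshold the connectivity map,
# e.g. segmented = fuzzy_connectedness(img, seeds) >= theta
```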
Fig. 5. (a) Manually segmented ground truth of phantom images; (b and c) Manually
segmented ground truth of clinical ultrasound images
$$DSC(D_1,D_2)=\frac{|D_1\cap D_2|}{|D_1\cup D_2|}\qquad(12)$$
The TC coefficient is also used to measure the overlap of two regions D1 and D2. TC is
defined as the ratio of the number of pixels common to both the manually segmented
image and the algorithmically segmented image to the number of pixels in the union of
both:

$$TC(D_1,D_2)=\frac{D_1\cdot D_2}{D_1^2+D_2^2-D_1\cdot D_2}\qquad(13)$$
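The two overlap measures can be computed directly from binary masks; the short sketch below follows equations (12) and (13) exactly as stated above (for binary masks the two expressions give the same value), with the function name being an illustrative choice.

```python
import numpy as np

def overlap_metrics(d1, d2):
    """DSC and TC as written in equations (12) and (13); for binary masks
    both expressions reduce to the same ratio."""
    d1 = np.asarray(d1, dtype=bool)
    d2 = np.asarray(d2, dtype=bool)
    dsc = np.logical_and(d1, d2).sum() / np.logical_or(d1, d2).sum()
    a, b = d1.astype(float).ravel(), d2.astype(float).ravel()
    tc = a.dot(b) / (a.dot(a) + b.dot(b) - a.dot(b))
    return float(dsc), float(tc)
```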
5 Conclusion
Experimental results of the various image processing techniques adopted in the proposed
work clearly delineate the fetal heart structures from the raw clinical ultrasound
images. Before applying these techniques, the ultrasound image was more
References
1. Hoffman, J.I., Kaplan, S.: The incidence of congenital heart disease. J. American
College of Cardiology Foundation 147, 1890–1900 (2002)
2. Rychik, J., Ayres, N., Cuneo, B., Spevak, P.J., Van Der Veld, M.: American So-
ciety of Echocardiography Guidelines and Standards for performance of the fetal
Echocardiogram. Journal of the American Society of Echocardiography, 803–810
(July 2004)
3. Goodman, J.: Some fundamental properties of speckle. Journal of Optical Society,
1145–1150 (1976)
4. Deledalle, C.-A., Denis, L., Tupin, F.: Iterative Weighted Maximum Likelihood Denoising
with Probabilistic Patch-Based Weights. IEEE Trans. on Image Processing, 2661–2672
(December 2009)
5. Alvares, L., Mazorra, L.: Signal and Image Restoration using Shock Filters and
anisotropic diffusion. SIAM Journal of Numerical Analysis, 590–695 (1994)
6. Meghoufel, A., Cloutier, G., Crevier-Denoix, N., de Guise, J.A.: Tissue Characteri-
zation of Equine Tendons with clinical B-Scan Images using a Shock filter thinning
algorithm. IEEE Trans. on Medical Imaging, 596–605 (March 2011)
7. Udupa, J.K., Samarasekera, S.: Fuzzy connectedness and object definition: Theory,
algorithms and applications in image segmentation. Graphical Models and Image
Processing 59(3), 246–261 (1996)
8. Lassige, T.A., Benkeser, P.J., Fyfe, D., Sharma, S.: Comparison of septal defects in
2D and 3D echocardiography using active contour models. Computerized Medical
Imaging and Graphics 6, 377–388 (2000)
9. Dindoyal, I., Lambrou, T., Deng, J., Todd-Pokropek, A.: Level set snake algorithms
on the fetal heart. In: 4th IEEE International Symposium on Biomedical Imaging,
pp. 864–867 (April 2007)
10. Wagner, R.F., Smith, S.W., Sandrik, J.M., Lopez, H.: Statistics of speckle in ul-
trasound B-Scans. IEEE Transactions on Sonics and Ultrasonics 30(3), 156–163
(1983)
An Improved Firefly Fuzzy C-Means (FAFCM)
Algorithm for Clustering Real World Data Sets
Abstract. Fuzzy c-means has been widely used in clustering many real world
datasets for decision making processes. However, the Fuzzy c-means (FCM)
algorithm often gets trapped in local optima and is highly sensitive to
initialization. The Firefly algorithm (FA) is a well known, popular metaheuristic
algorithm that simulates the flashing characteristics of fireflies and can be used
to resolve the shortcomings of the Fuzzy c-means algorithm. In this paper, first a
firefly based fuzzy c-means clustering (FAFCM) and then an improved firefly based
fuzzy c-means algorithm have been proposed, and their performance is compared with
the fuzzy c-means and PSO algorithms. The experimental results divulge that the
proposed improved FAFCM method is quite effective and performs better for
clustering real world datasets than FAFCM, FCM and PSO, as it avoids getting stuck
in local optima and leads to faster convergence.
1 Introduction
$$0 < \sum_{k=1}^{n} w_{ik} < n,\ \forall i \qquad(1)$$
where $1 \le i \le c$, $1 \le k \le n$.
From the above definitions, it can be seen that an element can belong to more than one
cluster with different degrees of membership. The total membership of an element is
normalized to 1, and a single cluster cannot contain all the data points.
The objective function (Eq. 2) of the fuzzy c-means algorithm is computed using the
membership values and the Euclidean distance (Eq. 3).
$$J_m(W,P)=\sum_{k=1}^{n}\sum_{i=1}^{c}(w_{ik})^m (d_{ik})^2 \qquad(2)$$
where
$$d_{ik} = \|x_k - p_i\| \qquad(3)$$
where m ∈ (1, ∞) is the parameter which defines the fuzziness of the resulting
clusters and $d_{ik}$ is the Euclidean distance from object $x_k$ to the cluster
center $p_i$. The minimization [7] of the objective function J through the FCM
algorithm is performed by iteratively updating the cluster centers and the partition
matrix using Eq. 4 and Eq. 5.

$$p_i = \frac{\sum_{k=1}^{n}(w_{ik})^m x_k}{\sum_{k=1}^{n}(w_{ik})^m} \qquad(4)$$

$$w_{ik} = \frac{1}{\sum_{j=1}^{c}\left(d_{ik}/d_{jk}\right)^{2/(m-1)}} \qquad(5)$$
The steps of the FCM algorithm are as follows:
1. Initialize the number of clusters c.
2. Select an inner product metric (Euclidean norm) and the weighting exponent m (fuzziness).
3. Initialize the cluster prototypes P(0) and set the iteration counter b = 0.
4. Calculate the partition matrix W(b) using (5).
5. Update the c fuzzy cluster centers P(b+1) using (4).
6. If ||P(b+1) − P(b)|| < ε then stop; otherwise set b = b + 1 and repeat from step 4.
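A compact Python sketch of steps 1-6 is given below; the random initialization of the prototypes and the numerical safeguards are assumptions not prescribed by the text.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=300, seed=None):
    """Minimal FCM following steps 1-6: X is an (n, d) data matrix; returns
    the cluster centres P (c, d) and the partition matrix W (c, n)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    P = X[rng.choice(n, size=c, replace=False)]                    # step 3
    W = None
    for _ in range(max_iter):
        d = np.linalg.norm(X[None, :, :] - P[:, None, :], axis=2)  # d_ik, eq. (3)
        d = np.fmax(d, 1e-12)                                      # numerical guard
        W = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)  # eq. (5)
        P_new = (W ** m) @ X / np.sum(W ** m, axis=1, keepdims=True)                    # eq. (4)
        if np.linalg.norm(P_new - P) < eps:                        # step 6
            return P_new, W
        P = P_new
    return P, W
```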
Fireflies are glowworms that glow through bioluminescence. The firefly algorithm is
based on the idealized behavior of the flashing characteristics of fireflies. Three
rules have been described [8] as the basic principles of the firefly algorithm:
1. All fireflies are unisexual in nature, so that one firefly will be attracted to
other fireflies regardless of their sex.
2. Attractiveness is proportional to brightness; thus, for any two flashing fireflies,
the less bright one will move towards the brighter one, and both attractiveness and
brightness decrease as their distance increases. If there is no firefly brighter than a
particular one, it will move randomly.
3. The brightness of a firefly is affected or determined by the landscape of the
objective function.
The firefly algorithm is a population based algorithm for locating the global optima of
objective functions. In the firefly algorithm, fireflies are randomly distributed in the
search space. A firefly attracts other neighboring fireflies by its light intensity. The
attractiveness of a firefly depends on its brightness, and the brightness depends on the
intensity of light emitted by the firefly. The intensity is calculated using the
objective function and is inversely proportional to the square of the distance r
between two fireflies, I ∝ 1/r². Each firefly represents a candidate solution, and the
fireflies move towards the best solution [9] after each iteration, as the firefly with
the best solution has the brightest glow (intensity).
The firefly algorithm has two important issues: the variation of light intensity and
the formulation of attractiveness. The attractiveness of a firefly is determined by its
brightness or light intensity [10], which in turn is associated with the encoded
objective function. The objective function in this case is described by (Eq. 2). Based on
this objective function, initially all the fireflies are randomly dispersed across the
search space. The two phases of the firefly algorithm are as follows:
(1) Variation of light intensity: The objective function values are used to find the
light intensity. Suppose there exists a swarm of n fireflies, and xi represents a
solution for firefly i, whereas f(xi) denotes the fitness value in Eq. 6.
Ii = f(xi), 1 ≤ i ≤ n.    (6)
(2) Movement towards an attractive firefly: The attractiveness of a firefly is
proportional to the light intensity [11] seen by adjacent fireflies. Each firefly has
its distinctive attractiveness, which describes how strongly it attracts other members
of the swarm. The attractiveness is relative; it varies with the distance (Eq. 7)
between two fireflies i and j located at xi and xj, respectively.

$$r_{ij} = \|x_i - x_j\| \qquad(7)$$

The attractiveness function β(r) of the firefly is determined by Eq. 8.

$$\beta(r) = \beta_0 e^{-\gamma r^2} \qquad(8)$$

where β0 is the attractiveness at r = 0 and γ is the light absorption coefficient. The
movement of a firefly i at location xi attracted to a more attractive firefly j at
location xj is given by Eq. 9.

$$x_i(t+1) = x_i(t) + \beta_0 e^{-\gamma r_{ij}^2}(x_j - x_i) + \alpha\,(\mathrm{rand} - 0.5) \qquad(9)$$
The pseudo code for this algorithm is given as follows.
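In addition, a minimal Python sketch of this standard firefly procedure (cf. Yang [5]) is given below; it is an illustrative sketch rather than the paper's pseudo code, and the population size, search bounds and the gradual reduction of α are assumptions.

```python
import numpy as np

def firefly_optimize(objective, dim, n=25, beta0=1.0, gamma=1.0, alpha=0.2,
                     bounds=(0.0, 1.0), max_gen=100, seed=None):
    """Standard firefly search that minimises `objective`; lower fitness
    corresponds to a brighter firefly (higher light intensity)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, size=(n, dim))          # random initial swarm
    I = np.array([objective(x) for x in X])         # light intensities, eq. (6)
    for _ in range(max_gen):
        for i in range(n):
            for j in range(n):
                if I[j] < I[i]:                     # firefly j is brighter
                    r2 = np.sum((X[i] - X[j]) ** 2)        # squared distance, eq. (7)
                    beta = beta0 * np.exp(-gamma * r2)      # attractiveness, eq. (8)
                    X[i] = X[i] + beta * (X[j] - X[i]) \
                        + alpha * (rng.random(dim) - 0.5)   # movement, eq. (9)
                    X[i] = np.clip(X[i], lo, hi)
                    I[i] = objective(X[i])
        alpha *= 0.97                               # gradual cooling (an assumption)
    best = int(np.argmin(I))
    return X[best], I[best]
```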
In this work the firefly algorithm is applied on FCM to overcome the shortcomings of
the FCM algorithm. The firefly algorithm has been applied on the objective function of
the FCM algorithm given in Eq. (2). Here the goal is to minimize the objective function
of FCM clustering. In the proposed method the cluster centers are used as the decision
parameters to minimize the objective function. So, in the context of clustering, a
single firefly represents the vector of cluster centers. Each firefly is represented
using Eq. 12.

$$x_i = (p_1, \ldots, p_j, \ldots, p_c), \quad 2 \le j \le c, \qquad(12)$$

where $p_j$ represents the jth cluster center vector. Hence, a swarm of n fireflies
represents n candidate solutions. As this is a minimization problem, the intensity of
each firefly is equal to the value of the objective function of FCM. The pseudo code
for the FAFCM algorithm is illustrated as follows.
Initialize the population of n fireflies with C random cluster centers of d dimensions each
Initialize the algorithm's parameters
Repeat
    For i = 1 : n
        For j = 1 : n
            Calculate the light intensity (objective function value) of each firefly by eq. (2)
            If (Ij < Ii)
                Move firefly i toward j based on eq. (9) to update the position of the fireflies
            End if
        End for j
    End for i
    Rank the fireflies and find the current best
Until stop condition true
Rank the fireflies, obtain the global best and the position of the global best (cluster centers)
After the experimental analysis, it is found that the cluster centers can be further
refined using the improved FAFCM algorithm, which helps in further minimizing the
objective function. The experimental results show that for smaller datasets FAFCM
provides better results than FCM, but with increasing dataset size FCM surpasses
FAFCM. In the proposed improved algorithm, the clustering is performed in two stages.
In the first stage the fireflies are initialized with random values and, after a fixed
number of iterations, the cluster centers are obtained. In the second stage the
obtained cluster centers are used as the initial cluster centers for the FCM algorithm
to get refined cluster centers that give a lower objective function value. The pseudo
code of the proposed improved FAFCM clustering algorithm is described as follows.
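A minimal Python sketch of this two-stage scheme is also given below. It reuses the firefly_optimize helper from the earlier firefly sketch, the FCM objective of Eq. (2) is re-evaluated for candidate centres, and the iteration limits are simplified with respect to the termination criteria detailed in Section 6; it is an illustration rather than the authors' implementation.

```python
import numpy as np

# Stage 1 uses the `firefly_optimize` sketch given earlier; stage 2 is a
# plain FCM refinement seeded with the firefly result.

def fcm_objective(X, centres, m=2.0):
    """FCM objective of eq. (2) and the induced partition matrix."""
    d = np.fmax(np.linalg.norm(X[None, :, :] - centres[:, None, :], axis=2), 1e-12)
    W = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)
    return float(np.sum((W ** m) * d ** 2)), W

def improved_fafcm(X, c, m=2.0, fa_generations=50, fcm_iters=300, eps=1e-5, seed=None):
    n, dim = X.shape
    obj = lambda v: fcm_objective(X, v.reshape(c, dim), m)[0]
    best, _ = firefly_optimize(obj, dim=c * dim,
                               bounds=(float(X.min()), float(X.max())),
                               max_gen=fa_generations, seed=seed)
    P = best.reshape(c, dim)                        # stage 1: FA cluster centres
    W = None
    for _ in range(fcm_iters):                      # stage 2: FCM refinement
        _, W = fcm_objective(X, P, m)
        P_new = (W ** m) @ X / np.sum(W ** m, axis=1, keepdims=True)
        if np.linalg.norm(P_new - P) < eps:
            return P_new, W
        P = P_new
    return P, W
```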
6 Experimental Analysis
The terminating condition of the FCM algorithm is reached when there is no scope for
further improvement in the objective function value. The FAFCM stopping condition is
500 generations (maximum number of iterations) or no change in the current best in 5
consecutive iterations. In the improved FAFCM algorithm, the terminating condition for
the FAFCM stage is a maximum of 50 generations or no change in the current best in 3
consecutive iterations. The terminating condition for the FCM stage in the improved
FAFCM is the same as that of the FCM algorithm mentioned above.
Table 1. Comparison of objective function value and number of iterations for the iris dataset
for various clustering algorithms
Algorithm          Objective Function Value (Best / Mean)    Average Number of Iterations
FCM                60.58 / 62.63                             24
PSO                60.58 / 61.07                             56.25
FAFCM              60.57 / 60.58                             48.15
Improved FAFCM     60.57 / 60.57                             15.75
Table 2. Comparison of objective function value and number of iterations for the glass dataset
for various clustering algorithms
Table 3. Comparison of objective function value and number of iterations for the Lung Cancer
dataset for various clustering algorithms
Algorithm          Objective Function Value (Best / Mean)    Average Number of Iterations
Table 4. Comparison of objective function and number of iterations for the Single Outlier
dataset for various clustering algorithms
Algorithm          Objective Function Value (Best / Mean)    Average Number of Iterations
FCM                2.04 / 2.13                               7.75
PSO                1.79 / 1.79                               10.35
FAFCM              1.78 / 1.79                               6
Improved FAFCM     1.78 / 1.78                               5
Fig. 1, Fig. 2, Fig. 3. Objective function value versus iterations for the compared clustering algorithms on the considered datasets
The convergence of the objective function value with the number of iterations for the
compared algorithms has been illustrated in Fig. 1, Fig. 2, and Fig. 3. From the above
results, it is apparent that the improved FAFCM gives better and steadier results as
compared to the other clustering algorithms for all the data sets considered.
This paper first proposed a firefly based fuzzy c-means algorithm (FAFCM) and then an
improved FAFCM clustering algorithm, and the experimental results show that the
improved FAFCM method performs better than the other three algorithms. Fuzzy c-means
clustering is a very popular clustering algorithm with a wide variety of real world
applications, but FCM uses a hill climbing style of search which traps it in local
optima, and it is also sensitive to initialization. Hence, the Firefly algorithm, a
nature inspired meta-heuristic optimization technique, has been used to avoid these
problems. After the experimental analysis, it is found that the improved FAFCM shows
steady and best results for the various data sets considered as compared to FCM, PSO
and FAFCM. The improved FAFCM algorithm leads to faster convergence and a lower
objective function value. In future, the proposed improved FAFCM can be hybridized
with different optimization methods for better performance.
References
1. Izakian, H., Abraham, A.: Fuzzy C-means and fuzzy swarm for fuzzy clustering problem.
Expert Systems with Applications 38, 1835–1838 (2011)
2. Bezdek, J.C.: Pattern recognition with fuzzy objective function algorithms, pp. 95–107.
Plenum Press, New York (1981)
3. Li, L., Liu, X., Xu, M.: A Novel Fuzzy Clustering Based on Particle Swarm Optimization.
In: First IEEE International Symposium on Information Technologies and Applications in
Education, pp. 88–90 (2007)
4. Wang, L., et al.: Particle Swarm Optimization for Fuzzy c-Means Clustering. In: Proceed-
ings of the 6th World Congress on Intelligent Control and Automation, Dalian, China
(2006)
5. Yang, X.S.: Nature-Inspired Metaheuristic Algorithms. Luniver Press (2008)
6. Senthilnath, J., Omkar, S.N., Mani, V.: Clustering using firefly algorithm: Performance
study. Swarm and Evolutionary Computation 1, 164–171 (2011)
7. Runkler, T.A., Katz, C.: Fuzzy Clustering by Particle Swarm Optimization. In: Proceed-
ings of 2006 IEEE International Conference on Fuzzy Systems, Canada, pp. 601–608
(2006)
8. Zadeh, T.H., Meybodi, M.: A New Hybrid Approach for Data Clustering using Firefly Al-
gorithm and K-means. In: 16th CSI International Symposium on Artificial Intelligence and
Signal Processing (AISI), Fars, pp. 007-011 (2012)
9. Abshouri, A.A., Bakhtiary, A.: A New Clustering Method Based on Firefly and KHM.
Journal of Communication and Computer 9, 387–391 (2012)
10. Yang, X.-S.: Firefly Algorithms for Multimodal Optimization. In: Watanabe, O., Zeug-
mann, T. (eds.) SAGA 2009. LNCS, vol. 5792, pp. 169–178. Springer, Heidelberg (2009)
11. Yang, X.S.: Firefly Algorithm, Stochastic Test Functions and Design optimization. Inter-
national Journal of Bio-Inspired Computation 2, 78–84 (2010)
12. Yang, F., Sun, T., Zhang, C.: An efficient hybrid data clustering method based on K-
harmonic means and Particle Swarm Optimization. Expert Systems with Applications 36,
9847–9852 (2009)
13. Niknam, T., Amiri, B.: An efficient hybrid approach based on PSO, ACO and k-means for
cluster analysis. Applied Soft Computing 10, 183–197 (2010)
14. Huang, K.Y.: A hybrid particle swarm optimization approach for clustering and classifica-
tion of datasets. Knowledge-Based Systems 24, 420–426 (2011)
15. Chakravarty, S., Dash, P.K.: A PSO based integrated functional link net and interval type-
2 fuzzy logic system for predicting stock market indices. Applied Soft Computing 12(2),
931–941 (2012)
16. Shayeghi, H., Jalili, A., Shayanfar, H.A.: Multi-stage fuzzy load frequency control us-
ing PSO. Energy Conversion and Management 49(10), 2570–2580 (2008)
On Kernel Based Rough Intuitionistic Fuzzy C-means
Algorithm and a Comparative Analysis
Abstract. Clustering of real life data for analysis has gained popularity, and
imprecise methods or their hybrid approaches have attracted many researchers of
late. Recently, the rough intuitionistic fuzzy c-means algorithm was introduced and
studied by Tripathy et al. [3] and was found to be superior to all other algorithms
in this family. Kernel based counterparts of these algorithms have been found to
behave better than their corresponding Euclidean distance based algorithms. Very
recently, a kernel based rough fuzzy c-means algorithm was put forth by Bhargava et
al. [4]. A comparative analysis over standard datasets and images has established
the superiority of this algorithm over its corresponding standard algorithm. In this
paper we introduce the kernel based rough intuitionistic fuzzy c-means algorithm and
show that it is superior to all the algorithms in the sequel, i.e. both the normal
and the kernel based algorithms. We establish this through experimental analysis by
taking different types of inputs and using standard accuracy measures.
Keywords: clustering, fuzzy sets, rough sets, intuitionistic fuzzy sets, rough
fuzzy sets, rough intuitionistic fuzzy sets, DB index, D index.
1 Introduction
Clustering is the process of putting similar objects into groups, called clusters, and
dissimilar objects into different clusters. In contrast to classification, where
labeling of a large set of training tuples (patterns) is necessary to model the groups,
clustering starts by partitioning the objects into groups first and then labeling the
small number of groups. The large amount of data collected across multiple sources
makes it practically impossible to analyze them manually and select the data that is
required to perform a particular task. Hence, a mechanism that can classify the data
according to some criteria, in which only the classes of interest are selected and the
rest are rejected, is essential. Clustering techniques are applied in the analysis of
statistical data used in fields such as machine learning, pattern recognition, image
analysis, information retrieval and bioinformatics, and clustering is a major task in
exploratory data mining. A wide number of clustering algorithms have been proposed to
suit the requirements of each field of application.
There are several clustering methods in literature starting with the Hard C-Means
(HCM) [10]. In order to handle uncertainty in data, several algorithms like the fuzzy
c-means (FCM) [14] based on the notion of fuzzy sets introduced by Zadeh [19],
rough c-means (RCM) [9] based on the concept of rough sets introduced by Pawlak
[13], the intuitionistic fuzzy c-means (IFCM) [5] based upon the concept of
intuitionistic fuzzy sets introduced by Atanassov [1] have been introduced. Also,
several hybrid c-means algorithms like the rough fuzzy c-means (RFCM) [11,12], based
upon the rough fuzzy sets introduced by Dubois and Prade [7], and the rough
intuitionistic fuzzy c-means (RIFCM) [3], based upon the rough intuitionistic fuzzy
sets introduced by Saleha et al. [15], have been introduced. In [3] it has been
experimentally established through comparative analysis that RIFCM performs better
than the other individual or hybrid algorithms.
Distance between objects can be calculated in many ways; Euclidean distance based
clustering is easy to implement and hence most commonly used. It has two drawbacks:
the final results depend on the initial centres, and it can only find linearly
separable clusters. Nonlinear mapping functions transform the nonlinear separation
problem in the image plane into a linear separation problem in kernel space,
facilitating clustering in feature space. Kernel based clustering helps in rectifying
the second problem, as it produces nonlinear separating surfaces among clusters [20, 21].
By replacing the Euclidean distance used in the above algorithms with kernel functions,
algorithms such as the kernel based fuzzy c-means (KFCM) [20] and the kernel based
rough c-means (KRCM) [17, 18, 21], the latter introduced by Tripathy and Ghosh, have
been put forth. Very recently, the kernel based rough fuzzy c-means [4] was introduced
and studied by Bhargava and Tripathy. In this paper we introduce the kernel based rough
intuitionistic fuzzy c-means (KRIFCM) algorithm and provide a comparative analysis of
these kernel based as well as the standard c-means algorithms, focusing on RFCM, RIFCM,
KRFCM and KRIFCM. Also, we use the Davies-Bouldin (DB) [6] and Dunn (D) indexes [8] to
compare the accuracy of performance of these algorithms. We use different types of
images and datasets for the purpose of experimental analysis. It has been observed that
KRIFCM has the best performance among all these methods.
The paper is divided into seven sections. In section 2 we present some definitions and
notations to be used throughout the paper. Section 3 deals with kernel methods, and
section 4 with c-means clustering algorithms, with emphasis on the IFCM algorithm. The
proposed kernel based rough intuitionistic fuzzy c-means (KRIFCM) clustering algorithm
is explained in section 5. The complete evaluation is shown in section 6; evaluation
has been performed on a synthetic dataset, a real dataset and an image dataset.
Finally, section 7 concludes the paper.
$\mu_X(x)$ for any x ∈ U is a real number in [0, 1], called the membership value of x in
X. The non-membership value $\nu_X(x)$ is defined as $\nu_X(x) = 1 - \mu_X(x)$.

$$\mathrm{Dunn} = \min_{i}\ \min_{k \neq i}\left\{\frac{d(v_i, v_k)}{\max_{l} S(v_l)}\right\},\ \ \text{for } 1 < k, i, l < c \qquad(2)$$
3 Kernel Methods
Distance between objects can be calculated in many ways; Euclidean distance based
clustering is easy to implement and hence most commonly used. It has two drawbacks:
firstly, the final results depend on the initial centers, and secondly, it can only
find linearly separable clusters. Kernel based clustering helps in rectifying the
second problem, as it produces nonlinear separating hyper-surfaces among clusters.
Kernel functions are used to transform the data in the image plane into a feature plane
of higher dimension known as kernel space.
Nonlinear mapping functions transform the nonlinear separation problem in the image
plane into a linear separation problem in kernel space, facilitating clustering in
feature space. Mercer's theorem can be used to calculate the distance between the pixel
feature values in kernel space without knowing the transformation function.
$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2} \qquad(3)$$

Here, N is the total number of data objects [17]. According to [20, 21], the kernel
distance function D(x, y) in its generalized form is D(x, y) = K(x, x) + K(y, y) − 2K(x, y),
and on applying the property of similarity (i.e., K(x, x) = 1) it can be further reduced to (5).
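A short Python sketch of the kernel-induced distance is given below, using a Gaussian (RBF) kernel for which K(x, x) = 1, so that D(x, y) reduces to 2(1 − K(x, y)); the kernel choice and the parameter sigma are assumptions for illustration.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """RBF kernel; K(x, x) = 1, satisfying the similarity property used above.
    The value of sigma is an assumption."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def kernel_distance(x, y, kernel=gaussian_kernel):
    """Kernel-induced squared distance in feature space:
    D(x, y) = K(x, x) + K(y, y) - 2 K(x, y), which equals 2 (1 - K(x, y))
    whenever K(x, x) = 1."""
    return kernel(x, x) + kernel(y, y) - 2.0 * kernel(x, y)
```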
In this section we shall present the intuitionistic fuzzy c-means algorithm, whose
concepts are used in describing the kernel based rough intuitionistic fuzzy c-means
algorithm.
$$A_1 = \frac{1}{|BU_i|}\sum_{x_j\in BU_i} x_j \quad\text{and}\quad B_1 = \frac{1}{n_1}\sum_{x_j\in BN(U_i)} (\mu'_{ij})^m\, x_j;\quad \text{where } n_1 = \sum_{x_j\in BN(U_i)} (\mu'_{ij})^m \qquad(10)$$

If $D_{ik} = 0$ or $x_j \in BU_i$ then $\mu_{ik} = 1$;
else compute $\mu_{ik}$ using
$$\mu_{ik} = \frac{1}{\sum_{j=1}^{c}\left(\frac{D_{ik}}{D_{jk}}\right)^{\frac{2}{m-1}}}.$$
4. Compute the non-membership (hesitation) values using (7).
5. Compute $\mu'_{ik}$ using equation (8) and normalize.
6. Let $\mu'_{ik}$ and $\mu'_{jk}$ be the maximum and next to maximum membership values
of object $x_k$ to the cluster centroids $v_i$ and $v_j$.
If $\mu'_{ik} - \mu'_{jk} < \epsilon$
then
$x_k \in BN(U_i)$ and $x_k \in BN(U_j)$, and $x_k$ cannot be a member of any lower
approximation.
Else $x_k \in BU_i$.
7. Calculate the new cluster means using
$$V_i = \begin{cases} w_{low}\,A + w_{up}\,B, & \text{if } BU_i \neq \phi \text{ and } BN(U_i) \neq \phi\\ B, & \text{if } BU_i = \phi \text{ and } BN(U_i) \neq \phi\\ A, & \text{otherwise}\end{cases}$$
where
$$A = \frac{\sum_{x_k\in BU_i} x_k}{|BU_i|}, \qquad B = \frac{\sum_{x_k\in BN(U_i)} (\mu'_{ik})^m\, x_k}{\sum_{x_k\in BN(U_i)} (\mu'_{ik})^m}.$$
8. Repeat from step 2 until the termination condition is met or until there are no
more assignments of objects.
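For illustration, the centroid update of step 7 can be sketched as below for a single cluster; the weights w_low, w_up and the fuzzifier m are typical values and are not taken from the text shown here, and the function name is illustrative.

```python
import numpy as np

def rough_fuzzy_centroid(X, lower_idx, boundary_idx, mu_boundary,
                         w_low=0.95, w_up=0.05, m=2.0):
    """Centroid update of step 7 for one cluster: a weighted combination of
    the lower-approximation mean (A) and the membership-weighted boundary
    mean (B). w_low, w_up and m are typical values, not taken from the text."""
    X = np.asarray(X, dtype=float)
    lower, boundary = X[lower_idx], X[boundary_idx]
    if len(lower) == 0 and len(boundary) == 0:
        raise ValueError("cluster has neither lower-approximation nor boundary objects")
    if len(lower) > 0:
        A = lower.mean(axis=0)
    if len(boundary) > 0:
        w = np.asarray(mu_boundary, dtype=float) ** m
        B = (w[:, None] * boundary).sum(axis=0) / w.sum()
    if len(lower) > 0 and len(boundary) > 0:
        return w_low * A + w_up * B
    return B if len(boundary) > 0 else A
```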
The D index aims at maximizing the between-cluster distance and minimizing the
within-cluster distance. Hence a greater value of the D index indicates a more
efficient clustering.
6 Experimental Analysis
The evaluation of the algorithm has been done in 2 parts. We have implemented the
algorithms and used two types of inputs for the purpose. The first type of input is
the zoo dataset, which is numeric in character and is taken from the UCI repository.
The second type of input comprises three different kinds of images: the cell, the iris
and a football player.
We have made a comparison of four algorithms, RFCM, RIFCM, KRFCM and KRIFCM, and used
two of the well known accuracy measures, the DB index and the D index, to measure their
efficiency. In [3] it was established by us that RIFCM is the best among the family of
algorithms HCM, FCM, RCM, IFCM, RFCM and RIFCM. Again, it was shown in [20, 21] and
[17, 18], respectively, that the kernel versions KFCM and KRCM perform better than
their normal counterparts. In [ ] again we
established that KRFCM is more efficient than RFCM. So, our comparative analysis in
this paper establishes that KRIFCM is the best among all these 11 algorithms.
Even though the proposed as well as the existing algorithms have been applied on the
cell and iris images for the purpose of comparing their efficiency, they can also be
used in various other fields where clustering of data is required or even where
pattern recognition is a must. Further improvements by modifying threshold values can
easily be made so as to apply these algorithms in industries which work on data
analysis.
So far we have observed that computations of the complexities of the C-Means
algorithms are not found in the literature. However, it is obvious that the
computational complexities of the hybrid algorithms are definitely higher than those
of the individual algorithms. The computational complexities of the RFCM, RIFCM, KRFCM
and KRIFCM algorithms are the same; similarly, the computational complexities of the
FCM, IFCM, KFCM and KIFCM algorithms are the same. A comparison of the computational
complexities of all the algorithms can still be made, and one can also weigh the
efficiency gained through hybridization against the increase in computational
complexity.
1(a) Original Image 1(b) KRFCM 1(c) KRIFCM 1(d) RFCM 1(e) RIFCM
Fig. 1.
2(a) Original Image 2(b) KRFCM 2(c) KRIFCM 2(d) RFCM 2(e) RIFCM
Fig. 2.
7 Conclusion
This paper focuses on the development of a kernel based rough intuitionistic fuzzy
c-means algorithm which is established to be the most efficient among the 11 clustering
c-means algorithms considered, including the hard c-means, 5 uncertainty based
extensions of it (FCM, IFCM, RCM, RFCM and RIFCM) as well as the 5 corresponding kernel
based algorithms (KFCM, KIFCM, KRCM, KRFCM and KRIFCM). All the algorithms have been
tested against three different types of images, the Ronaldo image, the cell image and
the iris image. Also, a standard dataset, the soybean dataset, is considered. Two
indices for measuring accuracy, the DB index and the D index, are used for the
comparison. Also, the number of clusters was varied over 3, 4 and 5. In almost all the
cases it is observed that KRIFCM has better accuracies than its counterparts. Also, we
have put some comments on the computational complexities of these families of
algorithms.
References
1. Atanassov, K.T.: Intuitionistic Fuzzy Sets. Fuzzy Sets and Systems 20(1), 87–96 (1986)
2. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer
Academic Publishers (1981)
3. Bhargav, R., Tripathy, B.K., Tripathy, A., Dhull, R., Verma, E., Swarnalatha, P.: Rough
Intuitionistic Fuzzy C-Means Algorithm and a Comparative Analysis. In: Proceedings of
the ACM Compute 2013 Conference (2013), ISBN 978-1-4503-2545-5
4. Bhargava, R., Tripathy, B.: Kernel Based Rough-Fuzzy C-Means. In: Maji, P., Ghosh,
A., Murty, M.N., Ghosh, K., Pal, S.K. (eds.) PReMI 2013. LNCS, vol. 8251, pp. 148–
155. Springer, Heidelberg (2013)
5. Chaira, T., Anand, S.: A Novel Intuitionistic Fuzzy Approach for Tumor/Hemorrhage
Detection in Medical Images. Journal of Scientific and Industrial Research 70(6) (2011)
6. Davis, D.L., Bouldin, D.W.: A cluster separation measure. IEEE Transactions on Pattern
Analysis and Machine Intelligence PAMI-1(2), 224–227 (1979)
On Kernel Based Rough Intuitionistic Fuzzy C-means Algorithm 359
7. Dubois, D., Prade, H.: Rough fuzzy sets model. International Journal of General
Systems 46(1), 191–208 (1990)
8. Dunn, J.C.: A fuzzy relative of the ISODATA process and its use in detecting compact
well-separated clusters, pp. 32–57 (1973)
9. Lingras, P., West, C.: Interval set clustering of web users with rough k-mean. Journal of
Intelligent Information Systems 23(1), 5–16 (2004)
10. Macqueen, J.B.: Some Methods for classification and Analysis of Multivariate
Observations. In: Proceedings of 5th Berkeley Symposium on Mathematical Statistics
and Probability, pp. 281–297. University of California Press (1967)
11. Maji, P., Pal, S.K.: RFCM: A Hybrid Clustering Algorithm using rough and fuzzy set.
Fundamenta Informaticae 80(4), 475–496 (2007)
12. Mitra, S., Banka, H., Pedrycz, W.: Rough-Fuzzy Collaborative Clustering. IEEE
Transactions on System, Man, and Cybernetics, Part B: Cybernetics 36(4), 795–805
(2006)
13. Pawlak, Z.: Rough sets. Int. Jour. of Computer and Information Sciences 11, 341–356
(1982)
14. Ruspini, E.H.: A new approach to clustering. Information and Control 15(1), 22–32
(1969)
15. Saleha, R., Haider, J.N., Danish, N.: Rough Intuitionistic Fuzzy Set. In: Proc. of 8th Int.
Conf. on Fuzzy Theory and Technology (FT & T), Durham, North Carolina (USA),
March 9-12 (2002)
16. Sugeno, M.: Fuzzy Measures and Fuzzy integrals-A survey. In: Gupta, M., Sardis, G.N.,
Gaines, B.R. (eds.) Fuzzy Automata and Decision Processes, pp. 89–102 (1977)
17. Tripathy, B.K., Ghosh, A., Panda, G.K.: Kernel based K-means clustering using rough
set. In: 2012 International IEEE Conference on Computer Communication and
Informatics, ICCCI (2012)
18. Tripathy, B.K., Ghosh, A., Panda, G.K.: Adaptive K-Means Clustering to Handle
Heterogeneous Data Using Basic Rough Set Theory. In: Meghanathan, N., Chaki, N.,
Nagamalai, D. (eds.) CCSIT 2012, Part I. LNICST, vol. 84, pp. 193–202. Springer,
Heidelberg (2012)
19. Zadeh, L.A.: Fuzzy Sets. Information and Control 8(3), 338–353 (1965)
20. Zhang, D., Chen, S.: Fuzzy Clustering Using Kernel Method. In: Proceedings of the
International Conference on Control and Automation, Xiamen, China, pp. 123–127
(2002)
21. Zhou, T., Zhang, Y., Lu, H., Deng, F., Wang, F.: Rough Cluster Algorithm Based on
Kernel Function. In: Wang, G., Li, T., Grzymala-Busse, J.W., Miao, D., Skowron, A.,
Yao, Y. (eds.) RSKT 2008. LNCS (LNAI), vol. 5009, pp. 172–179. Springer,
Heidelberg (2008)
FuSCa: A New Weighted Membership Driven Fuzzy
Supervised Classifier
Abstract. The aim of this paper is to introduce a new supervised fuzzy classifi-
cation methodology (FuSCa) to improve the performance of k-NN (k-Nearest
Neighbor) algorithm based on the weighted nearest neighbor membership and
global membership derived from the training dataset. In this classification me-
thod, the test object is assigned a class label having the maximum membership
value for that corresponding class while a weighted membership vector is found
after utilizing the Global and Nearest-Neighbor fuzzy membership vectors
along with a global weight and a k-close weight respectively. FuSCa is com-
pared with other approaches using the standard benchmark data-sets and found
to produce better classification accuracy.
1 Introduction
Classification algorithms are designed to learn a function which maps a large vector
of attributes into one of several classes. In a supervised classification model, some
known objects are described by a large set of vectors, where each vector is composed
of a set of attributes and a class label which specifies the category of the object.
The supervised k-Nearest Neighbor (k-NN) classification algorithm has been rigorously
studied and modified worldwide over the last decades to improve its accuracy and
efficiency. Although the idea behind the k-NN algorithm is very simple, it is widely
applied in many real life applications due to its good accuracy rate.
A new method (FuSCa) has been proposed in this paper to improve the accuracy of the
k-NN algorithm in classification problems by deriving two membership vectors, the
Global Membership Vector (GMV) and the K-Close Membership Vector (KMV), which depend
on the whole training dataset and on the nearest neighbor instances of each test
record, respectively. A Class Determinant Function (CDF) is applied to the GMV and KMV
to get a Weighted Membership Vector (WMV) which represents the degree of belonging of
a test data record to each class.
2 Preliminary
2.1 Basics of Classification Problem
A supervised classification method builds a concise model of the distribution of class
labels in terms of predictor features; it then assigns class labels to the testing
instances where the values of the predictor features are known but the value of the
class label is unknown. Let each instance i of a given dataset be described by both a
vector of n attribute values xi = [xi1, xi2, …, xin] and its corresponding class label
yi, which can take any value from a set of values Y = {y1, y2, …, ym}. Thus, xiq
specifies the value of the q-th attribute of the i-th instance. The training and test
datasets are represented by DTRAIN and DTEST respectively. DNN contains the k nearest
neighbor records based on the distance between x ∈ DTEST and xi ∈ DTRAIN. Therefore,
DNN ⊆ DTRAIN, where h, k, r represent the total numbers of records in DTRAIN, DNN and
DTEST, and k may be any arbitrary value between 1 and h. Each test data record from
DTEST will be assigned a class label based on the prediction provided by the classifier
using the knowledge of the training set DTRAIN.
Numerous methods such as neural networks [2], [11], support vector machines [2], [11],
rough sets [8], [9], decision trees [2], fuzzy sets [1], [4], Bayesian networks [12]
and rule based classifiers [18] have been proposed in the machine learning literature
for effective classification tasks. Among them, the nearest neighbor based classifier,
i.e. k-NN, is used in real life applications due to its simplicity and standard
performance. According to Joaquín Derrac et al. [1], the first fuzzy nearest neighbor
classifier was introduced by Jóźwik [3] in 1983, where every neighbor uses its fuzzy
membership (array) to each class in the voting rule which produces the final
classification result. Later, in 1985, Keller et al. [4] proposed a new fuzzy
membership based classification method popularly known as Fuzzy kNN, which uses three
different methods for computing the class memberships; the best performing method is
considered for each instance x (x is in class i) to compute the k nearest neighbors
from the training data. F. Chung-Hoon et al. [5] proposed a new type-2 fuzzy k-nearest
neighbor method in 2003. In 2005, T.D. Pham [6] introduced the kriging computational
scheme for determining optimal weights to be combined with different fuzzy membership
grades. S. Hadjitodorov [7] applied intuitionistic fuzzy set theory to develop a fuzzy
nearest neighbor classifier with the non-membership concept in 1995. On the other hand,
R. Jensen et al. [8] in 2011 presented the FRNN-FRS and FRNN-VQRS techniques for
fuzzy-rough classification, which employ fuzzy rough sets and vaguely quantified rough
sets, respectively. Recently, in 2012, M. Sarkar [9] proposed a fuzzy-rough uncertainty
based classification method which incorporates the lower and upper approximations of
the memberships into the decision rule. Another
recent work from Feras Al-Obeidat et al. [10] used a new methodology named PSOPRO,
based on the fuzzy indifference relationship, and optimized its parameters using the
PSO (Particle Swarm Optimization) algorithm.
Fuzzy methods which have been used to produce fuzzy classification rules have also
proved to be effective in supervised machine learning. For example, Emel Kızılkaya
Aydogan et al. [18] introduced a fuzzy rule-based classifier (FRBCS) with a hybrid
heuristic approach (called hGA) to solve high dimensional classification problems in
linguistic fuzzy rule-based classification systems in 2012. All of these methods have
shown effective improvements over the k-NN algorithm.
The k-nearest neighbor [4], [11] algorithm computes the distance or similarity between
each test example z = (x′, y′) and all the training examples (xi, yi) ∈ DTRAIN to
determine its nearest neighbor list DNN. The test example is classified based on
distance weighted voting using equation (1):

$$y' = \arg\max_{v} \sum_{(x_i, y_i)\in D_{NN}} w_i \times I(v = y_i) \qquad(1)$$

where v is a class label, yi is the class label of one of the nearest neighbors and
I(·) is an indicator function that returns the value 1 if its argument is true and 0
otherwise. The distance between x′ and xi is d(x′, xi) and the corresponding weight is
$w_i = \frac{1}{d(x', x_i)^2}$.
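A short Python sketch of this distance-weighted voting rule is given below; the inverse-squared-distance weight follows the definition above, while the numerical guard against zero distances and the tie handling are assumptions.

```python
import numpy as np
from collections import defaultdict

def weighted_knn_predict(x_test, X_train, y_train, k=5):
    """Distance-weighted k-NN vote of eq. (1): each of the k nearest
    neighbours votes for its class label with weight 1 / d(x', xi)^2."""
    d = np.linalg.norm(np.asarray(X_train, dtype=float) - np.asarray(x_test, dtype=float), axis=1)
    votes = defaultdict(float)
    for i in np.argsort(d)[:k]:
        votes[y_train[i]] += 1.0 / max(d[i] ** 2, 1e-12)   # guard against d = 0
    return max(votes, key=votes.get)
```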
Let a conventional crisp subset A of a universal set of objects U be defined by
specifying the objects of the universe which are members of A. The characteristic
function of A provides an equivalent way of defining A, as µA : U → {0, 1} for all
x ∈ U. Mathematically, it can be expressed as

$$\mu_A(x) = \begin{cases}1, & x \in A\\ 0, & \text{otherwise}\end{cases} \qquad(2)$$
In a fuzzy membership system [4], characteristic functions are generalized to produce
a real value in the interval [0, 1], i.e. µ : U → [0, 1].
Let the set of sample vectors be {x1, x2, …, xn}. The degree of membership of each
vector in each of the c classes is specified by a fuzzy c-partition. It is denoted by a
c×n matrix U, where the degree of membership of xk in class i is µik = µi(xk) for
i = 1, 2, …, c and k = 1, 2, …, n. The following properties hold for U to be a fuzzy
partition:

$$\sum_{i=1}^{c}\mu_{ik} = 1;\qquad 0 < \sum_{k=1}^{n}\mu_{ik} < n;\qquad \mu_{ik}\in[0,1] \qquad(3)$$
Suppose the training data (DTRAIN), test data (DTEST) and nearest neighbor data (DNN)
sets are defined as described in Section 2.1. The adjusted weight ηi(x) for the
Euclidean distance di between the test data x and the training data xi is formulated in
equation (4); here di is the Euclidean distance between the test data and the i-th
training record. The fuzzy membership vectors (global and k-close) for each test data
record are found using these adjusted weights.

$$\eta_i(x) = \frac{1}{1 + \bar{d}_i}, \qquad \text{where } \bar{d}_i = \frac{d_i}{\sum_{i=1}^{h} d_i} \qquad(4)$$
For each test data record we define the Global Membership Vector and the K-Close
Membership Vector as GMV = (GM1, GM2, …, GMm) and KMV = (KM1, KM2, …, KMm)
respectively, where m is the number of classes present in the training dataset. Each
membership value in GMV and KMV follows the properties listed in equation (3). These
fuzzy membership vectors are evaluated using equations (5) and (6) given below.

$$GM_j = \frac{\sum_{(x_i, y_j)\in D_{TRAIN}} \eta_i(x)}{\sum_{i=1}^{h} \eta_i(x)} \qquad(5)$$

where the numerator is the total adjusted weight of the jth class label records in
DTRAIN and the denominator is the total adjusted weight of all class label records in
DTRAIN.

$$KM_j = \frac{\sum_{(x_i, y_j)\in D_{NN}} \eta_i(x)}{\sum_{i=1}^{k} \eta_i(x)} \qquad(6)$$

where the numerator is the total adjusted weight of the jth class label records in DNN,
the denominator is the total adjusted weight of all class label records in DNN, and h
and k are the total numbers of records in DTRAIN and DNN respectively.
A Weighted Membership Vector (WMV) is found after deriving the GMV and KMV vectors for
the test data x ∈ DTEST using a special Class Determinant Function (CDF) given in
equation (7). It produces a Determinant Function Vector DFV = (F1, F2, …, Fm).

$$\langle DFV \rangle = w_g \times \langle GMV \rangle + w_k \times \langle KMV \rangle \qquad(7)$$

where Fj = wg × GMj + wk × KMj. Here '×' represents scalar multiplication with a vector
and '+' denotes vector addition. wg ∈ (0, 1) and wk ∈ (0, 1) are the global weight and
the k-close weight respectively; these two variables define the weightage given to the
global membership vector and the k-close membership vector. The Weighted Membership
Vector WMV = (g1, g2, …, gm) is obtained by normalizing the values in the Determinant
Function Vector DFV.
The last step is to assign each test data record x ∈ DTEST to the right class yj by
applying the rule given in equation (8), where j ∈ {1, 2, …, m}:

$$x \in y_j \iff \max(g_1, g_2, \ldots, g_m) = g_j \qquad(8)$$
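The overall FuSCa decision procedure can be sketched in Python as follows. The adjusted weight is implemented as 1/(1 + normalised distance), following one reading of equation (4), and k, wg and wk are free parameters; this is an illustrative sketch rather than the authors' implementation.

```python
import numpy as np

def fusca_predict(x, X_train, y_train, classes, k=5, wg=0.4, wk=0.6):
    """Sketch of the FuSCa decision rule. The adjusted weight follows one
    reading of equation (4); k, wg and wk are free parameters of the method."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    d = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    eta = 1.0 / (1.0 + d / max(d.sum(), 1e-12))            # adjusted weights, eq. (4)
    nn = np.argsort(d)[:k]                                  # indices of D_NN
    gmv = np.array([eta[y_train == c].sum() for c in classes]) / eta.sum()        # eq. (5)
    eta_nn, y_nn = eta[nn], y_train[nn]
    kmv = np.array([eta_nn[y_nn == c].sum() for c in classes]) / eta_nn.sum()     # eq. (6)
    dfv = wg * gmv + wk * kmv                               # CDF, eq. (7)
    wmv = dfv / dfv.sum()                                   # normalised WMV
    return classes[int(np.argmax(wmv))]                     # decision rule, eq. (8)
```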
4 Experimental Analysis
The proposed FuSCa classification algorithm is implemented in Java using the open
source Java packages of the WEKA 3.7.9 [14] machine learning tool developed by Waikato
University. The performance of FuSCa is evaluated on 25 benchmark data sets from the
UCI repository [15]. In this paper, only the results obtained from six popular UCI
datasets are displayed.
To evaluate the performance of the proposed FuSCa classifier, 10-fold cross validation
is used for each standard benchmark dataset. The accuracy results are compared with
other standard machine learning classifiers and displayed in Table 1. These are C4.5
(J48), 1-NN, 3-NN, 5-NN, SMO, SMO with Polynomial Kernel (SVM2), SMO with RBF Kernel
(SVMG) and logistic regression (LogR). In this case, we used the WEKA [14] open source
implementations for this purpose. As observed from these results, FuSCa performs well
on most of the benchmark datasets.
To compare multiple classifiers, the Friedman test [16], [17] is considered one of the
best non-parametric statistical test methods. The following steps determine the
Friedman rankings of multiple classifiers (a short sketch of this ranking computation
is given after the list).
a. Collect the results observed for each pair of algorithm and data set.
b. For each observed data set i, ranks are assigned as integer values from 1 (best
algorithm) to k (worst algorithm). Ranks are denoted as $r_i^j$ ($1 \le r_i^j \le k$),
where k is the number of classifiers used in the comparison.
c. For each algorithm j, average the ranks obtained over all N datasets to obtain the
final rank, denoted by $R_j = \frac{1}{N}\sum_i r_i^j$.
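As referenced above, a short sketch of the average-rank computation follows; it uses SciPy's rankdata for per-dataset ranking (assigning average ranks to ties), which is an implementation choice rather than part of the original description.

```python
import numpy as np
from scipy.stats import rankdata

def friedman_average_ranks(results):
    """Average Friedman ranks: `results` is an (N datasets x k classifiers)
    array of accuracies; rank 1 goes to the best (highest accuracy)
    classifier on each dataset, ties receiving average ranks."""
    results = np.asarray(results, dtype=float)
    ranks = np.vstack([rankdata(-row) for row in results])   # per-dataset ranks r_i^j
    return ranks.mean(axis=0)                                # R_j for each classifier
```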
Fig. 1. Average Friedman ranks of the compared classifiers
Fig. 2. Comparison of accuracy results of different classifiers (including 1-NN, 3-NN, 5-NN, SVM and SVM2) on the breast, diabetes, glass, heart and iris datasets
In Fig. 2, the graph presents the classification accuracy percentage predicted by
different classifiers for six popular datasets available in the UCI Machine Learning
repository.
5 Conclusion
In this paper, a new methodology is used for classification with fuzzy membership to
improve the nearest neighbor classification technique. This supervised learning
technique is evaluated using 10-fold cross validation on several well-known UCI
Machine Learning datasets and compared with some standard algorithms which are
available in the open source WEKA data mining tool. The non-parametric Friedman
statistical test is applied to compare FuSCa with multiple classifiers. It was also
observed that
very little additional computational cost is required for the proposed fuzzy mechanism
compared to the original kNN method. The results indicate that FuSCa has significantly
improved the performance of the nearest neighbor based classification methods by using
the Global and K-close fuzzy membership values.
References
1. Derrac, J., García, S., Herrera, F.: Fuzzy nearest neighbor algorithms: Taxonomy, experi-
mental analysis and prospects. Information Sciences 260, 98–119 (2014)
2. Kotsiantis, S.B.: Supervised Machine Learning: A Review of Classification Techniques.
Informatica 31, 249–268 (2007)
3. Jóźwik, A.: A learning scheme for a fuzzy k-NN rule. Pattern Recognition Letters 1, 287–
289 (1983)
4. Keller, J.M., Gray, M.R., Givens, J.A.: A fuzzy k-nearest neighbor algorithm. IEEE Trans-
actions on Systems Man, and Cybernetics 15, 580–585 (1985)
5. Chung-Hoon, F., Hwang, C.: An interval type-2 fuzzy k-nearest neighbor. In: Proceedings
of the 12th IEEE International Conference on Fuzzy Systems, pp. 802–807 (2003)
6. Pham, T.D.: An optimally weighted fuzzy k-NN algorithm. In: Singh, S., Singh, M., Apte,
C., Perner, P. (eds.) ICAPR 2005. LNCS, vol. 3686, pp. 239–247. Springer, Heidelberg
(2005)
7. Hadjitodorov, S.: An intuitionistic fuzzy sets application to the k-NN method. Notes on In-
tuitionistic Fuzzy Sets 1, 66–69 (1995)
8. Jensen, R., Cornelis, C.: Fuzzy-rough nearest neighbour classification. IEEE Transactions
on Rough Sets 13, 56–72 (2011)
9. Sarkar, M.: Fuzzy-rough nearest neighbor algorithms in classification. Fuzzy Sets and Sys-
tems 158, 2134–2152 (2012)
10. Al-Obeidat, F., Belacel, N., Carretero, J.A., Mahanti, P.: An evolutionary framework using
particle swarm optimization for classification method PROAFTN. Applied Soft Compu-
ting 11, 4971–4980 (2011)
11. Tan, P.-N., Steinbach, M., Kumar, V.: Introduction to Data Mining. Pearson Education,
Inc. (2006)
12. Carvalho, A.M., Roos, T., Oliveira, A.L.: Discriminative Learning of Bayesian Networks
via Factorized Conditional Log-Likelihood. Journal of Machine Learning Research 12,
2181–2210 (2011)
13. Witten, H.: Data Mining: Practical Machine Learning Tools and Techniques. Morgan
Kaufmann Series in Data Management Systems (2005)
14. WEKA software, Machine Learning, The University of Waikato, Hamilton, New Zealand,
http://www.cs.waikato.ac.nz/ml/weka/
15. UCI Machine Learning Repository,
http://mlearn.ics.uci.edu/MLRepository.html
16. Friedman, M.: The use of ranks to avoid the assumption of normality implicit in the analy-
sis of variance. Journal of the American Statistical Association 32, 674–701 (1937)
17. García, S., Herrera, F.: An Extension on "Statistical Comparisons of Classifiers over Mul-
tiple Data Sets” for all Pairwise Comparisons. Journal of Machine Learning Research 9,
2677–2694 (2008)
18. Aydogan, E.K., Karaoglan, I., Pardalos, P.M.: hGA: Hybrid genetic algorithm in fuzzy
rule-based classification systems for high-dimensional problems. Applied Soft Compu-
ting 12, 800–806 (2012)
Choice of Implication Functions to Reduce Uncertainty
in Interval Type-2 Fuzzy Inferences
1 Introduction
Existing approaches to interval type-2 fuzzy reasoning (IT2 FS) [4], [5], [6] usually
consider extension of classical Mamdani type reasoning [3], [7], [9], [10]. There
exists ample scope of research to identify suitable implication functions from the
available list of implications with an aim to reduce the uncertainty in the inferential
space in IT2 fuzzy reasoning. Unfortunately, to the best of our knowledge, there
hardly exists any literature that compares the role of implication functions in reducing
uncertainty in type-2 inferential space. This paper would compare the relative merits
of implication functions in the context of uncertainty reduction in the footprint of
uncertainty in IT2 FS inferences.
Most of the control [11], instrumentation [13], tele-communication [2] and other
real-time applications [15], [16], [17] usually consider an observation of the linguistic
variable in the antecedent space to derive the inference [12]. The paper also considers
that an observation crisp value x = x′ in the antecedent space with an attempt to derive
the inference. Computer simulations undertaken confirm that Lukasiewicz-1 fuzzy
implication function outperforms its competitors with respect to uncertainty [8], [14]
measure induced by left and right end point centroids of IT2 fuzzy inference.
From this point on, the paper is organized as follows. Selection of the most efficient
implication functions that correspond to minimum uncertainty in the interval type-2
inference is given in section 2. The conclusions are listed in section 3.
in terms of the UMF and the LMF. Using this metric Z, the SOU obtained with different
implication functions has been discussed, and based on this metric the most efficient
implication function has been selected.
The IT2 inference is $\tilde{B}' = [\underline{\mu}_{\tilde{B}'}(y), \overline{\mu}_{\tilde{B}'}(y)] = [\underline{B}', \overline{B}']$. The right end point centroid of an IT2 fuzzy set $\tilde{A}$ is given by

$$c_r = \frac{\int_{-\infty}^{c_r} x\,\mu_{\tilde{A}}(x)\,dx + \int_{c_r}^{\infty} x\,\mu_{\tilde{A}}(x)\,dx}{\int_{-\infty}^{c_r} \mu_{\tilde{A}}(x)\,dx + \int_{c_r}^{\infty} \mu_{\tilde{A}}(x)\,dx} \qquad(5)$$
Theorem 1: If $\mu_{\tilde{A}'}(x) < \mu_{\tilde{A}}(x)$ at least for one x = xi with cr < xi < cl, then the SOU is reduced.

Proof: We divide the proof into two parts.

i) If $\mu_{\tilde{A}'}(x) < \mu_{\tilde{A}}(x)$ at least for one x = xi with cr < xi < cl, then the left end point centroid increases.

Simplifying equation (4), we have

$$\int_{-\infty}^{c_l} \mu_{\tilde{A}}(x)(c_l - x)\,dx = \int_{c_l}^{\infty} \mu_{\tilde{A}}(x)(x - c_l)\,dx$$

$$\int_{-\infty}^{c_l} \mu_{\tilde{A}'}(x)(c_l - x)\,dx < \int_{c_l}^{\infty} \mu_{\tilde{A}}(x)(x - c_l)\,dx, \quad (\text{since } \mu_{\tilde{A}'}(x) < \mu_{\tilde{A}}(x)) \qquad(7)$$

$$\int_{-\infty}^{c_l'} \mu_{\tilde{A}'}(x)(c_l' - x)\,dx = \int_{c_l'}^{\infty} \mu_{\tilde{A}}(x)(x - c_l')\,dx \quad (\text{let } c_l' \ge c_l)$$

$$c_l' = \frac{\int_{-\infty}^{c_l'} x\,\mu_{\tilde{A}'}(x)\,dx + \int_{c_l'}^{\infty} x\,\mu_{\tilde{A}}(x)\,dx}{\int_{-\infty}^{c_l'} \mu_{\tilde{A}'}(x)\,dx + \int_{c_l'}^{\infty} \mu_{\tilde{A}}(x)\,dx} \qquad(8)$$
The above equation shows that $c_l'$ is the left end point of FOU = $[\mu_{\tilde{A}}(x), \mu_{\tilde{A}'}(x)]$, where $c_l' > c_l$.
ii) If $\mu_{\tilde{A}'}(x) < \mu_{\tilde{A}}(x)$ at least for one x = xi with cr > xi > cl, then the right end point centroid decreases.

Again, simplifying equation (5), we obtain

$$\int_{-\infty}^{c_r} \mu_{\tilde{A}}(x)(c_r - x)\,dx = \int_{c_r}^{\infty} \mu_{\tilde{A}}(x)(x - c_r)\,dx$$

$$\int_{-\infty}^{c_r} \mu_{\tilde{A}}(x)(c_r - x)\,dx > \int_{c_r}^{\infty} \mu_{\tilde{A}'}(x)(x - c_r)\,dx, \quad (\text{since } \mu_{\tilde{A}'}(x) < \mu_{\tilde{A}}(x)) \qquad(9)$$

$$\int_{-\infty}^{c_r'} \mu_{\tilde{A}}(x)(c_r' - x)\,dx = \int_{c_r'}^{\infty} \mu_{\tilde{A}'}(x)(x - c_r')\,dx \quad (\text{let } c_r' < c_r)$$

$$c_r' = \frac{\int_{-\infty}^{c_r'} x\,\mu_{\tilde{A}}(x)\,dx + \int_{c_r'}^{\infty} x\,\mu_{\tilde{A}'}(x)\,dx}{\int_{-\infty}^{c_r'} \mu_{\tilde{A}}(x)\,dx + \int_{c_r'}^{\infty} \mu_{\tilde{A}'}(x)\,dx} \qquad(10)$$

The above equation shows that $c_r'$ is the right end point of FOU = $[\mu_{\tilde{A}}(x), \mu_{\tilde{A}'}(x)]$, where $c_r' < c_r$.

Thus, for $\mu_{\tilde{A}'}(x) \le \mu_{\tilde{A}}(x)$, the left end point increases and the right end point decreases, i.e., the SOU is reduced. □
Theorem 2: If $\mu_{\tilde{A}'}(x) > \mu_{\tilde{A}}(x)$ at least for one x = xi with cr > xi > cl, then the SOU is reduced.

Proof: Similar to the proof of Theorem 1. □
$$b_j' = \alpha \wedge \Big[\bigvee_{\forall i}\big(a_i \wedge (a_i \wedge b_j)\big)\Big] = \alpha \wedge \Big[\bigvee_{\forall i}(a_i)\Big] \wedge b_j = (\alpha \wedge M_A) \wedge b_j \qquad(11)$$

Now, following equation (6), we obtain the Z for the IT2 inference using Mamdani's implication as

$$Z = \sum_{\forall j}\Big[(\overline{\alpha} \wedge \overline{M}_A \wedge \overline{b}_j) - (\underline{\alpha} \wedge \underline{M}_A \wedge \underline{b}_j)\Big] \qquad(12)$$
$$b_j' = \alpha \wedge \Big[\bigvee_{\forall i}\big((a_i \wedge (1 - a_i)) \vee (a_i \wedge b_j)\big)\Big] = \alpha \wedge \Big[\Big\{\bigvee_{\forall i}(a_i \wedge (1 - a_i))\Big\} \vee \Big\{\bigvee_{\forall i}(a_i \wedge b_j)\Big\}\Big] = \alpha \wedge \Big[(A \circ \neg A) \vee \Big(\bigvee_{\forall i}(a_i) \wedge b_j\Big)\Big]$$
$$b_j' = \alpha \wedge ([a_i] \circ [r_{ij}]) = \alpha \wedge \big([a_i] \circ [\min\{1, (1 - a_i + b_j)\}]\big) = \alpha \wedge \Big[\bigvee_{\forall i}\big(a_i \wedge (1 - a_i + b_j)\big)\Big] \qquad(15)$$

Now, following equation (6), we obtain the Z for the IT2 inference using the Lukasiewicz implication as

$$Z_L = \sum_{\forall j}\Big[\Big\{\overline{\alpha} \wedge \bigvee_{\forall i}\big(\overline{a}_i \wedge (1 - \overline{a}_i + \overline{b}_j)\big)\Big\} - \Big\{\underline{\alpha} \wedge \bigvee_{\forall i}\big(\underline{a}_i \wedge (1 - \underline{a}_i + \underline{b}_j)\big)\Big\}\Big] \qquad(16)$$
$$Z_{L1} = \sum_{\forall j}\Big[\Big\{\overline{\alpha} \wedge \bigvee_{\forall i}\Big(\overline{a}_i \wedge \tfrac{1 - \overline{a}_i + (1+\lambda)\overline{b}_j}{1 + \lambda \overline{a}_i}\Big)\Big\} - \Big\{\underline{\alpha} \wedge \bigvee_{\forall i}\Big(\underline{a}_i \wedge \tfrac{1 - \underline{a}_i + (1+\lambda)\underline{b}_j}{1 + \lambda \underline{a}_i}\Big)\Big\}\Big], \ \text{where } \lambda > -1. \qquad(17)$$

$$Z_{L2} = \sum_{\forall j}\Big[\Big\{\overline{\alpha} \wedge \bigvee_{\forall i}\big(\overline{a}_i \wedge (1 - (\overline{a}_i)^w + (\overline{b}_j)^w)^{1/w}\big)\Big\} - \Big\{\underline{\alpha} \wedge \bigvee_{\forall i}\big(\underline{a}_i \wedge (1 - (\underline{a}_i)^w + (\underline{b}_j)^w)^{1/w}\big)\Big\}\Big], \ \text{where } w > 0. \qquad(18)$$
Table. Comparison of implication functions with respect to validity of the FOU and the SOU. For the Lukasiewicz implication, the condition $\bigvee_{\forall i}(\overline{a}_i \wedge (1 - \overline{a}_i + \overline{b}_j)) \ge \bigvee_{\forall i}(\underline{a}_i \wedge (1 - \underline{a}_i + \underline{b}_j))$ may not hold always; thus, it may sometimes result in an invalid FOU.
Example 1: Given the antecedent UMF $\overline{A}$ = [0.2 0.9 0.9 0.9 0.1] and LMF
$\underline{A}$ = [0.1 0.3 0.5 0.3 0.1], the consequent UMF $\overline{B}$ = [0.2 0.9 0.9 0.9 0.1]
and LMF $\underline{B}$ = [0.1 0.3 0.5 0.5 0.1], and the crisp input value x = x3:
$\overline{\alpha} = \overline{A}(x_3) = 0.9$ and $\underline{\alpha} = \underline{A}(x_3) = 0.5$.
Mamdani inference: $\overline{B}'$ = [0.2 0.9 0.9 0.9 0.1] and $\underline{B}'$ = [0.1 0.3 0.5 0.5 0.1].
Conclusion: valid FOU, Z = 1.5.
Kleene-Dienes inference: $\overline{B}'$ = [0.2 0.9 0.9 0.9 0.2] and $\underline{B}'$ = [0.5 0.5 0.5 0.5 0.5].
Conclusion: invalid FOU as $\underline{B}' > \overline{B}'$.
Lukasiewicz inference: $\overline{B}'$ = [0.3 0.9 0.9 0.9 0.2] and $\underline{B}'$ = [0.5 0.5 0.5 0.5 0.5].
Conclusion: invalid FOU as $\underline{B}' > \overline{B}'$.
Lukasiewicz-1 inference: $\overline{B}'$ = [0.5637 0.9 0.9 0.9 0.5013] and $\underline{B}'$ = [0.5 0.5 0.5 0.5 0.5].
Conclusion: valid FOU and reduced SOU at λ = −0.858, Z = 1.265.
Lukasiewicz-2 inference: $\overline{B}'$ = [0.5282 0.9 0.9 0.9 0.5068] and $\underline{B}'$ = [0.5 0.5 0.5 0.5 0.5].
Conclusion: valid FOU and reduced SOU at w = 2.25, Z = 1.235.
Here, we search for a suitable value of λ greater than −1 and w greater than 0 that
produces the minimum Z while satisfying $\overline{B}' \supseteq \underline{B}'$. This
search is done by simple Matlab programming.
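For illustration, the following Python sketch computes the IT2 inference bounds under the Lukasiewicz-1 implication and performs the grid search over λ described above. The per-bound use of (ᾱ, āi, b̄j) and (α̲, a̲i, b̲j), the clipping of the implication to [0, 1] and the sum form of Z follow the reconstruction of equations (16)-(18) given earlier, so they are assumptions rather than the authors' code.

```python
import numpy as np

def luk1_inference(a, b, alpha, lam):
    """One bound (UMF or LMF) of the IT2 inference under the Lukasiewicz-1
    implication: b'_j = alpha AND max_i( a_i AND (1 - a_i + (1+lam) b_j)/(1 + lam a_i) ),
    with the implication clipped to [0, 1] (an assumption)."""
    a = np.asarray(a, dtype=float)[:, None]
    b = np.asarray(b, dtype=float)[None, :]
    r = np.clip((1.0 - a + (1.0 + lam) * b) / (1.0 + lam * a), 0.0, 1.0)
    return np.minimum(alpha, np.max(np.minimum(a, r), axis=0))

def z_metric(b_upper, b_lower):
    """Span-of-uncertainty metric Z = sum_j (UMF_j - LMF_j) of the inference."""
    return float(np.sum(np.asarray(b_upper) - np.asarray(b_lower)))

def search_lambda(a_u, a_l, b_u, b_l, x_index, lams=np.linspace(-0.99, 5.0, 600)):
    """Grid search over lambda > -1 for a valid FOU (UMF >= LMF) with minimum Z,
    mirroring the simple search mentioned in the text."""
    alpha_u, alpha_l = a_u[x_index], a_l[x_index]
    best = None
    for lam in lams:
        up = luk1_inference(a_u, b_u, alpha_u, lam)
        low = luk1_inference(a_l, b_l, alpha_l, lam)
        if np.all(up >= low):                       # valid FOU
            z = z_metric(up, low)
            if best is None or z < best[1]:
                best = (lam, z)
    return best                                     # (lambda, Z) or None if never valid
```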
3 Conclusion
possibility of having the LMF crossing the UMF in the inferential space. While
comparing the suitability of implication functions for reducing uncertainty in IT2
inference, it is observed that the Lukasiewicz-1 and Lukasiewicz-2 implications are
good choices for their flexibility in controlling the relational surface through the
parameters λ and w. Experiments reveal that for a suitable λ/w on the real axis
(λ > −1, w > 0), the inference yields a reduced Z, resulting in a reduction in the
span of uncertainty.
Detection of Downy Mildew Disease Present in the Grape
Leaves Based on Fuzzy Set Theory
1 Introduction
Grapes are one of the most widely grown fruit crops in the world with significant
plantings in India. Grapes are used in the production of wine, brandy, or non-
fermented drinks and are eaten fresh or dried as raisins. It is a very important cash
crop in India. However, grape plants are sometimes affected by downy mildew, a serious fungal disease caused by Plasmopara viticola. The disease is hard to identify reliably by direct naked-eye observation, so it is difficult to diagnose accurately and effectively with traditional plant disease diagnosis methods that depend mainly on such observation [5]. Downy mildew of grape leaves is among the oldest plant diseases known to man and is highly destructive to grapevines in all grape-growing areas of the world. Early literature on grape cultivation mentions this devastating disease and its ability to destroy entire
grape plantations [4]. Since the discovery of downy mildew, numerous studies have been conducted on the life cycle of the downy mildew pathogen and its management. The information gained from these studies has enabled the development of best management practices that reduce the impact of the disease. Today, worldwide epidemic losses are rare, though the disease can occur at significant levels in particular fields or throughout a particular growing region. If it is not controlled, the downy mildew fungus not only affects fruit yield and quality but also reduces vine growth and winter hardiness.
Significant progress has been made in the use of image processing approaches to detect various diseases in other crops. The work by Boso et al. confirmed that image processing can provide a means of rapid, reliable and quantitative early detection of these diseases [1]. Weizheng et al. emphasized that both the quality and quantity of agricultural produce are highly reduced by the various plant diseases [10]. A semi-automated image processing method was developed by Peressotti et al. (2011) to measure the development of the downy mildew pathogen by quantifying the sporulation of the fungus on the grapevine [6]. In order to quantify the sporulation, the image was converted to 8-bit format using median-cut colour quantization [3] and the contrast was then adjusted to enhance the white sporulated area of the leaf. An image recognition method was reported by Li et al. (2012) for the diagnosis of vines with downy mildew and powdery mildew disease [5]. Images were pre-processed using nearest-neighbour interpolation to compress the image prior to filtering with a median filter, and finally diseased regions were segmented via a K-means [11], [12] clustering algorithm. Then an SVM classifier was evaluated for its performance in disease recognition.
In this paper, we present an efficient technique to classify downy mildew diseased and non-diseased grape leaves based on K-means clustering and fuzzy set theory. The proposed technique consists of two stages. The first is a feature reduction stage in which the dominating features are obtained from a list of twenty features. The second stage is the detection of downy mildew disease in grape leaves. In the first stage, we create a feature matrix by calculating all twenty features of the given diseased images. A normalized feature matrix is then generated, and the K-means algorithm is applied to each individual feature (column). Each cluster is represented by a linguistic symbol, and the fuzzy value of each linguistic symbol (cluster) for each feature is calculated with respect to the total number of diseased images. A fuzzy matrix is generated from the normalized feature matrix, and the fuzzy value of each cluster for each feature is calculated. Those features whose maximum fuzzy value is greater than a predefined threshold are selected. In the second stage, we first generate the values of the selected features for all test images. Each normalized feature value in the normalized feature matrix is then replaced by the fuzzy value of the closest cluster of the respective feature. The average fuzzy value is calculated for each image, and those images whose average fuzzy value is greater than or equal to the experimental threshold are detected as having downy mildew disease.
The rest of the paper is organized as follows. Preliminaries are given in Section 2. The proposed method is described in Section 3. In Section 4, we present experimental results. Concluding remarks appear in Section 5.
2 Preliminaries
The six most common diseases of grapes are black rot, downy mildew, powdery mildew, dead arm, crown gall, and gray mold. Downy mildew (Fig. 1) is a serious fungal disease caused by Plasmopara viticola. It attacks the leaves, shoots, fruit, and their tendrils during their immature stage. Spores formed in the infected parts of plants are blown to leaves, shoots, and blossom clusters during cool, moist weather in the spring and early summer. The fungus attacks shoots, tendrils, petioles, leaf veins, and fruit. The disease appears first on fruits as dark red spots. These spots gradually become circular, sunken and ashy-gray, and in late stages they are surrounded by a dark margin. The spots vary in size from 1/4 inch in diameter to about half the fruit.
Fig. 1. Disease free and diseased images
Inertia: $\mathrm{Inertia} = \sum_i \sum_j (i - j)^2\, c(i, j)$   (1)
Correlation: It measures the sum of joint probabilities of occurrence of every possible gray-level pair.
$\mathrm{Correlation} = \sum_i \sum_j \dfrac{(i - \mu_i)(j - \mu_j)\, c(i, j)}{\sigma_i \sigma_j}$   (2)
$\sigma_i = \sum_i \sum_j (i - \mu_i)^2\, c(i, j)$   (3)
$\sigma_j = \sum_i \sum_j (j - \mu_j)^2\, c(i, j)$   (4)
Energy: Also known as uniformity or the angular second moment, it provides the sum of the squared elements in the matrix.
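As an illustration, the following sketch (not the authors' code) shows how a gray-level co-occurrence matrix c(i, j) and the texture features above could be computed with NumPy; the offset, the number of gray levels and the random test patch are assumptions.

import numpy as np

def glcm(img, levels=8):
    # normalised co-occurrence matrix for horizontally adjacent pixel pairs
    c = np.zeros((levels, levels))
    q = (img.astype(float) / img.max() * (levels - 1)).astype(int)  # quantise gray levels
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        c[a, b] += 1
    return c / c.sum()

def texture_features(c):
    i, j = np.indices(c.shape)
    mu_i, mu_j = (i * c).sum(), (j * c).sum()
    # the standard GLCM correlation uses the standard deviations
    sigma_i = np.sqrt(((i - mu_i) ** 2 * c).sum())
    sigma_j = np.sqrt(((j - mu_j) ** 2 * c).sum())
    return {
        "inertia": ((i - j) ** 2 * c).sum(),
        "correlation": ((i - mu_i) * (j - mu_j) * c).sum() / (sigma_i * sigma_j),
        "energy": (c ** 2).sum(),          # angular second moment
    }

leaf = np.random.randint(0, 256, (64, 64))   # stand-in for a leaf image patch
print(texture_features(glcm(leaf)))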
fixed once and for all [2], [8], [9]. This generally means the concept is vague, lacking
a fixed, precise meaning, without however being meaningless altogether [7].
3 Proposed Technique
The proposed technique consists of two stages: the first is a feature selection stage, based on fuzzy set theory, applied over a list of twenty features computed from the diseased images; the second is a verification stage that decides whether an image is diseased or non-diseased.
In this phase, we first determine the twenty feature values for all diseased images (say, m diseased images). A two-dimensional matrix of size m x 20 is then generated. Since the range sets of the different features are not the same, we map all feature values to the common range (0, 1); thus a normalized feature matrix is generated in which any feature can be compared with the others. The K-means algorithm is applied to each individual column of the normalized feature matrix with K = 3. After clustering each column, linguistic symbols are assigned to the clusters of that column, each linguistic symbol representing one cluster of the corresponding column. The frequency of each linguistic symbol is then calculated, and the fuzzy value of that linguistic symbol is determined with respect to the total number of diseased images. The cluster information (cluster centers and fuzzy values) is stored for each feature, and the linguistic symbol with the maximum fuzzy value is chosen for each feature. Those features whose maximum fuzzy value is greater than a predefined threshold are selected.
The following algorithm describes the feature selection based on fuzzy value.
Step5: Store the cluster information (cluster centers and fuzzy values) for each
feature.
Step6: Select the linguistic symbol corresponding to each feature having maximum
fuzzy value.
Step7: After calculating the n maximum fuzzy values, each of which corresponds to a feature, select those features whose maximum fuzzy value is greater than the predefined threshold value. Assume that p features have been selected.
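A minimal sketch of this feature-selection stage, assuming scikit-learn's KMeans for the column-wise clustering; matrix shapes, k and the threshold follow the text, while all names are illustrative.

import numpy as np
from sklearn.cluster import KMeans

def select_features(F, k=3, fuzzy_threshold=0.5):
    # F: (m x n) feature matrix of the diseased images.
    # Returns the indices of the selected (dominating) features and, per feature,
    # the cluster centers and cluster fuzzy values.
    m, n = F.shape
    # column-wise normalisation into the (0, 1) interval
    N = (F - F.min(axis=0)) / (F.max(axis=0) - F.min(axis=0) + 1e-12)
    selected, cluster_info = [], {}
    for j in range(n):
        km = KMeans(n_clusters=k, n_init=10, random_state=0)
        labels = km.fit_predict(N[:, j].reshape(-1, 1))
        # fuzzy value of a cluster = its frequency / total number of diseased images
        fuzzy = np.bincount(labels, minlength=k) / float(m)
        cluster_info[j] = (km.cluster_centers_.ravel(), fuzzy)
        if fuzzy.max() > fuzzy_threshold:
            selected.append(j)
    return selected, cluster_info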
Step1: Extract those features {f1, f2,…,fp} from the images which have been determined by the Feature Selection Algorithm. F = (fij)m x p, where fij represents the jth feature of the ith image and p is the number of features.
Step2: Load Cluster information corresponding to each feature (generated by Feature
Selection Algorithm)
Step3: For each ith image (i = 1 to m)
3.1 Determine the cluster (corresponding to each feature) to which each
feature value (f1, f2,…, fp) of the ith image must belong to. The feature
value belongs to the cluster from whose center its distance is the smallest.
3.2 fzij = fuzzy value of the cluster to which the jth feature of the ith image
belongs to.
3.3 Calculate fuzzy value of ith image, fvi = (∑ fzij)/p for j= 1 to p.
Step4: Depending upon a predefined threshold for fuzzy value (fv), images are
marked as diseased or disease free.
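A matching sketch of this detection stage (Steps 1–4 above), reusing the cluster information returned by select_features from the previous sketch; the test features are assumed to be normalised in the same way as the training matrix.

import numpy as np

def detect(F_test, selected, cluster_info, detection_threshold=0.47):
    # F_test: (q x p) matrix of the selected feature values of the test images,
    # with columns ordered as in `selected`.
    results = []
    for row in F_test:
        fz = []
        for col, value in zip(selected, row):
            centers, fuzzy = cluster_info[col]
            nearest = np.argmin(np.abs(centers - value))   # closest cluster centre
            fz.append(fuzzy[nearest])                      # fuzzy value of that cluster
        fv = np.mean(fz)                                   # average fuzzy value of the image
        results.append(fv >= detection_threshold)          # diseased if above the threshold
    return results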
4 Experimental Results
We applied the fuzzy-set-based feature selection algorithm to a list of diseased images to obtain the most dominating features from a list of twenty features. In this stage, we set the value of k in k-means to 3 and the fuzzy threshold to 0.5. The algorithm reduces the twenty features to nine. In the second stage, a list of images is considered for the detection of downy mildew disease in grape leaves. We experimentally obtained a threshold value of 0.47, for which the success rate for detection of downy mildew disease is 87.09%. We experimented on 31 images, and the results are shown in Table 2. Column 3 of Table 2 gives the number of selected features along with the list of feature numbers chosen by the feature selection algorithm.
Table 2.
No. of features after reduction, {feature list}: 9, {2, 3, 5, 10, 12, 14, 18, 19, 20}
No. of clusters: 3
Fuzzy threshold: 0.5
Detection threshold: 0.47
No. of false positives: 2
No. of false negatives: 2
Truly detected: 27
Success rate: 87.09%
5 Conclusion
This paper presents a novel technique for the detection of downy mildew disease in grape leaves. The proposed technique uses a feature reduction algorithm that finds the most dominating features from a list of features in this domain, based on the K-means algorithm and fuzzy set theory. The dominating features are obtained from the diseased images only. These features are then calculated for a list of test images, which are later classified by the proposed detection algorithm into diseased and non-diseased images. The experimental results show that our technique yields good results.
Acknowledgement. The authors would like to thank Dr. Amitava Ghosh, ex-
Economic Botanist IX, Agriculture dept., Govt. of West Bengal, presently attached as
Guest Faculty, Dept. of Genetics and Plant Breeding, Institute of Agriculture, Calcutta
University, for providing grape leaves images with scientist’s comments.
References
1. Boso, S., Santiago, J.L., Martínez, M.C.: Resistance of Eight Different Clones of the Grape Cultivar Albariño to Plasmopara viticola. Plant Disease 88(7), 741–744 (2004)
2. Maurus, V.B., James, N.M., Ronald, W.M., Patrick, F.: Inheritance of Downy Mildew
Resistance in Table Grapes. J. Amer. Soc. Hort. Sci. 124(3), 262–267 (1999)
3. Heckbert, P.: Color image quantization for frame buffer display. In: Proceedings of 9th
Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 1982,
pp. 297–307 (1982)
4. Hofmann, U.: Plant protection strategies against downy mildew in organic viticulture. In:
Proceedings of International Congress of Organic Viticulture, pp. 167–174 (2000)
5. Li, G., Ma, Z., Wang, H.: Image recognition of grape downy mildew and grape powdery
mildew based on support vector machine. In: Li, D., Chen, Y. (eds.) CCTA 2011, Part III.
IFIP AICT, vol. 370, pp. 151–162. Springer, Heidelberg (2012)
6. Peressotti, E., Duchêne, E., Merdinoglu, D., Mestre, P.: A semi-automatic non-destructive
method to quantify downy mildew sporulation. Journal of Microbiological Methods 84,
265–271 (2011)
7. Dietz, R., Moruzzi, S.: Cuts and clouds. Vagueness, Its Nature, and Its Logic. Oxford
University Press (2009)
8. Ridler, T.W., Calvard, S.: Picture thresholding using an iterative selection method. IEEE
Transactions on Systems, Man and Cybernetics 8, 630–632 (1978)
9. Haack, S.: Deviant logic, fuzzy logic: beyond the formalism. University of Chicago Press,
Chicago (1996)
10. Weizheng, S., Yachun, W., Zhanliang, C., Hongda, W.: Grading Method of Leaf Spot
Disease Based on Image Processing. In: Proceedings of the 2008 International Conference
on Computer Science and Software Engineering, vol. 6, pp. 491–494 (2008)
11. Samma, A.S.B., Salam, R.A.: Adaptation of K-Means Algorithm for Image Segmentation.
International Journal of Information and Communication Engineering 5(4), 270–274
(2009)
12. Redmond, S.J., Heneghan, C.: A method for initialising the K-means clustering algorithm
using kd-trees. Pattern Recognition Letters 28, 965–973 (2007)
Facial Expression Synthesis for a Desired Degree
of Emotion Using Fuzzy Abduction
1 Introduction
This section aims at determining the desired facial features of a given subject to
ensure a desired degree of specific emotions. We presume that the facial expressions
of the person under consideration for varied degrees of emotion are available. Now,
for an unknown degree of emotion, we need to determine the facial features that
correctly represent the desired facial expression containing the required degree of
emotion. The present problem has been formulated here as an abductive reasoning
problem. There exist several research works on fuzzy abduction; some of the well-known techniques are listed in [12], [7], and [16]. Among these, the work reported in [12] is the simplest, and we attempt to solve the problem by this method here.
By [12], the choice of $q_{kj}$, ∀k, j, from the set $\{r_{j1}, r_{j2}, \ldots, r_{jk}, \ldots, r_{jn}\}$ is governed by the following two criteria:
$\bigvee_{j=1}^{n}(q_{kj} \wedge r_{jk})$ to be close to 1 (criterion 1)
and $\bigvee_{j=1}^{n}(q_{kj} \wedge r_{ji})$, $i \neq k$, to be close to 0 (criterion 2).
The above two criteria can be combined into the single criterion depicted below:
$(q_{kj} \wedge r_{jk}) - \bigvee_{i=1,\, i \neq k}^{n}(q_{kj} \wedge r_{ji})$ is to be maximized, where $q_{kj} \in \{r_{j1}, r_{j2}, \ldots, r_{jk}, \ldots, r_{jn}\}$.
End For;
sort (αw ,βw) || this procedure sorts the elements of the array αw
and saves them in βw in descending order ||
For w:= 1 to n-1
if β1 = βw+1
qkj : = rjw ;
print qkj;
End For;
End For;
End For;
End.
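The following sketch illustrates the column-wise choice of q_kj implied by the combined criterion above, in the spirit of the heuristic of [12]; it is an assumed implementation, not the algorithm as published.

import numpy as np

def max_min_pre_inverse(R):
    # R: n x n fuzzy relation with entries r_ji indexed as R[j, i].
    # For each (k, j), pick q_kj from {r_j1, ..., r_jn} so that
    # min(q_kj, r_jk) - max_{i != k} min(q_kj, r_ji) is maximized.
    n = R.shape[0]
    Q = np.zeros((n, n))
    for k in range(n):
        for j in range(n):
            others = [i for i in range(n) if i != k]
            best_c, best_score = 0.0, -np.inf
            for c in R[j, :]:                              # candidate values for q_kj
                score = min(c, R[j, k]) - max((min(c, R[j, i]) for i in others),
                                              default=0.0)
                if score > best_score:
                    best_c, best_score = c, score
            Q[k, j] = best_c
    return Q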
3 Experiments
In this experiment, we have taken fourteen different faces of a person (see Fig. 1).
Now, three steps are given to formulate the desired facial features of the person when the degree of emotion is provided.
The pixel values for end-to-end-lower-lip of these fourteen images are collected as [88 89 91 94 95 99 101 103 105 106 108 108 108 108] (see Fig. 2). The maximum (Mx), minimum (Mn) and median (Med) pixel values of end-to-end-lower-lip are Mx = 108, Mn = 88, and Med = 99. Now, end-to-end-lower-lip is fuzzified into three fuzzy sets, HIGH, MEDIUM and LOW, using the variable end-to-end-lower-lip, Li = [88 90 92 94 96 98 100 102 104 106 108], in the mathematical expressions given below.
$\mu_{HIGH}(L_i) = \max\Big(0.1, \dfrac{L_i - Med}{Mx - Med}\Big) = \max\Big(0.1, \dfrac{L_i - 99}{108 - 99}\Big)$   (1)
$\mu_{MEDIUM}(L_i) = \dfrac{L_i - Mn}{Med - Mn} = \dfrac{L_i - 88}{99 - 88}$, for $L_i \le 99$   (2)
$\qquad\qquad\;\; = \dfrac{Mx - L_i}{Mx - Med} = \dfrac{108 - L_i}{108 - 99}$, for $L_i > 99$
$\mu_{LOW}(L_i) = \max\Big(0.1, \dfrac{Med - L_i}{Med - Mn}\Big) = \max\Big(0.1, \dfrac{99 - L_i}{99 - 88}\Big)$.   (3)
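A small sketch reproducing this fuzzification for the eleven sampled values of Li (the variable names and printed formatting are assumptions):

import numpy as np

Li = np.arange(88, 110, 2)                      # 88, 90, ..., 108
Mx, Mn, Med = 108.0, 88.0, 99.0

mu_high = np.maximum(0.1, (Li - Med) / (Mx - Med))
mu_low = np.maximum(0.1, (Med - Li) / (Med - Mn))
mu_medium = np.where(Li <= Med, (Li - Mn) / (Med - Mn), (Mx - Li) / (Mx - Med))

for x, h, m, l in zip(Li, mu_high, mu_medium, mu_low):
    print(f"Li={x:3d}  HIGH={h:5.2f}  MEDIUM={m:5.2f}  LOW={l:5.2f}")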
Fig. 1. Fourteen different faces of a person
Fig. 2. End-to-end-lower-lip measurement from a cropped mouth portion
[Plot of the membership functions µ_HIGH, µ_MEDIUM and µ_LOW over end-to-end-lower-lip, Li (88–108), together with the corresponding curves for Hi.]
Using fuzzy rules and membership functions we first construct relational matrices (using Mamdani's implication). The fuzzy rules and the corresponding relational matrices are
R1 =
[1    0.75 0.5  0.25 0.1 0.1 0.1 0.1 0.1 0.1 0.1]
[1    0.75 0.5  0.25 0.1 0.1 0.1 0.1 0.1 0.1 0.1]
[0.66 0.66 0.5  0.25 0.1 0.1 0.1 0.1 0.1 0.1 0.1]
[0.33 0.33 0.5  0.25 0.1 0.1 0.1 0.1 0.1 0.1 0.1]
[0.1  0.1  0.1  0.1  0.1 0.1 0.1 0.1 0.1 0.1 0.1]
[0.1  0.1  0.1  0.1  0.1 0.1 0.1 0.1 0.1 0.1 0.1]
[0.1  0.1  0.1  0.1  0.1 0.1 0.1 0.1 0.1 0.1 0.1]
[0.1  0.1  0.1  0.1  0.1 0.1 0.1 0.1 0.1 0.1 0.1]
[0.1  0.1  0.1  0.1  0.1 0.1 0.1 0.1 0.1 0.1 0.1]
[0.1  0.1  0.1  0.1  0.1 0.1 0.1 0.1 0.1 0.1 0.1]
[0.1  0.1  0.1  0.1  0.1 0.1 0.1 0.1 0.1 0.1 0.1]

R2 =
[0.1 0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1 0.1]
[0.1 0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1 0.1]
[0.1 0.25 0.33 0.33 0.33 0.33 0.33 0.33 0.25 0.1 0.1]
[0.1 0.25 0.5  0.66 0.66 0.66 0.66 0.5  0.25 0.1 0.1]
[0.1 0.25 0.5  0.75 1    1    0.75 0.5  0.25 0.1 0.1]
[0.1 0.25 0.5  0.75 0.75 0.75 0.75 0.5  0.25 0.1 0.1]
[0.1 0.25 0.5  0.5  0.5  0.5  0.5  0.5  0.25 0.1 0.1]
[0.1 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.1 0.1]
[0.1 0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1 0.1]
[0.1 0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1 0.1]
[0.1 0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1 0.1]

and R3 =
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1  0.1  0.1  0.1]
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1  0.1  0.1  0.1]
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1  0.1  0.1  0.1]
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1  0.1  0.1  0.1]
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1  0.1  0.1  0.1]
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1  0.1  0.1  0.1]
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1  0.1  0.1  0.1]
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.25 0.25 0.25 0.25]
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.25 0.5  0.5  0.5]
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.25 0.5  0.75 0.75]
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.25 0.5  0.75 1].
Now, let the observed fuzzy set for 80% happiness (µ_HIGH) be given as B′ = [0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.25 0.5 0.75 1]. Therefore, the fuzzy inference is
A′ = (B′ o Q1) ∪ (B′ o Q2) ∪ (B′ o Q3) = [0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.25 0.5 0.75 1].
3.4 Results
c = (0.1×88 + 0.1×90 + 0.1×92 + 0.1×94 + 0.1×96 + 0.1×98 + 0.1×100 + 0.25×102 + 0.5×104 + 0.75×106 + 1×108) / (0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.25 + 0.5 + 0.75 + 1) = 103.375
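The defuzzification above can be checked with a few lines of code (a sketch, not the authors' program):

import numpy as np

Li = np.arange(88, 110, 2)                      # end-to-end-lower-lip universe
A_prime = np.array([0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.25, 0.5, 0.75, 1.0])

c = np.sum(A_prime * Li) / np.sum(A_prime)      # centroid defuzzification
print(c)                                        # 103.375 -> desired lip width for 80% happiness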
4 Conclusions
References
1. Arnould, T., et al.: Backward-chaining with fuzzy "if... then..." rules. In: Proc. 2nd IEEE Inter. Conf. Fuzzy Systems, pp. 548–553 (1993)
2. Arnould, T., Tano, S.: Interval-valued fuzzy backward reasoning. IEEE Trans. Fuzzy Systems 3(4), 425–437 (1995)
3. El Ayeb, B., et al.: A New Diagnosis Approach by Deduction and Abduction. In: Proc. Int'l Workshop Expert Systems in Eng. (1990)
4. Bhatnagar, R., Kanal, L.N.: Structural and Probabilistic Knowledge for Abductive
Reasoning. IEEE Trans. on Pattern Analysis and machine Intelligence 15(3), 233–245
(1993)
5. Bylander, T., et al.: The computational complexity of abduction. Artificial Intelligence 49,
25–60 (1991)
6. de Campos, L.M., Gámez, J.A., Moral, S.: Partial Abductive Inference in Bayesian Belief
Networks—An Evolutionary Computation Approach by Using Problem-Specific Genetic
Operators. IEEE Transactions on Evolutionary Computation 6(2) (April 2002)
7. Chakraborty, S., Konar, A., Jain, L.C.: An efficient algorithm to computing Max-Min
inverse fuzzy relation for Abductive reasoning. IEEE Trans. on SMC-A, 158–169 (January
2010)
8. Charniak, E., Shimony, S.E.: Probabilistic Semantics for Cost Based Abduction. In: Proc. AAAI 1990, pp. 106–111 (1990)
9. Hobbs, J.R.: An Integrated Abductive Framework for Discourse Interpretation. In:
Proceedings of the Spring Symposium on Abduction, Stanford, California (March 1990)
10. Pedrycz, W.: Inverse Problem in Fuzzy Relational Equations. Fuzzy Sets and Systems 36,
277–291 (1990)
11. Peng, Y., Reggia, J.A.: Abductive Inference Models for Diagnostic Problem-Solving.
Springer-Verlag New York Inc. (1990)
12. Saha, P., Konar, A.: A heuristic algorithm for computing the max-min inverse fuzzy
relation. Int. J. of Approximate Reasoning 30, 131–147 (2002)
13. Yamada, K., Mukaidono, M.: Fuzzy Abduction Based on Lukasiewicz Infinite-valued
Logic and Its Approximate Solutions. In: FUZZ IEEE/IFES, pp. 343–350 (March 1995)
14. Petrantonakis, P.C., Hadjileontiadis, L.J.: Emotion Recognition from EEG Using Higher
Order Crossings. IEEE Transactions on Information Technology in Biomedicine 14(2)
(March 2010)
15. Klir, G.J., Yuan, B.: Approximate reasoning: Fuzzy sets and fuzzy Logic (2002)
16. Chakraborty, A., Konar, A., Pal, N.R., Jain, L.C.: Extending the Contraposition Property
of Propositional Logic for Fuzzy Abduction. IEEE Transaction on Fuzzy Systems 21, 719–
734 (2013)
17. Konar, A.: Artificial Intelligence and Soft Computing. CRC Press LLC (2000)
A Novel Semantic Similarity Based Technique
for Computer Assisted Automatic Evaluation of Textual
Answers
Udit Kr. Chakraborty¹, Samir Roy², and Sankhayan Choudhury³
¹ Department of Computer Science & Engineering, Sikkim Manipal Institute of Technology, Sikkim
² Department of Computer Science & Engineering, National Institute of Technical Teachers’ Training & Research, Kolkata
³ Department of Computer Science & Engineering, University of Calcutta, Kolkata
{udit.kc,samir.cst,sankhayan}@gmail.com
1 Introduction
Evaluation is an important and critical part of the learning process. The evaluation of
learners’ response decides not only the amount of knowledge gathered by the learner
but also contributes towards refinement of the learning process. The task requires the
evaluator to have the required knowledge and also to be impartial, benevolent and
intelligent. However, all of these qualities may not always be present in human
evaluators, who are also prone to fatigue. Reasons such as these, and the requirement of performing evaluation on a larger scale, necessitate the implementation of automated systems for evaluating learners’ responses. Such mechanized processes would not only be free from fatigue and partiality but would also be able to evaluate across geographical distances if implemented in e-Learning systems, whose importance, popularity and penetration are on the rise.
However, the task of machine evaluation is easier said than done for reasons of
complexity in natural languages and the lack of our abilities in understanding them.
These reasons have given rise to the popularity of other types of assessment
techniques namely multiple choice questions, order matching, fill in the blanks etc.,
which in spite of their own roles are not fully reliable for evaluation of fulfillment of
learning outcomes. Whether the efficacy of questions requiring free text responses is greater than that of the other types is debatable, but it is beyond contention that free text responses test the learners’ ability to explain, deduce and logically derive, among other skills that are not brought out by the other types of question answering systems.
The problem with the evaluation of free text responses lies in the variation in answering and evaluation. Since each learner’s response is presented in his or her unique style and words, the same answer can be written in different ways due to the richness in
form and structure of natural languages. Another problem is in the score assigned to
the answer since the score assigned can vary from one individual to another.
The computational challenge imposed by this task is immense, because, to
determine the degree of correctness of the response the meaning of the sentence has to
be extracted. The semantic similarity which is a means of finding the relation that
exists between the meaning of the words and meaning of sentences also needs to be
considered.
The work presented in this paper proposes an automated system that evaluates the
free text responses of the learner. The approach deviates from currently existing techniques in a few areas and considers not only important keywords but also the words before and after them. Unlike the n-gram technique, the number of words before and after a keyword is not fixed; it varies depending on the occurrence of the next keyword. The current work is limited to single-sentence responses only.
2 Previous Work
However, large-scale acceptance of these systems has not yet taken place, and a complete replacement of the human evaluator is still a long way off.
3 Proposed Methodology
The Answer Evaluation (AE) Module consists of two parts, one for the teacher and the other for the learner. The role of the teacher is to fix the model answer for a given question and to fix the parameters of evaluation. This is similar to preparing a solution scheme of evaluation by the teacher, which is referred to while actually evaluating the responses written by learners. The learners merely use the AE module to type in their textual responses to the questions presented.
The task chalked out for the AE module can be stated as:
Given a question Q, its model text based answer MA and a learner response SA, the
AE module should be able to evaluate SA on a scale of [0, 1] with respect to MA.
• If the SA is completely invalid or contradictory to MA, then it is an incorrect
response and a value 0 is returned.
• If the SA is exactly the same as the MA, or is a paraphrase of the MA, or is a complete semantic match, then it is a correct response and a value 1 is returned.
• If the SA is non-contradictory and is a partial semantic match for MA, then the
response is partially correct and a value greater than 0 and less than 1 is returned
depending upon the match.
This work is built upon the understanding that an answer to a question is a collection of keywords and their associated pre- and post-expressions, which add sense to the keywords in the context of the question and also establish links between them. Unlike the nugget approach, which considers only keywords as the building blocks, the current approach considers the preceding and following sets of words as well. The choice of pre- and post-expressions is based not on the popular n-gram technique but on the occurrence of the next keyword. It is also worth mentioning that, unlike other natural language processing approaches, we do not remove the stop words from a response, as we consider these to be important information carriers. Fig. 1 presents the idea of how the answers are perceived by the model.
Since the system deals with natural language answers, we do not attach any weight to the order in which the keywords appear while evaluating the responses. Also, a particular part of the response may act as the post-expression of one keyword and the pre-expression of the next keyword, in which case it is considered twice, depending upon the solution scheme presented by the teacher.
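The following sketch (an assumed helper, not the authors' code) illustrates how a response can be split into keywords and their variable-length pre- and post-expressions; the example keywords are hypothetical.

import re

def split_on_keywords(answer, keywords):
    # Returns a list of (pre_expression, keyword, post_expression) triples.
    # The text between two consecutive keywords serves both as the post-expression
    # of the first keyword and the pre-expression of the second.
    tokens = re.findall(r"\w+|\S", answer.lower())
    positions = [i for i, tok in enumerate(tokens) if tok in keywords]
    triples = []
    for idx, pos in enumerate(positions):
        start = positions[idx - 1] + 1 if idx > 0 else 0
        end = positions[idx + 1] if idx + 1 < len(positions) else len(tokens)
        pre = " ".join(tokens[start:pos])
        post = " ".join(tokens[pos + 1:end])
        triples.append((pre, tokens[pos], post))
    return triples

print(split_on_keywords(
    "In AVL search tree the time required for searching is short",
    {"avl", "tree", "searching", "short"}))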
Each pre-expression and post-expression is again broken up into four parts, namely
logic, certainty, count and part-of, which are the expected types of senses that these
expressions attach to the keywords. There is however, no fixed order in which the
words belonging to any of these categories would appear and it is also possible that
they do not appear at all. Lists of words have been prepared for each of these four categories, as shown in Tables 1, 2, 3 and 4, respectively.
Table 1. Logic Expressions and their associated logic

S.No.  Logic Expression                                    Logic
1      and                                                 Conjunction
2      or                                                  Disjunction
3      either-or                                           Exclusive Disjunction
4      only if                                             Implication
5      if and only if                                      Equivalence
6      just in case                                        Bi-conditional
7      not both                                            Alternative Denial
8      neither-nor                                         Joint Denial
9      not / it is false that / it is not the case that    Negation
10     is                                                  Equality
Prior to the evaluation process, the teacher has to perform the following pre-processing tasks to prepare the system to evaluate the learners’ responses. This comprises typing in the model response, identifying the model phrase within the complete response, identifying the keywords and the post- and pre-expressions for each keyword, and categorizing the words of the post- and pre-expressions into their rightful sense-conveying types.
The model answer is the answer prepared by the human evaluator and presents the
benchmark against which the learners' response would be evaluated. This answer
consists of a central part, which we call the model phrase, MP, and represents the core
of the answer.
The steps are listed below in the order of their occurrence:
Step 1: The model answer MA is created.
Step 2: A model phrase MP is identified within the model answer MA.
Step 3: Keywords are identified and listed.
Step 4: All KW are marked with their associated part of speech.
Step 5: For each KW:
Step 5.1: Synonyms having same POS usage are listed.
Step 6: Weights are associated to each KW depending on importance and
relevance. Sum of all weights to be equal to 1.
Step 7: For every KW the pre-expression and the post-expression are extracted
and words/phrases put in their respective sense brackets, i.e., logic, certainty,
count or part-of.
Once the pre-processing is done, the system is ready to read in the learners’ response
and evaluate it. The aim is to evaluate the response and return a score in the range of 0
to 1. The algorithm for performing the same is as follows:
Algorithm Eval_Response:
Evaluates the learners' text based response.
Variables:
SA (Learner's Response), MA(Model Answer), MP (Model Phrase), KW (Keyword), PrE(Pre-
expression), PoE (Post-expression), KW_S(Score from a keyword), KW_Weight (Weight of a
keyword), PrE_S(Score of a pre-expression), PoE_S(Score of a post-expression), Marks(Total
marks).
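The body of Eval_Response is not reproduced above; the following heavily hedged sketch shows one scoring scheme consistent with the stated design (keyword weights summing to 1 and a score in the range [0, 1]). Every formula and helper below is an assumption, not the authors' algorithm.

def eval_response(sa_triples, model, weights, synonyms):
    # sa_triples: output of split_on_keywords() on the learner response SA.
    # model: dict keyword -> (set of pre-expression words, set of post-expression words)
    #        extracted from the model phrase MP.
    # weights: dict keyword -> KW_Weight (summing to 1); synonyms: dict keyword -> set.
    marks = 0.0
    found = {kw: (pre, post) for pre, kw, post in sa_triples}
    for kw, (m_pre, m_post) in model.items():
        match = found.get(kw)
        if match is None:                                  # accept a listed synonym instead
            for syn in synonyms.get(kw, ()):
                if syn in found:
                    match = found[syn]
                    break
        if match is None:
            continue
        pre_s = overlap(match[0], m_pre)                   # crude stand-in for the
        post_s = overlap(match[1], m_post)                 # logic/certainty/count/part-of checks
        marks += weights[kw] * (pre_s + post_s) / 2.0      # keyword score scaled by its weight
    return marks

def overlap(found_expr, model_words):
    # fraction of the model expression's words present in the learner's expression
    if not model_words:
        return 1.0
    return len(set(found_expr.split()) & set(model_words)) / float(len(set(model_words)))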
The methodology discussed in the previous sections was employed, and its correctness was tested against a human evaluator. While performing the tests, we considered single-sentence responses only. The human evaluators were kept unaware of the method employed by the automated system; however, since the automatic evaluation returns fractional values between 0 and 1, the human evaluators were
asked to score up to 2 decimal places. Two such tests and their details are presented
here along with some findings on the results.
4.1 Set 1
The question was presented to 39 learners, and the responses were evaluated by the automated system and also by human evaluators, based on the model response specified and shown in Table 5. The correlation coefficient between the two evaluators was calculated and found to be 0.6324, with 30% of the cases differing by not more than 10%.
4.2 Set 2
Question: What is the advantage of representing data in AVL search tree than to
represent data in binary search tree?
Model Answer: In AVL search tree, the time required for performing operations like
searching or traversing is short. e.g. worst case complexity for searching in BST
(O(n)) worst case complexity for searching in AVL search tree (log(n)).
Model Phrase: In AVL search tree, the time required for performing operations like
searching or traversing is short.
This question was asked to a group of 50 learners and the responses were similarly evaluated with the model response as in Table 6. The results returned a correlation coefficient of 0.6919, with 68% of instances showing not more than a 10% difference between the marks allotted by the system and the human evaluator.
It is observed that the performance of the system tends to improve as the volume of the response increases. However, the increase in the volume of the response in this case also meant an increase in the number of keywords. As a matter of fact, both tests were conducted on keyword-heavy samples. Whether the performance would change with heavier pre- and post-expressions is yet to be explored.
[Table 6 (fragment): keyword weights from the model response, e.g. "tree" 0.005 and "(log(n))" 0.7.]
5 Conclusion
The work is aimed at developing a novel system capable of evaluating the free text responses of learners. Unlike the widely followed bag-of-words approach, the work presented here takes positional expressions, keywords and even stop words into consideration during evaluation. The proposed method generates a fuzzy score taking all the mentioned criteria into consideration. The score generated by the system does not deviate much from that of the human evaluator, and further tests may produce still better results.
Representative Based Document Clustering
1 Introduction
Document clustering is an unsupervised document organization method that puts documents into different groups, called clusters, where the documents in each cluster share some common properties according to a defined similarity measure. In most document clustering models, similarity between documents is measured on the basis of matching single words rather than phrases. The motivation of this paper is to bring the effectiveness of phrase-based matching to document clustering. The work related to phrase-based document clustering reported in the literature is limited. Zamir et al. [3][4] proposed an incremental linear-time algorithm called Suffix Tree Clustering (STC), which creates clusters based on phrases shared between documents. They claim to achieve n log(n) performance and to produce high-quality clusters. Hammouda et al. [5] proposed a document index model which implements a Document Index Graph that allows incremental construction of a phrase-based index of the document set and uses an incremental document clustering algorithm to cluster the document set.
In this paper we introduce a representative based document clustering model. The model involves three main phases: sequence segmentation, document representation, and clustering. Let D be a set of documents containing documents
is represented by a node in the tree. The right and left frequency fields represent the frequencies of the chunk and the reverse chunk, respectively. For example, the chunk {a b} and the reverse chunk {b a} occur twice. The reverse chunk {a c} occurs once, but the chunk {a c} does not occur. Let a node $n_k$ represent the frequencies of a chunk and a reverse chunk in S by $f_k^1$ and $f_k^2$, respectively. If $n_k$ has m children, then they are denoted by $n_{k,1}, \ldots, n_{k,m}$, having frequencies $f_{k,1}^t, \ldots, f_{k,m}^t$, respectively, where t = 1, 2.
and $1 < j \le N$. Here the Find_Trie_Node function returns the node of the TRIE that represents $w_{i,j}^t$. We will utilize these scores to predict word boundaries in our segmentation algorithm described in Section 4. In the following we derive the frequency information of chunks in the sequence to perform segmentation.
We make an effort to determine the boundary between two consecutive chunks
by comparing frequencies of the chunks with subsequences that straddle the
common boundary. We explain the situation using the following example. Fig.
2 shows a sequence of ten alphabets W O R D 1 W O R D 2 that represents
two consecutive words WORD1 and WORD2. The goal is to determine whether
there should be a word boundary between "1" and "W". Let us assume that at an instance a window of length 10 contains the words and it considers a hypothetical boundary in the middle, that is, between WORD1 and WORD2. To check whether the boundary is a potential one, we count the number of times the two chunks (the words WORD1 and WORD2) are more frequent than subsequences inside the window of the same length that straddle the boundary. Fig. 2 shows four possible straddling subsequences of length five. For example, the frequency of one of the straddling subsequences, 1WORD, should be low compared to WORD1 or WORD2, since WORD2 has little chance of occurring just after WORD1. We maintain a score that is incremented by one only if WORD1 or WORD2 is more frequent than 1WORD. So, if both chunks are higher in frequency than a straddling subsequence, the score of the hypothetical boundary location is incremented twice. In our algorithm we consider a window of length 2n, where n varies from 2 to the maximum length W. A window of length 2n has its hypothetical boundary separating the alphabets at the nth and (n+1)th positions; hence the number of straddling subsequences across the boundary is (n − 1). Therefore, by comparing all (n − 1) straddling subsequences with both chunks inside the right and left windows, the boundary location receives a total of 2(n − 1) scores. To normalize the scores of different window lengths, we average the scores by dividing by 2(n − 1). The final score at each location is then the sum of the average scores contributed by the chunks inside the windows with n ranging from 2 to W. The final score at each location of the sequence, which we call the frequency score, measures the potential of the location to be a word boundary and is utilized by the segmentation algorithm in Section 4. Let the frequency score at the jth position in S using windows of length 2n ($2 \le n \le W$) be denoted by $F_j$. If a straddling subsequence starts inside the window at the kth position in S, then $F_j$ can be written as
$F_j = \sum_{n=2}^{W} \frac{1}{2(n-1)} \sum_{k=j-n+2}^{j} \Big( \delta_{Freq(Find\_Trie\_Node(w^1_{k,k+n-1})) < Freq(Find\_Trie\_Node(w^1_{j-n+1,j}))} + \delta_{Freq(Find\_Trie\_Node(w^1_{k,k+n-1})) < Freq(Find\_Trie\_Node(w^1_{j+1,j+n}))} \Big)$
where $\delta_P$ is 1 if the predicate P holds and 0 otherwise. If the window length is increased well beyond the actual word lengths, each half of the window
would contain many consecutive actual words. In that case, like all straddling
subsequences across the boundary, the consecutive words in the left and right
windows would occur mostly once in the sequence. So they would contribute
almost nothing to the boundary score. As the window length is increased beyond a certain length, the contribution becomes mostly zero, which effectively removes the window-size parameter. In the following section we explain the segmentation algorithm that segments an input sequence using boundary entropy and frequency scores.
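A sketch of the frequency score described above (not the authors' implementation): n-gram counts stand in for the TRIE lookups, and the toy sequence and W are assumptions; the boundary-entropy score used alongside it is not shown.

from collections import Counter

def frequency_scores(S, W=4):
    # counts of all subsequences of length 2..W
    freq = Counter(S[a:a + n] for n in range(2, W + 1)
                   for a in range(len(S) - n + 1))
    F = [0.0] * (len(S) + 1)
    for j in range(1, len(S)):                 # candidate boundary between S[j-1] and S[j]
        for n in range(2, W + 1):
            left = S[j - n:j]                  # chunk just left of the boundary
            right = S[j:j + n]                 # chunk just right of the boundary
            if len(left) < n or len(right) < n:
                continue
            score = 0
            for k in range(1, n):              # the (n - 1) straddling subsequences
                straddle = S[j - n + k:j + k]
                score += (freq[straddle] < freq[left]) + (freq[straddle] < freq[right])
            F[j] += score / (2.0 * (n - 1))    # normalise by 2(n - 1)
    return F

S = "thecatandthedogandthecat"
scores = frequency_scores(S, W=4)
print(max(range(1, len(S)), key=lambda j: scores[j]))   # position with the highest score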
A Toy Example
Input(set of documents formed from the input of section 3): Miller was close
to the mark when he; compared bits with segments. But when he; compared
segments with pages he was not; close to the mark. his being close to; the mark
means that he is very close to; the goal. segments may be identified by an;
information theoretic signatures. page may; be identified by storage properties.
Bit; may be identified by its image.;
Representatives found after chunking by CSEF:
close; to; the; mark; when; he; segments; page; may; beid;
Given K=3, Output of K-means:
Cluster1: compared bits with segments. But when he; compared segments with
pages he was not; Cluster 2: the goal. segments may be identified by an; be
identified by storage properties. bit; may be identified by its image; information
theoretic signatures. page may;Cluster 3: Miller was close to the mark when he;
close to the mark. his being close to; the mark means that he is very close to;
In the following experimental section we present the performance of our clustering method on some large datasets.
5 Experiments
In this section, we present an empirical evaluation of our representative based
document clustering method in comparison with some other existing algorithms
on a number of benchmark data sets [11][10]. The results of only four document datasets are recorded, as for most of the other datasets we obtained almost similar results.
The first five columns of Table 1 summarize the basic properties of the data sets.
To compare the performances of CSEF with VE, F1 score[13] is used to determine
quality of segmentation from both methods. The F1 score is the harmonic mean
of precision and recall and reaches its best value at 1 and worst score at 0. In
the 6th and 7th column of Table 1 F1 scores of VE and CSEF are recorded,
respectively. The results show that CSEF achieves better performance on all the
four data sets.
The datasets considered in the experiment have their class labels known to
the evaluation process. To measure the accuracy of class structure recovery by a
clustering algorithm we use a popular external validity measure called Normalized Mutual Information (NMI) [12]. The value of NMI equals 1 if two clusterings
are identical and is close to 0 if one is random with respect to the other. Thus
larger values of NMI indicate better clustering performance. We have chosen K-means and Cluto, by Ying Zhao et al. [9], as our document clustering algorithms. In Table 2, CSEF-Cluto and CSEF-Kmeans denote our representative based clustering model with Cluto and K-means, respectively. In most of the cases the number of representatives chosen by CSEF is half the number of words in the document set. We compare our results with four other document clustering algorithms, namely Clustering via Local Regression (CLOR) [6], Spectral clustering with normalized cut (NCUT) [7], and the Local learning based Clustering Algorithm (LLCA1) and its variant (LLCA2) [8]. The NMI performances of these four algorithms are taken from the paper "Clustering Via Local Regression" by Jun Sun et al. [6] and were verified. The output of these algorithms depends on a parameter k, referred to as the neighbourhood size. The outcomes, which they provide as NMI/k graphs, are recorded numerically in Table 2. In Table 2, max and avg denote the maximum and average NMI values attained in the range k = 5 to 120, respectively. For the re0 data, only the best result is recorded, which occurred at k = 30.
Table 2 shows that K-means does not give satisfactory results when applied alone, but with CSEF the quality of the results improves. In practice, Cluto generally produces very good results compared to other algorithms, but it performs even better when combined with CSEF, which demonstrates the effectiveness of the representative based similarity approach to document clustering. Here we see that CSEF-Cluto outperforms the other algorithms for most of the datasets.
References
1. Cohen, P., Adams, N., Heeringa, B.: Voting experts: An unsupervised algorithm
for segmenting sequences. Journal of Intelligent Data Analysis (2006)
2. Hewlett, D., Cohen, P.: Bootstrap Voting Experts. In: IJCAI, pp. 1071–1076 (2009)
3. Zamir, O., Etzioni, O.: Web Document Clustering: A Feasibility Demonstration.
In: Proc. 21st Ann. Int’l ACM SIGIR Conf., pp. 45–54 (1998)
4. Zamir, O., Etzioni, O.: Grouper: A Dynamic Clustering Interface to Web Search
Results. Computer Networks 31(11-16), 1361–1374 (1999)
5. Hammouda, K., Kamel, M.: Efficient Phrase-Based Document Indexing for Web
Document Clustering. IEEE Trans. Knowl. Data Eng. 16(10), 1279–1296 (2004)
6. Sun, J., Shen, Z., Li, H., Shen, Y.: Clustering Via Local Regression. In: Daelemans,
W., Goethals, B., Morik, K. (eds.) ECML PKDD 2008, Part II. LNCS (LNAI),
vol. 5212, pp. 456–471. Springer, Heidelberg (2008)
7. Shi, J., Malik, J.: Normalized Cuts and Image Segmentation. IEEE Transactions
on Pattern Analysis and Machine Intelligence 22(8), 888–905 (2000)
8. Wu, M., Scholkopf, B.: A local learning Approach for Clustering. In: Advances in
Neural Information Processing Systems, vol. 19 (2006)
9. Zhao, Y., Karypis, G.: Empirical and Theoretical Comparisons of Selected Criterion
Functions for Document Clustering. Machine Learning 55, 311–331 (2004)
10. Lewis, D.D.: Reuters-21578 text categorization test collection,
http://www.daviddlewis.com/resources/testcollections/reuters21578
11. TREC: Text REtrieval Conference, http://trec.nist.gov
12. Strehl, A., Ghosh, J.: Cluster Ensembles - A Knowledge Reuse Framework for
Combining Multiple Partitions. Journal of Machine Learning Research 3, 583–617
(2002)
13. Van Rijsbergen, C.J.: Information Retrieval, 2nd edn. Dept. of Computer Science,
University of Glasgow (1979)
A New Parallel Thinning Algorithm with Stroke
Correction for Odia Characters
Abstract. Several thinning algorithms have been reported in the literature in the last few decades. Odia has a structurally different script from those of other Indian languages. In this paper, some major thinning algorithms are examined to study their suitability for skeletonizing the Odia character set. It is shown that these algorithms exhibit some deficiencies, and vital features of the characters are not retained in the process. A new parallel thinning technique is proposed that preserves important features of the script. Interestingly, the new algorithm exhibits stroke preservation, which is a higher-level requirement of thinning algorithms. The present work also discusses a concept of stroke correction, where a basic stroke is learnt from the original image and embedded on the skeleton.
1 Introduction
is better than that of ZS algorithm for diagonal stroke at top left portion of dhYa,
(Figure 5e). The CWSI87 [2] algorithm proposed by Chin et al. [2] uses a 4×4 window and has one restoring mask in addition to the thinning masks. A pixel is deletable if it matches any thinning mask and does not match the restoring mask. This algorithm retains circular shapes but cannot remove contour noise points for Odia
characters. For the character, Nka (ଙ୍କ), we notice (Figure 8a) that the small circular
shape at top-right corner is preserved. Readers may compare the output with KGK
algorithm for the same character where the circular shape is more of a rectangular
shape (Figure 5a). But on the other hand CWSI algorithm is unable to retain straight
line strokes. CWSI algorithm suffers from many other shortcomings such as contour
noise, stroke-end bifurcation, retention of vertical lines and joint distortion (Figure
8b). H89 [6] algorithm preserves the horizontal, vertical lines properly. But it fails to
remove noise points and to retain curvilinear strokes. Circular strokes, predominant in
Odia script, are not retained. In GH92 [6] curves and lines are maintained, but nearby junction points are merged. It does not guarantee a unit-width skeleton (Figure 6d).
CCS95 [3] algorithm uses 5 different thinning masks and 10 restoring masks. In this
method, the skeleton is not guaranteed to be connected for DhYa (ଧ୍ୟ) and NTha (ଣ୍ଠ)
(Figures 5a and 4a). BM99 [1] uses two different types of thinning masks and one restoring mask. It is unable to produce a skeleton of unit width and cannot retain vertical strokes with joints (Figure 6a). Recently, a robust algorithm [10] was proposed which performs better than the ZS and KGK algorithms. However, we notice that its performance is not satisfactory for Odia characters. For example, small line segments in the character get deleted (Figure 6a), and a small diagonal stroke at the bottom-left corner gets smoothed out. There is extra noise along the contour, as seen in Figure 13b. For the characters ma and sa, small top and bottom portions of the straight line are either deleted or curved.
In this section, a new thinning technique is proposed which blends the strengths of the aforementioned thinning algorithms. It is observed that some of the existing algorithms emphasize topology preservation, some connectivity, and some a single-pixel-width skeleton. Many of the existing algorithms have advantages that can be blended together to generate a new thinning method, but when considered in isolation, none of them yields satisfactory results for the Odia script. The steps are sequenced carefully to ensure that algorithms which are able to delete contour noise are used first; similarly, those which can restore slant lines are used later. The proposed method has the following major components. In its first k iterations, it uses the two subiterations of the ZS algorithm. The two subiterations are used only for a fixed number of steps, after which the method switches over to applying the thinning and restoration masks of the CCS95 algorithm. This step terminates when there is no deletable pixel. After completion of this step, the algorithm applies the post-processing step of the KGK algorithm. The pseudo code of the algorithm is given below.
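A sketch of this sequencing (not the authors' pseudo code): the ZS sub-iterations follow the standard Zhang–Suen deletion conditions, while ccs95_pass and kgk_postprocess are assumed placeholders (no-op stand-ins here) for the CCS95 mask step and the KGK post-processing.

import numpy as np

def zs_subiteration(img, step):
    # One Zhang-Suen sub-iteration on a binary (0/1) image; returns (image, changed).
    p = np.pad(img, 1)
    P = [p[0:-2, 1:-1], p[0:-2, 2:], p[1:-1, 2:], p[2:, 2:],
         p[2:, 1:-1], p[2:, 0:-2], p[1:-1, 0:-2], p[0:-2, 0:-2]]      # p2..p9
    B = sum(P)                                                        # non-zero neighbours
    A = sum((P[i] == 0) & (P[(i + 1) % 8] == 1) for i in range(8))    # 0->1 transitions
    if step == 0:
        cond = (P[0] * P[2] * P[4] == 0) & (P[2] * P[4] * P[6] == 0)  # p2p4p6=0, p4p6p8=0
    else:
        cond = (P[0] * P[2] * P[6] == 0) & (P[0] * P[4] * P[6] == 0)  # p2p4p8=0, p2p6p8=0
    delete = (img == 1) & (B >= 2) & (B <= 6) & (A == 1) & cond
    out = img.copy()
    out[delete] = 0
    return out, bool(delete.any())

def ccs95_pass(img):
    # stand-in: the real step applies the 5 thinning and 10 restoring masks of CCS95
    return img, False

def kgk_postprocess(img):
    # stand-in: the real step applies the KGK post-processing
    return img

def proposed_thinning(img, k=1):
    for _ in range(k):                        # first k iterations: two ZS sub-iterations
        for step in (0, 1):
            img, _ = zs_subiteration(img, step)
    while True:                               # then CCS95 masks until no deletable pixel
        img, changed = ccs95_pass(img)
        if not changed:
            break
    return kgk_postprocess(img)               # finally the KGK post-processing

img = np.zeros((20, 20), dtype=np.uint8); img[5:15, 5:15] = 1   # tiny synthetic test pattern
print(proposed_thinning(img, k=1).sum())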
4 Stroke Corrections
Our experimental study shows that the new algorithm preserves the topology, connectedness, shape, strokes and other desirable features of Odia characters. The algorithm is able to skeletonize some important strokes with near-perfect accuracy. For instance, vertical lines can be preserved, but over-erosion at the joints cannot be fully avoided. We therefore propose the novel idea of a stroke correction stage. If the thinning algorithm retains the shape of the stroke to an unambiguous state, then the stroke can be recognized by a separate process and replaced by a perfect stroke. For instance, if the thinning algorithm returns a vertical line eroded at junction points, we introduce a separate method to recognize the straight line and then replace the eroded line with a perfect line.
5 Experimental Results
and hence k = 1. Comparisons of the proposed method with the CCS, CWSI, KGK, ZS and ZW algorithms for five typical characters, namely sa (ସ), E (ଏ), NTha (ଣ୍ଠ), Nka (ଙ୍କ), and dhYa (ଧ୍ୟ), are reported in Figures 3–7. Most of the characters chosen for illustration contain typical strokes and are compound characters. These algorithms perform better than the others for other scripts. For the sake of illustration, Figure 6 gives the output of the proposed algorithm and of BM99, H89, Robust, GH92 and LW for the characters ma (ମ) and aw (ଅ). It is evident from Figures 4–5 that the end points and junction points are either shrunk or deleted by CWSI, KGK and ZW. The CCS algorithm does not preserve connectivity, and the ZS algorithm yields a 2-pixel-width skeleton for slanting lines.
Fig. 2. Skeleton of sa (ସ) by thinning algorithms (a) CCS (b) CWSI (c) KGK (d) ZS (e) ZW (f)
LW (g) H89 (h) GH92 (i) BM99 (j) robust and (k) proposed algorithm. The diagonal line,
vertical line and junction points have been restored distinctly by the proposed method.
Fig. 3. Skeleton of e (ଏ) by algorithms (a) CCS (b) CWSI (c) KGK (d) ZS (e) ZW and (f)
proposed algorithm
Fig. 4-5. Skeleton of NTha (ଣ୍ଠ) and Nka (ଙ୍କ) by thinning algorithms (a) CCS (b) CWSI (c)
KGK (d) ZS (e) ZW and (f) proposed algorithm. Note that circular shapes, connectedness and
short linear strokes are retained correctly by the proposed algorithm
Fig. 6. Result of algorithms (a) CCS (b) CWSI (c) KGK (d) ZS (e) ZW (f) proposed algorithm
Fig. 7-8. Result of algorithms (a) BM99 (b) H89 (c) Robust (d) GH92 (e) LW and (f) proposed
algorithm. Note that many outputs are 2-pixel width skeletons.
From our extensive experimentation, we observe that CCS, CWSI and ZW are not
able to remove contour noise. BM99, LW and GH92 yield skeletons of 2-pixel width.
H89 tends to remove tips of linear strokes. Robust algorithm often fails to preserve
shape and topology. CWSI does not preserve vertical or diagonal lines and is sensitive to contour noise points. CCS overcomes these shortcomings but does not retain connectedness. KGK often yields a 2-pixel-width skeleton. LW does not preserve junction or end points. The Robust algorithm is also sensitive to contour noise.
6 Conclusions
The present work is concerned with investigating existing thinning algorithms for
their suitability to a specific script, namely Odia script. It is observed that none of the
existing algorithms is fully suitable; each has some shortcoming or other. By combining steps of different algorithms, a new algorithm is proposed that not only preserves desirable features such as shape, topology and connectivity but also retains basic strokes. This is very useful for the subsequent recognition process. A new concept of stroke correction is also introduced here. We propose to integrate this with our recognizer in the future and to extend the concept of stroke correction to many other basic strokes.
References
1. Bernard, T.M., Manzanera, A.: Improved low complexity fully parallel thinning algorithm.
In: ICIAP (1999)
2. Chin, R.T., Wan, H.K., Stover, D.L., Iversion, R.D.: A one-pass thinning algorithm and its
parallel implementation. CVGIP 40, 30–40 (1987)
3. Choy, S.S.O., Choy, C.S.T., Siu, W.C.: New single-pass algorithm for parallel thinning.
CVIU 62, 69–77 (1995)
4. Couprie, M.: Note on fifteen 2-D parallel thinning algorithm. In: IGM 2001, pp. 1–21
(2006)
5. Datta, A., Parui, S.K.: A robust parallel thinning algorithm for binary images. Pattern
Recognition 27, 1181–1192 (1994)
6. Guo, Z., Hall, R.W.: Fast fully parallel thinning algorithm. CVGIP 55, 317–328 (1992)
7. Hall, R.W.: Fast parallel thinning algorithm: parallel speed and connectivity preservation.
CACM 32, 124–131 (1989)
8. Kong, T.Y., Rosenfeld, A.: Digital topology: introduction and survey. CVGIP 48, 357–393
(1989)
9. Kwon, J.S., Gi, J.W., Kang, E.K.: An enhanced thinning algorithm using parallel pro-
cessing. In: ICIP, vol. 3, pp. 752–755 (2001)
10. Lam, L., Suen, C.Y.: An evaluation of parallel thinning algorithm for character recogni-
tion. IEEE Tr. PAMI 17, 914–919 (1995)
11. Negi, A., Bhagvati, C., Krishna, B.: An OCR system for Telugu. In: ICDAR 2001, pp.
1110–1114 (2001)
12. Pal, U., Chaudhuri, B.B.: Indian script character recognition-A survey. Pattern Recogni-
tion 37, 1887–1899 (2004)
13. Pujari, A.K., Naidu, C.D., Jinga, B.C.: An adaptive and intelligent character recognizer for
Telugu scripts using multiresolution analysis and associative measures. In: ICVGIP (2002)
14. Lu, H.E., Wang, P.S.P.: An improved fast parallel algorithm for thinning digital pattern.
In: IEEE CVPR 1985, pp. 364–367 (1985)
15. Tarabek, P.: A robust parallel thinning algorithm for pattern recognition. In: SACI, pp. 75–
79 (2012)
16. Wang, P.S.P., Hui, L.W., Fleming, T.: Further improved fast parallel thinning algorithm
for digital patterns. In: Computer Vision, Image Processing and Communications Systems
and Appln., pp. 37–40 (1986)
17. Zhang, T.Y., Suen, C.Y.: A fast parallel algorithm for thinning digital patterns. CACM 27,
236–239 (1984)
18. Zhang, Y.Y., Wang, P.S.P.: A modified parallel thinning algorithm. In: CPR, pp. 1023–
1025 (1988)
Evaluation of Collaborative Filtering Based on Tagging
with Diffusion Similarity Using Gradual Decay Approach
Abstract. The growth of the Internet has made it difficult to extract useful
information from all the available online information. The great amount of data
necessitates mechanisms for efficient information filtering. One of the
techniques used for dealing with this problem is collaborative filtering (CF).
However, despite the enormous success of CF, tagging accuracy, cold-start users
and sparsity remain major challenges as the number of users in CF increases.
Users' interests and preferences frequently drift with time. In this paper, we
address collaborative filtering based on tagging, which tracks user
interests over time in order to make timely recommendations, with diffusion
similarity using a gradual decay approach.
1 Introduction
colour to indicate numerical values. (ii) Text clouds are a representation of word
frequency in a given text. (iii) A collocate cloud examines the usage of a particular
word instead of summarizing an entire document. Our work in this paper is an evaluation of
collaborative filtering based on tagging using a gradual decay approach. First we
have generated the results on the data sets using the gradual decay approach with recent time
stamps, and then the results of CF and CF with tagging were evaluated, which gives
accurate and efficient results.
2 Related Work
Recently tagging has grown in popularity on the web, on sites that allow users to tag
bookmarks, photographs and other content [usage patterns]. A tag is freely created or
chosen by a user in social tagging systems and may not have a structure.
Fig. 1. Collaborative filtering based on tagging and tag cloud
Tagging can be categorized into two types: (i) static tags or like/dislike tags, in
which a user posts a thumbs up for like and a thumbs down for dislike; (ii) dynamic tags,
in which a user gives his opinion or comment on an item. These are also called popular
tags. Collaborative filtering based on tagging is shown in Fig. 1.
A neighbourhood set is a group of users [1] having similar tastes and preferences. In the
process of neighbourhood formation the size must be large enough to obtain accurate results.
Various similarity measures can be used for this, such as the Pearson correlation coefficient [2],
given by Eq. (1):

sim(u, v) = \frac{\sum_{i \in S}(r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in S}(r_{u,i} - \bar{r}_u)^2}\,\sqrt{\sum_{i \in S}(r_{v,i} - \bar{r}_v)^2}}     (1)

where S is the set of items rated by both users and r_{u,i} is the rating of user u on item i.
Eq. (1) is not always appropriate because it selects only the items common to both users.
To compute similarity over multiple features the Euclidean distance can be used:

d(x, y) = \sqrt{\sum_{i=1}^{N}(x_i - y_i)^2}     (2)
The other similarity measure is the cosine similarity technique in which the inverse user
frequency is applied; the similarity between users u and v is measured by the
following equation:

sim(u, v) = \cos(\vec{u}, \vec{v}) = \frac{\sum_{t \in T}(iuf_t \cdot u_t)(iuf_t \cdot v_t)}{\sqrt{\sum_{t \in T}(iuf_t \cdot u_t)^2}\,\sqrt{\sum_{t \in T}(iuf_t \cdot v_t)^2}}     (3)

Here l is the total number of users in the system and n_t the number of users
tagging with tag t; iuf_t is the inverse user frequency for a tag t: iuf_t = \log(l / n_t) [3].
Here, to obtain accuracy [4] in neighbourhood formation we have used the diffusion-based
similarity of [5]. In this, if a user has rated an item the corresponding value is set to 1,
otherwise it is set to 0; likewise, if a user has used a tag the value is set to 1, otherwise
it is set to 0. Following the mass-diffusion process of [5], the resource that an item i receives
from user v reads

r_i = \frac{a_{vi}}{k(v)}     (4)

and the diffusion-based similarity between users u and v, computed over items and over tags
respectively, is

s^{item}(u, v) = \frac{1}{k(u)} \sum_{i} \frac{a_{ui}\, a_{vi}}{k(i)}     (5)

s^{tag}(u, v) = \frac{1}{k(u)} \sum_{t} \frac{b_{ut}\, b_{vt}}{k(t)}     (6)

where a_{ui} (b_{ut}) is 1 if user u has rated item i (used tag t) and 0 otherwise, and k(.)
denotes the degree of the corresponding node.
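The following is a minimal NumPy sketch of diffusion-based (mass-diffusion) user similarity on a binary user-item matrix, in the spirit of the measure of [5]; the matrix name A and the order of normalisation are our assumptions, not the paper's implementation.

import numpy as np

def diffusion_similarity(A):
    """Diffusion-based user similarity on a binary user-item matrix A
    (A[u, i] = 1 if user u rated/tagged item i, else 0).

    Each user's resource is spread equally over the items that user selected
    and then collected back, giving S[u, v]."""
    A = np.asarray(A, dtype=float)
    k_user = A.sum(axis=1)                 # degree of each user
    k_item = A.sum(axis=0)                 # degree of each item
    k_item[k_item == 0] = 1.0              # avoid division by zero for unused items
    # S[u, v] = (1 / k(u)) * sum_i A[u, i] * A[v, i] / k(i)
    S = (A / k_item) @ A.T
    S = S / np.maximum(k_user, 1.0)[:, None]
    return S

# Tiny example: 3 users x 4 items.
A = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 0, 1, 1]])
print(diffusion_similarity(A))

The same routine can be applied to the binary user-tag matrix to obtain the tag-based similarity.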
Fig. 2 illustrates our approach in three phases: (i) the timestamp and gradual decay
approach are applied to the collected data; (ii) collaborative tagging is
implemented using diffusion similarity; (iii) results are obtained through prediction and
recommendation.
The timestamp has been applied to the collected datasets, which arranges the data in
sequential order. Here we use the timestamp to obtain recent data so that better
recommendations can be made for the user. As time passes, the importance of data
diminishes, so recent data is more important than old data. We have therefore used a
gradual decay approach that gives the least weight to old data and more weight to
recent data. The data chosen span the years 2002 to 2013 (shown in Table 1). The
details are given in Fig. 2.
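A minimal sketch of the gradual-decay idea follows, assuming an exponential decay of weight with the age of a rating; the half-life parameter and the exponential form are illustrative assumptions of ours (the text only states that older data receives less weight).

import calendar
import datetime
import numpy as np

def decay_weights(timestamps, reference_time, half_life_days=365.0):
    """Assign higher weight to recent ratings and lower weight to old ones.

    timestamps     : array of Unix timestamps (seconds) of the ratings.
    reference_time : 'now' in Unix seconds; the weight is 1.0 at this time.
    half_life_days : age at which the weight drops to 0.5 (assumed value).
    """
    age_days = (reference_time - np.asarray(timestamps, dtype=float)) / 86400.0
    return 0.5 ** (age_days / half_life_days)

# Example: ratings from 2002 and 2013 weighted at the start of 2014.
ref = calendar.timegm(datetime.datetime(2014, 1, 1).timetuple())
old = calendar.timegm(datetime.datetime(2002, 6, 1).timetuple())
new = calendar.timegm(datetime.datetime(2013, 6, 1).timetuple())
print(decay_weights([old, new], ref))   # the 2002 rating receives a much smaller weight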
4 Experiments
(i) The datasets have been rearranged using the timestamp method and the gradual
decay approach.
(ii) We have conducted experiments on collaborative filtering based on
tagging.
(iii) We have compared these approaches in terms of mean absolute error.
Fig. 2. Three-phase framework: Phase 1 - data collection from the database using the timestamp; Phase 2 - CF based on tagging with neighbourhood set generation using diffusion similarity; Phase 3 - prediction and recommendation
4.1 Datasets
For every dataset we have collected data of 5000 users who have visited at least 40
items. Every dataset has been divided into splits of 1000 users. Such a random separation
was intended for the execution of one-fold cross validation, where all the experiments
are repeated once for every split of the user data. For each dataset we have used a test set
of 30% of all users.
To evaluate the effectiveness of the recommender system, the mean absolute error
(MAE) computes the difference between the predictions generated by the RS and the ratings of
the user. The MAE is given by the following Eq. (7):

MAE = \frac{1}{N} \sum_{i=1}^{N} |p_{u,i} - r_{u,i}|     (7)

where p_{u,i} is the predicted rating, r_{u,i} the actual rating and N the number of predictions.
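A small sketch of the MAE computation of Eq. (7), assuming parallel arrays of predicted and actual ratings:

import numpy as np

def mean_absolute_error(predicted, actual):
    """MAE of Eq. (7): average absolute difference between the predictions
    generated by the recommender and the user's actual ratings."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.abs(predicted - actual).mean()

print(mean_absolute_error([3.5, 4.2, 2.0], [4, 4, 3]))   # -> approximately 0.567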
4.2 Results
In this experiment we have run the proposed collaborative tagging with diffusion
similarity using the gradual decay approach and compared its results with classical
collaborative filtering and collaborative filtering based on collaborative tagging. Here
the active users [7] are considered from 1 to 20. Based on these active users, we
have computed the MAE and prediction percentage of CF (collaborative filtering),
CFT (collaborative filtering based on tagging), and CFT-DGD (collaborative
filtering based on tagging with diffusion similarity using the gradual decay approach), and
the results show that CFT-DGD performs better than the other methods. The results
are shown for split 1 and split 2.
[Plots: percentage of predictions against active users (1-19) for CF, CFT and CFT-DGD, for split 1 and split 2.]
Table 2. Total MAE for CF, CFT and CFT-DGD

Split   MAE (CF)   MAE (CFT)   MAE (CFT-DGD)
1       0.890      0.802       0.762
2       0.902      0.792       0.742
3       0.801      0.770       0.682

Table 3. Total correct predictions for CF, CFT and CFT-DGD

Algo      Correct Pred's (%)   MAE (AVG)
CF        41.54                0.864
CFT       47.82                0.778
CFT-DGD   54.28                0.728
5 Conclusion
We have proposed a collaborative filtering framework. Our approach tracks user
interests over time in order to make timely recommendations with diffusion similarity
using a gradual decay approach. Experimental results show that our proposed scheme
can significantly improve the accuracy of predictions.
References
1. Omahony, M.P., Hurley, N.J., Silvestre, G.C.M.: An Evolution of Neighbourhood
Formation on the Performance of Collaborative Filtering. Journal of Artificial Intelligence
Review 21(3-4), 215–228 (2004)
2. Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: A
survey of the state-of-the-art and possible extensions. Journal of IEEE Transactions on
Knowledge and Data Engineering 6(17), 734–749 (2005)
3. Nam, K.H., Ji, A.T., Ha, I., Jo, G.S.: Collaborative filtering based on collaborative tagging
for enhancing the quality of recommendation. Electronic Commerce Research and
Applications 9(1), 73–83 (2010)
4. Anand, D., Bharadwaj, K.K.: Enhancing accuracy of recommender system through adaptive
similarity measures based on hybrid features. In: Nguyen, N.T., Le, M.T., Świątek, J. (eds.)
ACIIDS 2010, Part II. LNCS (LNAI), vol. 5991, pp. 1–10. Springer, Heidelberg (2010)
5. Shang, M.S., Zhang, Z.K., Zhou, T., Zhang, Y.C.: Collaborative filtering with diffusion-
based similarity on tripartite graphs. Journal of Physica A 389, 1259–1264 (2010)
6. Sarwar, B., Karypis, G., Konstan, J., Riedl, J.: Item based Collaborative Filtering
Recommendation Algorithms. In: Proceedings of the 10th International Conference on
World Wide Web, pp. 285–295 (2001)
7. Berkovsky, S., Eytani, Y., Kuflik, T., Ricci, F.: Enhancing privacy and preserving accuracy
of a distributed Collaborative Filtering. In: Proceedings of ACM Recommender Systems,
pp. 9–16 (2007)
Rule Based Schwa Deletion Algorithm for Text to Speech
Synthesis in Hindi
Abstract. This paper provides a solution to Schwa deletion while converting
graphemes into phonemes for the Hindi language. Schwa is a short neutral vowel
whose sound depends on the adjacent consonant. Schwa deletion is a significant
problem because, while writing in Hindi, every Schwa is followed by a conso-
nant, but during pronunciation not every Schwa followed by a consonant is pro-
nounced. In order to obtain a good quality TTS system, it is necessary to identify
which Schwa should be deleted and which should be retained. In this paper we
provide various rules for Schwa deletion which are presently applicable to words of
limited length; we will try to extend them further in future.
1 Introduction
Speech is the general means of human communication, except for people who
are not able to speak. A visually impaired person can communicate
with other people with the help of speech but is not able to communicate with
the computer, because the interaction between person and computer is based on writ-
ten [2] text and images. To overcome this problem, speech synthesis is needed in
natural language. Speech synthesis is the process of converting text into natural speech
by using NLP and DSP, the two main components [1], [2] of a TTS system. The NLP
part consists of tokenization, normalization and G2P conversion, and the DSP part consists of
waveform generation. Finding the correct and natural pronunciation of a word is an im-
portant task, and G2P is responsible for it. G2P accepts a word as input and converts it
into its phonetic transcription; later this transcription is used for waveform generation.
Hence it is necessary to generate the correct phonetic transcription, which is done by de-
leting the Schwa from the word when it is not pronounced. Our paper presents an algo-
rithm which resolves the Schwa problem, and the layout of the paper is as follows. Here,
we first introduce the Hindi writing system and the Schwa problem, then discuss the
terminology we have used in the algorithm. In the next sections, we elaborate the
algorithm and its results.
2 Terminology Used
This algorithm is performed on an individual word as input. Here we divide a word into
different blocks on the basis of consonants (C) and vowels (V). Before discussing the
rules we introduce some terminology related to our approach.
• Block – All consonants followed by a vowel [6] bind into one block. Here, conso-
nant ∈ (belongs to) any consonant in English or (NULL). We have made the follow-
ing algorithm for up to 4-block length only (refer to Table 1; a small segmentation sketch follows the example below).
Example from Table 1: अपना → a pa nA (3 blocks)
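The following is a minimal sketch of the block segmentation described above, operating on a romanised (WX-style) form of the word; the vowel set, the regular expression and the transliteration are illustrative assumptions, not the authors' implementation.

import re

# Vowels in the romanised (WX-like) notation used for illustration.
VOWELS = "aAiIuUeEoO"

def split_into_blocks(word):
    """Split a romanised Hindi word into C*V blocks: every maximal run of
    consonants together with the vowel that follows it forms one block."""
    pattern = re.compile(r"[^%s]*[%s]" % (VOWELS, VOWELS))
    return pattern.findall(word)

print(split_into_blocks("apanA"))        # ['a', 'pa', 'nA']  -> 3 blocks, as in the example
print(len(split_into_blocks("apanA")))   # 3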
4 Results
In this section we present the accuracy of the above algorithm with the help of a
graph. The above algorithm will not provide a correct solution for compound words like
जनसंख्या (janasanKyA). A morphological analyzer [7-9] could be used to segregate
compound words into individual base words; the algorithm is then applied to each base word
and the results are concatenated. For example:
Compound word: janasanKyA (grapheme)
Using a morphological analyzer, segregate the compound word into base words and
apply the algorithm to the individual base words.
Baseword 1: jana (grapheme)  jan (phoneme)
Fig. 1 shows the results of the rule-based algorithm for Schwa deletion applied to words of up
to 4-block length as stated above. The results were manually tested on 1700 base words
having length up to 4 blocks only, and an overall accuracy of 96.47% was obtained. Further, the
number of correct results and the accuracy for each block-size are shown in Table 2.
References
1. Wasala, A., Weerasinghe, R., Gamage, K.: Sinhala Grapheme-to-Phoneme Conversion and
Rules for Schwa Epenthesis. In: Proceedings of the COLING/ACL 2006 Main Conference
Poster Sessions, pp. 890–897 (2006)
2. Choudhury, M.: Rule-Based Grapheme to Phoneme Mapping for Hindi Speech Synthesis.
In: 90th Indian Science Congress of the International Speech Communication Association
3. Data sheet for W-X notation, http://caltslab.uohyd.ernet.in/
wx-notation-pdf/Gujarati-wx-notation.pdf
4. Narasimhan, B., Sproat, R., Kiraz, G.: Schwa-Deletion in Hindi Text-to-Speech Synthesis.
International Journal Speech Technology 7(4), 319–333 (2004)
5. Singh, P., Lehal, G.S.: A Rule Based Schwa Deletion Algorithm for Punjabi TTS System.
In: Singh, C., Singh Lehal, G., Sengupta, J., Sharma, D.V., Goyal, V. (eds.) ICISIL 2011.
CCIS, vol. 139, pp. 98–103. Springer, Heidelberg (2011)
6. Choudhury, M., Basu, A.: A rule based algorithm for Schwa deletion in Hindi. In: Interna-
tional Conf. on Knowledge-Based Computer Systems, pp. 343–353 (2002)
7. Jurafsky, D., Martin, J.H.: Speech and Language Processing: An Introduction to Natural
Language Processing, Computational Linguistics, and Speech Recognition. Pearson Educa-
tion (2000)
8. Dabre, R., Amberkar, A., Bhattacharyya, P.: A way to them all: A compound word analyzer
for Marathi. In: ICON (2013)
9. Deepa, S.R., Bali, K., Ramakrishnan, A.G., Talukdar, P.P.: Automatic Generation of Com-
pound Word Lexicon for Hindi Speech Synthesis
Unsupervised Word Sense Disambiguation
for Automatic Essay Scoring
1 Introduction
With large classrooms, teachers often find it difficult to provide timely and high
quality feedback to students with text answers. With the advent of MOOCs,
providing consistent evaluations and reporting results is an even greater chal-
lenge. Automated essay scoring (AES) systems have been shown to be consistent
with human scorers and have the potential to provide consistent evaluations and
immediate feedback to students [4].
A teacher evaluates student essays based on the students' understanding of the
topic, the writing style, and grammatical and other syntactic errors. The scoring
model may vary based on the question; for example, in a science answer the
concepts may carry more weight and the grammatical errors may be less im-
portant, while in a language essay grammar, spelling and syntactical errors may
be as important as the content. Hence, for each essay, AES learns the concepts
from learning materials and the teacher's scoring model from previously scored
essays. Most AES systems today that use LSA consider the important terms as
a bag of words and cannot differentiate the meaning of the terms in the context
of the sentence. In order to improve the accuracy of AES, we incorporate un-
supervised word sense disambiguation (WSD) as part of the pre-processing. In contrast
with conventional WSD approaches, we do not just take the senses of the target
word for scoring, but measure the similarity between gloss vectors of the target
word and a context vector comprising the remaining words in the text fragment
containing the word to be disambiguated, resulting in better WSD [3].
The rest of the paper is organized as follows: we first present existing ap-
proaches to AES. Then we discuss the system architecture of the proposed AES
system that incorporates WSD. Next we train and test the system and compare
the grading accuracy of the AES system that incorporates WSD with the base
AES system. Finally, we show using weighted kappa scores that the proposed
model has a higher inter-rater agreement than the previous model.
2 Existing Systems
Automated essay scoring is a very important research area in education
and may use NLP, machine learning and statistical methods to evaluate
text answers. In this section, we discuss existing essay scoring systems and word
sense disambiguation methods.
2.2 E-Rater
The basic method of E-Rater is similar to PEG. In addition, E-Rater measures
semantic content by using a vector-space model. A document vector for the essay
to be graded is constructed and its cosine similarity is computed with all the
pre-graded essay vectors. The essay takes the score of the essay that it most closely
matches. E-Rater cannot detect humour, spelling errors or grammar. It evaluates
the content by comparing the essays under the same score.
created for each ungraded essay, with cell values based on the terms (rows) from
the original matrix. The average similarity scores from a predetermined number
of sources that are most similar to this are used to score the essay.
In our previous work, we described an automatic text evaluation and scoring
tool, A-TEST, that checks for surface features such as spelling errors and word
count and also uses LSA to find the latent meaning of text [5]. In this paper, we
discuss enhancements to the existing system that incorporate word sense
disambiguation. Though LSA systems need a large number of sample documents
and have no explanatory power, they work well with AES systems [1].
Our AES system is designed to learn and grade essays automatically and
first learns important terms from the golden essays or the course materials.
Next it uses a set of pre-scored essays, which have been manually graded by human
raters, as the training set to create the scoring model. These essays
are pre-processed into a list of words or terms with stop-word removal, stemming,
lemmatizing, and tokenization.
Step 1: Preprocess the training essay set (spelling correction, stop-word removal, lemmatizing)
Step 2: Extract the sense of each word in every sentence of the essay (WSD)
Step 3: Generate the term-by-document matrix A using the output from WSD
Step 4: Decompose A into U, Σ and V (singular value decomposition); a small sketch of Steps 3-4 follows
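The following is a minimal NumPy sketch of Steps 3-4 and of scoring a new essay in the reduced space; the toy sense-tagged documents, the raw-count weighting and the choice of k are illustrative assumptions, not the system's actual configuration.

import numpy as np

# Toy sense-tagged documents (output of Steps 1-2): each token is word#sense.
docs = ["bank#finance loan#1 money#1",
        "bank#finance money#1 interest#1",
        "bank#river water#1 fish#1"]

# Step 3: term-by-document matrix A (raw counts).
vocab = sorted({t for d in docs for t in d.split()})
A = np.array([[d.split().count(t) for d in docs] for t in vocab], dtype=float)

# Step 4: singular value decomposition A = U * diag(s) * Vt,
# truncated to k dimensions for the reduced latent space.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

def project(tokens):
    """Fold a pre-processed, sense-tagged essay into the k-dimensional space."""
    q = np.array([tokens.count(t) for t in vocab], dtype=float)
    return (q @ Uk) / sk

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

# Compare a new essay with the training documents in the latent space.
essay_vec = project("bank#finance money#1 loan#1".split())
sims = [cosine(essay_vec, Vtk[:, j]) for j in range(len(docs))]
print(sims)   # similarity is typically highest for the finance documents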
The output of the WSD process (refer to Fig. 1) is used by the next phase,
LSA, to determine the correct sense of each word used in each sentence of the essays [1]. A
sense-by-document matrix representing the golden essays is created, followed
by LSA for dimensionality reduction. The document vectors of the pre-scored
training essays are created after the pre-processing and WSD steps and then
compared with the reduced matrix to find the similarity score of the best match.
A scoring model is derived using the similarity score, the spelling errors and the
word count of the essay.
We used the Kaggle dataset to test the new AES with WSD and compared the
results to the AES without WSD processing. We tested our model using 304
essays from a dataset of 1400 pre-scored essays, while the remaining essays were used to
train the AES. There were two human rater scores for each essay, ranging
from 1 to 6, where 6 was the highest grade.
The scores from the first human rater were used to learn the scoring model.
The agreement between the model scores and the human raters is shown in Fig. 2.
AES with WSD correctly classified 232 of the 304 essays in the
testing phase, while AES without WSD classified 218 out of the 304
essays. The corresponding percentages are illustrated in the figure. It is interesting to note
that the human raters only agreed 57.9% of the time.
5 Conclusion
This project has presented a new approach for automated essay scoring by using a
similarity-based word sense disambiguation method based on context
vectors. In contrast with conventional approaches, this method does not just take the senses
of the target word for scoring; it measures the similarity
between gloss vectors of the target word and a context vector comprising the
remaining words in the text fragment containing the words to be disambiguated.
This has been motivated by the belief that human beings disambiguate words
based on the whole context that contains the target words, usually under a
coherent set of meanings. Our results have shown that incorporating WSD into
the AES system improves accuracy over existing methods, as evaluated using
the kappa score.
We show that by taking the sense of the word in context to all the other
words in the sentence, we improved the inter-rater agreement of our model with
the human raters. Though our prediction model worked as well as the manually
evaluated teacher model, additional enhancements such as n-grams, grammar
specific errors, and other dimensionality methods such as PLSA can further
improve the inter-rater accuracy and are planned as further work.
The proposed system is applicable to essays with raw text. In future, the
proposed work will be extended towards grading essays containing text,
tables and mathematical equations.
Acknowledgment. This work derives direction and inspiration from the Chan-
cellor of Amrita University, Sri Mata Amritanandamayi Devi. We thank Dr.M
References
1. Valenti, S., Neri, F., Cucchiarelli, A.: An overview of current research on automated
essay grading. Journal of Information Technology Education 2, 319–330 (2003)
2. Sinha, R., Mihalcea, R.: Unsupervised Graph-based Word Sense Disambiguation
Using Measures of Word Semantic Similarity. In: IEEE International Conference on
Semantic Computing (2007)
3. Abdalgader, K., Skabar, A.: Unsupervised similarity-based word sense disambigua-
tion using context vectors and sentential word importance. ACM Trans. Speech
Lang. Process. 9(1) (2012)
4. Kakkonen, T., Myller, N., Sutinen, E., Timonen, J.: Comparison of Dimension Re-
duction Methods for Automated Essay Grading. Educational Technology and Soci-
ety 11(3), 275–288 (2008)
5. Nedungadi, P., Jyothi, L., Raman: Considering Misconceptions in Automatic Essay
Scoring with A-TEST - Amrita Test Evaluation & Scoring Tool. In: Fifth Inter-
national Conference on e-Infrastructure and e-Services for Developing Countries
(2013)
Disjoint Tree Based Clustering and Merging
for Brain Tumor Extraction
1 Introduction
Clustering brings items together as closely as possible on the basis of similarity,
forming a group. The formed group, called a cluster, has low variance with-
in the cluster and high variance between clusters, which gives the intuition that
within a group the items are more similar in a certain sense than items
in different clusters. Clustering is used in many application fields,
including machine learning, pattern recognition, image analysis, information retrieval,
and bioinformatics. Since there is no prescribed notion of a cluster [1],
numerous clustering algorithms exist, but every algorithm has the single
motive of grouping the data points. Some of the mainly used clustering models [2] are
the connectivity model, density model, centroid-based model and graph-based models.
The literature shows that existing clustering algorithms for images cluster
out regions based either on some predefined threshold value, like Otsu-threshold-based
clustering [4], which results in clusters that depend on a specific threshold value,
or on randomization. Some of
the randomization algorithms like K-means and K-centroid [3] give significant results
with certain limitations, such as the initial selection of the value of k and the selection of the
random centroid point for each of the k clusters. These clustering approaches are known as
exclusive clustering, as a pixel which belongs to a definite cluster is not included in
any other cluster.
Medical image computing is one such field where experts from the domains of
medicine, computer science, data science, and mathematics
contribute collectively [6]. The main focus of the experts is to extract the relevant
information present in the medical image for better analysis of the subject
matter. Various imaging modalities are available for studying these abnormalities,
like Computed Tomography (CT), Magnetic Resonance (MR), Positron Emission
Tomography (PET) and Magnetic Resonance Spectroscopy (MRS) [9]. Among all
imaging modalities, MR imaging is widely used for the diagnosis of tumor patients
as it provides good contrast over the soft tissues of the body without the use
of radiation, while other techniques like CT scans and PET use radiation to identify
the problems, which is quite harmful to the body. These MR images are ana-
lyzed by the radiologist to manually segment the abnormal regions.
The rest of this paper is organized as follows: Section 2 discusses the pre-
vious literature. Section 3 describes the proposed clustering approach for
tumor extraction. Section 4 gives the results, followed by the conclusion in Section 5.
2 Related Work
Segmentation of the abnormality from MR imaging modalities is a huge concern
for the radiologist. In the past, various distinguished methods have
been proposed based on different approaches like edge detection [10], Otsu threshold
[7] and fuzzy logic [5, 8] for fast and effective segmentation. The literature suggests
that all the previously applied methods basically depend on the initial
selection of some threshold value which divides the image into sub-regions based on
that threshold value.
Edge-based approaches like a modified Ant Colony Optimization (ACO) algo-
rithm using a probabilistic approach [11] and a Canny-edge-based approach with the wa-
tershed [10] algorithm have been used for the extraction of the tumor region from brain MR
images and give satisfactory results, but due to weak gradient magnitude these algorithms
work effectively only for high-contrast images and their performance degrades for
low-contrast images. The various clustering-based segmentation approaches like k-
means and k-centroid [3] work well for the extraction of tumors, but since the algorithm
is based on the initial selection of the value of K and random selection of k centroids,
every run of the algorithm shows variation. These methods also have the limita-
tion that they work well only for convex-shaped problems.
In [7], the authors used a hybrid technique for tumor segmentation from MR images using
the Otsu threshold with the fuzzy c-means algorithm. The Otsu threshold was used to find the
homogeneous regions in an image, followed by segmentation of these homogeneous
regions using a fuzzy clustering approach. Another hybrid mechanism based upon Otsu
and morphological operations [12] was also used for clustering out the tumor region
from MR images. The morphological operation was used to cluster the tumor region based
on the selection of a seed point, which was identified using the Otsu threshold. These algo-
rithms gave significant results, but the initial selection of the threshold value for the Otsu
algorithm is still a concern. A hierarchical clustering algorithm like HSOM with FCM
[18] was used for abnormality (tumor) extraction from medical images. HSOM is basical-
ly used for mapping higher-dimensional data to a lower-dimensional space.
The key idea is to place the closely weighted nodes of the vector graph together
and to use fuzzy c-means to form the cluster using an adaptive threshold value.
Some automatic tumor segmentation algorithms [13, 14] cluster the tumor
region based on the automatic selection of the seed point, which is then used by the fuzzy
connectedness approach. Such an approach resolves the problem of random selection of the
seed point, but for the extraction of the tumor the algorithm is still dependent on a certain
threshold value. Some other clustering approaches based on symmetry analysis [15],
area-constrained clustering [16] and a probabilistic approach using a Hidden Mar-
kov Model [17] have also been tested for tumor extraction, but all these algorithms somehow
use a threshold value to segment the tumor region from the MR image.
3 Proposed Approach
For the extraction of the tumor region in an MR image, a new hierarchical clustering
approach is introduced which is based upon the concept of graph theory. The proposed
approach focuses on generating disconnected trees followed by merging of the trees
for clustering the tumor region inside an image. The algorithm is free from any
initial selection of a threshold value. The proposed clustering algorithm is given in
Algorithm 1.
After preprocessing of the input brain MR image, disjoint trees are generated based on
the concept of graph theory. The intensity values of the image are considered as vertices
of the graph. As per the definition of clustering, the main focus is to bring together
items of similar type; thus the key idea of making disjoint trees is to connect the pixels
having the same intensity value into one tree. The key concept is shown by an example in
Fig. 1. A sample matrix is shown in Fig. 1(a) and its corresponding disjoint trees are
shown in Fig. 1(b) (for simplicity only two trees are shown).
Each tree consists of items in the form of intensity values of similar type.
When the root node of a tree is initialized, the algorithm finds all nodes in the
image having the same intensity value as the root node and joins them to the tree, which results in
the generation of disjoint trees with different root values. Initially the algorithm
assumes that all these disjoint trees form clusters having the same intensity values,
as shown in Fig. 2.
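A minimal sketch of this first phase follows, under the simplifying assumption that each initial cluster simply collects the coordinates of all pixels sharing one intensity value; the tree bookkeeping (root nodes, union-find joins) described in the text is omitted here.

import numpy as np

def initial_intensity_clusters(img):
    """Phase 1 (simplified): one initial cluster (disjoint tree) per intensity value.
    Returns a dict mapping intensity -> array of (row, col) pixel coordinates."""
    img = np.asarray(img)
    clusters = {}
    for value in np.unique(img):
        clusters[int(value)] = np.argwhere(img == value)
    return clusters

# Toy 8-bit image: three intensity levels -> three initial clusters.
toy = np.array([[10, 10, 200],
                [10, 200, 200],
                [90, 90, 200]], dtype=np.uint8)
print({k: len(v) for k, v in initial_intensity_clusters(toy).items()})
# -> {10: 3, 90: 2, 200: 4}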
In this phase, the clusters generated in the above step are merged on
the basis of a minimum-distance criterion. For an 8-bit image, the pixel range for a
gray image lies between 0 and 255. Thus for an 8-bit image the maximum
possible number of generated clusters lies in the range
1 ≤ C ≤ 256     (1)
As the size of the image increases, the corresponding number of clusters also
increases. In the worst case (for our research work we use 8-bit images only) the
maximum number of clusters is 256, while in the average case the number of clusters lies
in the range shown in equation (1), which itself is a large number and also increases
complexity. These clusters are therefore merged on the basis of the minimum
distance between the trees. Before applying the distance criterion, all the generated
clusters shown in Fig. 2 are replaced by the mean value of the cluster. These
values are used for finding the minimum distance from each cluster (tree) to every
other cluster, as shown in Fig. 3.
On the basis of the minimum distance between the clusters, the corresponding clusters
are merged to form a new cluster whose value is replaced by the maximum of
the values of the clusters being merged, as shown in Fig. 4.
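A sketch of the merging phase follows, under our reading of the text: each cluster is represented by its mean intensity, the pair of clusters with the smallest distance between representative values is merged, and the merged cluster takes the maximum of the two values. The stopping criterion (a target number of clusters k) is an assumption of ours.

import numpy as np

def merge_clusters(cluster_values, k):
    """Merge intensity clusters until only k remain.

    cluster_values : representative (mean) intensity values, one per initial cluster.
    k              : desired number of final clusters.
    Returns the representative values of the k merged clusters."""
    values = sorted(float(v) for v in cluster_values)
    while len(values) > k:
        diffs = np.diff(values)                  # values are sorted, so nearest pairs are adjacent
        j = int(np.argmin(diffs))                # pair with minimum distance
        merged = max(values[j], values[j + 1])   # keep the maximum, as stated in the text
        values[j:j + 2] = [merged]
        # (In the full algorithm the pixel sets of the two trees would be united here.)
    return values

print(merge_clusters([10, 12, 90, 95, 200, 205, 250], k=4))
# -> [12.0, 95.0, 205.0, 250.0]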
The key idea behind replacing the value of the new cluster with the maximum of the values of
the joining clusters is to maximize the inter-cluster distance as much as possible while
minimizing the intra-cluster distance. Thus the formed clusters are differentiated from each
other as much as possible. Since the proposed methodology uses intensity values for tree
formation and merging, a higher intensity value gives brightness and a lower intensi-
ty value gives dullness to an image. On the basis of this key concept it was found
that the larger the difference between two clusters, the better the visibility between
the clustered regions of the image.
4 Experimental Results
4.1 Dataset Used
For this research work, 25 brain tumor MR images of different modalities were collected
from the Department of Radiology, Sawai Man Singh (SMS) Medical College Jaipur,
Rajasthan, India. All the data sources are in DICOM format, which contains various mo-
dalities for a single patient, like T1-weighted, T2-weighted, T1-FLAIR, T2-FLAIR,
post-contrast and axial-slice brain images. In the current study, segmentation of abnormal
brain regions with the probability of tumor was carried out on various types of MR
imaging slices.
Fig. 6. Comparative simulation results of k-means algorithm with proposed algorithm on the
basis of execution time for (a) T1 weighted (b) eT1 weighted (c) T2 flair post contrast brain
MR images
5 Conclusion
In this paper a new hierarchical clustering algorithm is proposed for the extraction of
the tumor region from brain MR images. The proposed algorithm is based on the concept
of graph theory, where the image is clustered by finding the maximum
possible disjoint trees in the image, followed by merging of the trees and
generating k clusters using a minimum-distance metric. The proposed algo-
rithm was tested on various values of k, and the experimental results show that it
gives significant results, capturing the maximum possible abnormal
region in one cluster at k = 4. The proposed algorithm was also compared with the standard
K-means algorithm over various runs.
Acknowledgement. We are thankful to the Sawai Man Singh (SMS) Medical College
Jaipur for providing us the original brain tumor images. We would also like to thank
Dr. Sunil Jakhar, MD, Radio Diagnosis, Department of Radiology, SMS Medical
College Jaipur for helping us verify the results.
References
1. Estivill-Castro, V.: Why so many clustering algorithms - A Position Paper. ACM SIGKDD
Explorations Newsletter 4(1), 65–75 (2000)
2. Han, J., Kamber, M.: Data Mining: concepts and techniques, 2nd edn. Morgan Kaufmann
Publishers (2006)
3. Moftah, H.M., Hassanien, A.E., Shoman, M.: 3D Brain Tumor Segmentation Scheme us-
ing K-mean Clustering and Connected Component Labeling Algorithms. In: 10th Interna-
tional Conference on Intelligent System and Design Application, pp. 320–324 (2010)
4. Zhang, J., Hu, J.: Image Segmentation Based on 2D Otsu Method with Histogram Analy-
sis. In: International Conference on Computer Science and Software Engineering, pp. 105–
108 (2008)
5. Chuang, K.S., Tzeng, H.L., Chen, S., Wu, J., Chen, T.J.: Fuzzy c-means clustering with
spatial information for image segmentation. Journal of Computerized Medical Imaging and
Graphics 30(1), 9–15 (2006)
6. [Online] http://en.wikipedia.org/wiki/Medical_image_computing
7. Nyma, A., Kang, M., Kwon, Y.K., Kim, C.H., Kim, J.M.: A Hybrid Technique for Medi-
cal Image Segmentation. Journal of Biomedicine and Biotechnology (2012)
8. Mohamed, N.A., Ahmed, M.N., Farag, A.: Modified fuzzy c-mean in medical image seg-
mentation. In: 20th Annual International Conference on Engineering in Medicine and Bi-
ology Society (1998)
9. [Online] http://en.wikipedia.org/wiki/Medical_imaging
10. Maiti, I., Chakraborty, M.: A new method for brain tumor segmentation based on wa-
tershed and edge detection algorithms in HSV colour model. In: National Conference on
Computing and Communication Systems (2012)
11. Soleimani, V., Vincheh, F.H.: Improving Ant Colony Optimization for Brain MRI Image
Segmentation and Brain Tumor Diagnosis. In: First Iranian Conference on Pattern Recog-
nition and Image Analysis (2013)
12. Vidyarthi, A., Mittal, N.: A Hybrid Model for the Extraction of Brain Tumor in MR Im-
age. In: International Conference on Signal Processing and Communication (2013)
13. Weizman, L., Sira, L.B., Joskowicz, L., Constantini, S., Precel, R., Shofty, B., Bashat,
D.B.: Automatic segmentation, internal classification, and follow-up of optic pathway gli-
omas in MRI. Journal of Medical Image Analysis 16, 177–188 (2012)
14. Harati, V., Khayati, R., Farzan, A.: Fully Automated Tumor Segmentation based on Im-
proved Fuzzy Connectedness Algorithm in Brain MR Images. Journal of Computers in Bi-
ology and Medicine 41, 483–492 (2011)
15. Khotanlou, H., Colliot, O., Atif, J., Bloch, A.: 3D brain tumor segmentation in MRI using
fuzzy classification, symmetry analysis and spatially constrained deformable models.
Journal of Fuzzy Sets and System 160, 1457–1473 (2009)
16. Niethammer, M., Zach, C.: Segmentation with area constraints. Journal of Medical Image
Analysis 17, 101–112 (2013)
17. Solomon, J., Butman, J.A., Sood, A.: Segmentation of Brain Tumors in 4D MR Images us-
ing Hidden Markov Model. Journal of Computer Methods and Programs in Biomedi-
cine 84, 76–85 (2006)
18. Logeswari, T., Karnan, M.: An Improved Implementation of Brain Tumor Detection Using
Segmentation Based on Hierarchical Self Organizing Map. International Journal of Com-
puter Theory and Engineering 2(4), 591–595 (2010)
Segmentation of Acute Brain Stroke from MRI of Brain
Image Using Power Law Transformation with Accuracy
Estimation
Abstract. Segmentation of an acute brain stroke and its position is a very important
task in the medical community. Accurate segmentation of the brain's abnormal region
by a computer-aided diagnosis (CAD) system is a very difficult and challenging task
due to its irregular shape and size and the high degree of intensity and textural similarity
between normal and abnormal regions. We developed a new method
using the power-law transformation which gives very good results both visually and
quantitatively. Our method gives a very accurate segmented
output with a very low error rate and very high accuracy.
1 Introduction
also. The symptoms experienced by the patient depend on which part of the brain
is affected.
Symptoms of stroke include numbness, weakness or paralysis, slurred speech,
blurred vision, confusion and severe headache. R.K. Kosior et al. [1] proposed an au-
tomated MR topographical score using a digital atlas to develop an objective tool for
large-scale analysis and possibly reduce inter-rater variability and slice-orientation
differences. The high apparent-diffusion-coefficient lesion contrast allows for easier
lesion segmentation, which makes automation of the MR topographical score easier. Neuro-
imaging plays a crucial role in the evaluation of patients presenting acute stroke
symptoms. While patient symptoms and clinical examinations may suggest the
diagnosis, only brain imaging studies can confirm the diagnosis and differentiate
hemorrhage from ischemia with high accuracy. C.S. Kidwell et al. [2] compared the
accuracy of MRI and CT for detection of acute intracerebral hemorrhage in patients
presenting with acute focal stroke symptoms; at the time of an unplanned interim analysis
it became apparent that MRI was detecting cases of hemorrhagic trans-
formation not detected by CT. Thus MRI is better than CT for
brain stroke detection. Chawla et al. [3] detected and classified abnormalities into
acute infarct, chronic infarct and hemorrhage at the slice level of non-contrast CT
images. A two-level classification scheme is used to detect abnormalities using fea-
tures derived in the intensity and wavelet domains. Neuroimaging in acute stroke
tures derived in the intensity and the wavelet domain. Neuroimaging in acute stroke
[4], [5], [6] is essential for establishment of an accurate diagnosis, characterization of
disease progression, and monitoring of the response to interventions. MRI has demon-
strably higher accuracy, carries fewer safety risks and provides a greater range of
information than CT.
We use the "Whole Brain Atlas" image database [7], which consists of
T1-weighted, T2-weighted and proton density (PD) MRI images, and we use part of the
dataset slice-wise. The paper is organized as follows: in Section 2 we describe our
proposed methodology; in Section 3 we discuss our results; in Section 4 we
quantify and estimate the accuracy of the method with mathematical metrics; and
finally in Section 5 we conclude the paper.
2 Proposed Methodology
An RGB image is taken as input and first converted into a gray image.
Binarization is a very effective preprocessing method for most segmentation of
MRI images. Due to the large variation between background and foreground in
MRI brain images, most binarization techniques fail; here a
binarization method developed by Roy et al. [9] has been selected, with a global
threshold value chosen as the standard deviation of the image. Global
thresholding using the standard deviation of the image pixels gives very good results and
binarizes each interesting part of the MRI image. Let I[i, j] be a gray image of size
m x n, where I(i, j) is the intensity of pixel (i, j). The total intensity of the image is then
defined by

I_{total} = \sum_{i,j} I(i, j)
The mean intensity of the image is defined as the mean of the pixel intensities within
the image:

I_{mean} = \frac{1}{mn} \sum_{i,j} I(i, j)

The standard deviation S_d of the intensity within the image, which is used as the threshold
value for the whole image, is defined by

S_d = \sqrt{\frac{1}{mn - 1} \sum_{i,j} \left( I(i, j) - I_{mean} \right)^2}

or, equivalently,

S_d = \sqrt{\frac{1}{mn - 1} \left( \sum_{i,j} I(i, j)^2 - mn\, I_{mean}^2 \right)}

Here the threshold intensity is a global value, i.e. the threshold intensity for the entire
image is unique. The binarized image B[i, j] obtained using the standard-deviation
intensity S_d for the image I[i, j] is given by:

B[i, j] = 1  if  I(i, j) > S_d
B[i, j] = 0  if  I(i, j) ≤ S_d
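A direct NumPy sketch of this global binarization follows; the comparison direction (pixel greater than S_d) and the use of the unbiased standard deviation follow our reconstruction of the rule, not a verified implementation.

import numpy as np

def binarize_by_std(gray):
    """Global binarization of a gray MRI slice using the standard deviation
    of the pixel intensities as the single (global) threshold."""
    gray = np.asarray(gray, dtype=float)
    s_d = gray.std(ddof=1)                 # standard deviation with 1/(mn - 1)
    return (gray > s_d).astype(np.uint8)   # 1 where intensity exceeds S_d, else 0

# Toy example: dark background, bright tissue.
slice_ = np.array([[  5,  10,  12],
                   [  8, 180, 200],
                   [  6, 190, 210]], dtype=float)
print(binarize_by_std(slice_))   # only the bright pixels become 1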
After complementing the binary image, a two-dimensional wavelet decomposition
[10] is performed using the 'db1' wavelet up to the second level, and the image is
recomposed using the approximation coefficients. The objective of these two
steps is to remove the detailed information from the complementary image, which
helps to remove the skull from the brain. This step separates the
skull from the brain portion, since the skull and brain may or may not be connected; it
gives the surety that the skull and brain portions are not connected and results in a
decrease in size of the complementary image to half of the original image. Moreover,
due to the reduction of size and removal of detailed information, the white pixels of the
complementary image come closer and form a complete ring. Then an interpolation
method is used to resize the image from the previous step to the original size, the image
is re-complemented, and this produces a complete separation
between brain and skull. Then labeling of the image is done using the union-find
method. Except for the maximum-area component, all other components are removed, and in this
process the skull and other artefacts are removed. Removal of the artefacts and skull improves
the detection quality and reduces the rate of false detection.
As the one-pixel structures contained in the image are discrete in nature, the quick-
hull [11], [12], [13] algorithm for convex hulls is used here to generate the original im-
age without artefacts and skull. The convex hull is computed for these one-pixels, and all pixels
inside the convex hull are set to one while those outside are set to zero. If some of the
brain pixels were set to zero during the binarization stage due to wrong evaluation,
this step ensures that the error is corrected. The obtained binarized image is then multi-
plied with the original image pixel-wise, producing the desired result, i.e. the MRI of the
brain image without artifacts and border, which is stored as r. Power-law transformations
[14] have the basic form

s = c\, r^{y}

where c and y are positive constants. Sometimes the power-law equation is written as

s = c\, (r + \varepsilon)^{y}

to account for an offset (that is, a measurable output when the input is
zero). However, offsets are typically an issue of display calibration and as a result they are
normally ignored. Using the power law, a family of possible transformations can be obtained
simply by varying the value of y; curves generated with values of y > 1 have exact-
ly the opposite effect as those generated with values of y < 1. Using this power-law
transformation we increase the visibility of the abnormal region (stroke) by taking y = 4,
and we can then easily segment the abnormal region (like "acute stroke: speaks nonsense
words" or "acute stroke: speech arrest", caused by the affected portion of the brain) by summing
the average intensity of s with the standard deviation of s over the non-black region:

Total = \frac{1}{|\Omega|} \sum_{(i,j) \in \Omega} s(i, j) + S_d(s_{\Omega})

where \Omega is the set of non-black (brain) pixels of s and S_d(s_{\Omega}) is the standard
deviation of their intensities.
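A sketch of the enhancement and thresholding steps follows, assuming the skull-free gray image r is scaled to [0, 1] before the power-law transform; the scaling and c = 1 are our assumptions, not stated in the text.

import numpy as np

def powerlaw_enhance_and_segment(r, gamma=4.0, c=1.0):
    """Apply the power-law transform s = c * r**gamma (gamma = 4 as in the text)
    to the skull-free brain image r, then threshold s at the mean plus the
    standard deviation of its non-black region to isolate the stroke candidate."""
    r = np.asarray(r, dtype=float)
    r = r / (r.max() + 1e-12)              # assumed normalisation to [0, 1]
    s = c * r ** gamma                     # power-law (gamma) transformation
    nonblack = s[s > 0]                    # pixels belonging to the brain region
    total = nonblack.mean() + nonblack.std()
    return (s > total).astype(np.uint8), s

# Toy skull-free slice: most tissue mid-gray, the "stroke" region bright.
toy = np.array([[  0, 100, 110],
                [  0, 120, 240],
                [  0, 115, 250]], dtype=float)
mask, enhanced = powerlaw_enhance_and_segment(toy)
print(mask)   # only the two brightest pixels survive the threshold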
Depending on this total threshold intensity we can find the abnormal brain-stroke region
in the MRI of the brain. The acute stroke region is the more intense part of the brain
and also differs in intensity from normal brain tissue, so the intensity of the lesion
is higher than that of other regions. We first take the average intensity value of the non-
black region and then add the standard deviation of the intensity. As the intensity
level of an acute stroke and other abnormalities is greater than that of other regions of
the brain, the total threshold intensity value can easily detect the
abnormality and other abnormalities of the brain from the MRI. For detection of the abnormality
position we need to calculate the centroid of the brain-stroke region; this is done by the
weighted mean of the pixels, and the mathematical formulation of the centroid
is given below:

C = \frac{1}{p} \sum_{n=1}^{p} i_n

where p is the total number of pixels and i_n is the individual pixel weight. After calculat-
ing the centroid we can easily find the position of the abnormal region or brain stroke relative
to the centre of the brain. The distance between the brain centre and the centroid of the
abnormal region is calculated as the centre-to-centroid distance using the X coordi-
nate and Y coordinate, named Xcood and Ycood. The distance between the brain top position
and the stroke region's top position is named TT, the distance between the brain left position
and the stroke region's left position is named LL, the distance between the brain right position
and the stroke region's right position is named RR, and the distance between the brain bottom
position and the stroke region's bottom position is named BB.
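A sketch of these position measurements follows, assuming binary masks for the whole brain and for the segmented stroke region; the centroid here is the plain mean of the stroke pixel coordinates, which matches an equal-weight reading of the formula above.

import numpy as np

def stroke_position(brain_mask, stroke_mask):
    """Compute the centroid offset (Xcood, Ycood) of the stroke from the brain
    centre and the TT, BB, LL, RR distances between the bounding extents."""
    b = np.argwhere(brain_mask > 0)     # (row, col) pixels of the brain
    s = np.argwhere(stroke_mask > 0)    # (row, col) pixels of the stroke region
    brain_centre = b.mean(axis=0)
    stroke_centroid = s.mean(axis=0)
    x_cood = stroke_centroid[1] - brain_centre[1]   # positive: right half (viewer side)
    y_cood = stroke_centroid[0] - brain_centre[0]
    tt = s[:, 0].min() - b[:, 0].min()  # brain top    to stroke top
    bb = b[:, 0].max() - s[:, 0].max()  # brain bottom to stroke bottom
    ll = s[:, 1].min() - b[:, 1].min()  # brain left   to stroke left
    rr = b[:, 1].max() - s[:, 1].max()  # brain right  to stroke right
    return x_cood, y_cood, tt, bb, ll, rr

The sign of x_cood then indicates the hemisphere of the lesion, as used for the positions reported in Table 1.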
Fig. 1(b) is the binarized output for the input in Fig. 1(a), obtained by the standard-deviation
method; this binarization is very helpful for removing the skull of the brain. Fig.
1(c) is the output of the brain without skull and artifacts, which helps to detect the brain
stroke region accurately (without false detection). Fig. 1(d) is the
output of the brain image after the power-law transformation, in which the abnormal
region is clearly visible, and Fig. 1(e) is the segmented part. Finally, Fig.
1(g) shows the position of the abnormal part according to the hemisphere. Some other
results on brain MRI images from the standard database (slice-wise), with ground truth
images, are shown below.
Fig. 2. (I1, I2, I3, I4) are the input brain MR images, (PL1, PL2, PL3, PL4) are the outputs
after using the power-law transformation and skull removal, (S1, S2, S3, S4) are the segmented
regions, (M1, M2, M3, M4) are the ground truth or reference images, and (PO1, PO2, PO3, PO4)
show the different positions of the segmented region within the brain image.
Table 1 shows the quantification of the different types of brain MR images used
in the study in terms of the centroid, the segmented area and their locations. The centroid
position is defined by the X and Y coordinates (here Xcood, Ycood). A positive X coordi-
nate means the centroid is located in the right part of the brain from the viewer's side, a negative X
means the left part of the brain from the viewer's side, and TT, BB, LL, RR are as described in the
methodology section.
Table 1. Distances between the different positions of the segmented region and the brain portion, and
area of the segmented region in pixels
First, let AV and MV denote [15] the areas of the automatically and manually seg-
mented objects, and let |x| represent the cardinality of the set of voxels x. In the follow-
ing equations, Tp = MV ∩ AV, Fp = AV − Tp and Fn = MV − Tp denote the "true
positive", "false positive" and "false negative" sets respectively. The Kappa index be-
tween two areas is calculated by the following equation:

K(AV, MV) = \frac{2\,|T_p|}{|AV| + |MV|} \times 100\%

The similarity index is sensitive to both differences in size and location. The
Jaccard index between two areas is represented as follows:

J(AV, MV) = \frac{|AV \cap MV|}{|AV \cup MV|} \times 100\%

This metric is more sensitive to differences, since both denominator and numerator
change with increasing or decreasing overlap. The correct detection ratio, or sensitivity, is
defined by the following equation:

C_d = \frac{|AV \cap MV|}{|MV|} \times 100\%

The false detection ratio (F_d) is defined in the same way, except with AV − Tp in place of
AV ∩ MV, i.e. F_d = |F_p| / |MV| \times 100\%. The relative error [8] (RE) for the stroke region is
calculated from AV, the stroke area obtained by automated segmentation, and MV, the stroke area
obtained by manual segmentation by an expert:

RE = \frac{|AV - MV|}{MV} \times 100\%
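These area-overlap metrics can be computed directly from binary masks; the following sketch assumes AV and MV are given as NumPy arrays (1 inside the segmented region, 0 outside) and follows the formulas above.

import numpy as np

def overlap_metrics(av, mv):
    """Kappa (Dice), Jaccard, correct- and false-detection ratios and the
    relative area error between an automatic mask AV and a manual mask MV."""
    av = np.asarray(av, dtype=bool)
    mv = np.asarray(mv, dtype=bool)
    tp = np.logical_and(av, mv).sum()          # true positives
    fp = av.sum() - tp                         # false positives
    kappa = 200.0 * tp / (av.sum() + mv.sum())
    jaccard = 100.0 * tp / np.logical_or(av, mv).sum()
    cd = 100.0 * tp / mv.sum()                 # correct detection ratio (sensitivity)
    fd = 100.0 * fp / mv.sum()                 # false detection ratio
    re = 100.0 * abs(int(av.sum()) - int(mv.sum())) / mv.sum()   # relative area error
    return kappa, jaccard, cd, fd, re

av = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 0]])
mv = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 0]])
print(overlap_metrics(av, mv))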
Table 2 shows, for the different types of brain MR images used in the study, the accuracy
level and the different image metrics. The results obtained in this study from Table 2 are
described hereafter.
Table 2. Values of the different metrics for accuracy estimation from the ground truth images

Image name  AV    MV    RE in %  Tp    Fp   Fn   Kappa index  Jaccard index  % of Cd  % of Fd
12_T2 972 927 4.8543 927 45 00 0.9763 0.9537 100.00 4.8544
13_PD 952 968 1.6528 932 20 36 0.9708 0.9433 96.280 2.0700
13_T2 1394 1297 7.4787 1297 97 00 0.9639 0.9304 100.00 7.48
14_PD 2582 2584 0.0773 2582 00 02 0.9996 0.9992 99.920 0.00
14_T2 2168 2222 2.4302 2000 168 222 0.9112 0.8368 90.010 7.560
16_PD 1994 1821 8.6760 1821 173 00 0.9547 0.9132 100.00 9.5003
16_T2 1903 1872 1.6559 1850 53 32 0.9801 0.9560 98.824 2.8648
17_PD 1245 1291 3.5631 1245 00 46 0.9819 0.9644 96.440 0.00
17_T2 872 815 6.9938 801 71 14 0.9496 0.9040 98.282 8.7116
18_PD 673 691 2.6049 666 07 25 0.9765 0.9542 96.380 1.010
18_T2 669 622 7.5562 609 60 13 0.9419 0.8841 97.909 9.6946
The similarity index is sensitive to both differences in size and location.
For the similarity index, differences in location are reflected more strongly than dif-
ferences in size, and Si > 70% indicates a good agreement. In our experiments the Kappa
index reaches above 90% most of the time; thus, according to the Kappa index, our
method produces very good results. The Jaccard index is more sensitive to differences, since
both denominator and numerator change with increasing or decreasing overlap, and with
our methodology it is greater than 90% most of the time, which indicates that our ex-
periment is promising. The correct detection ratio indicates the correct detection area nor-
malized by the reference area and is not sensitive to size. Therefore the correct detection
ratio alone cannot indicate the similarity and should be used with the false detection ratio
or other volume metrics. The false detection ratio shows the error of the segmentation
and indicates the volume that is not located in the true segmentation. Using this metric
together with the correct detection ratio gives a good evaluation of the segmentation. Most of
the time our methodology gives above 95% correct detection ratio and below 5%
false detection ratio, with very low relative area error (mostly below 5%).
However, the overlap measure depends on the size and the shape complexity of the
object and is related to the image sampling. Thus our methodology gives very good
results visually as well as quantifiably.
5 Conclusion
the future, to make the system understand sequences of different MRI, we may
need to capture new knowledge and tune the corresponding rules or parameters for a
more accurate CAD system.
References
1. Kosior, R.K., Lauzon, M.L., Steffenhagen, N., Kosior, J.C., Demchuk, A., Frayne, R.: At-
las-Based Topographical Scoring for Magnetic Resonance Imaging of Acute Stroke.
American Heart Association 41(3), 455–460 (2010)
2. Kidwell, C.S., Chalela, J.A., Saver, J.L., Starkman, S., Hill, M.D., Demchuk, A.M.,
Butman, J.A., Patronas, N., Alger, J.R., Latour, L.L., Luby, M.L., Baird, A.E., Leary,
M.C., Tremwel, M., Ovbiagele, B., Fredieu, A., Suzuki, S., Villablanca, J.P., Davis, S.,
Dunn, B., Todd, J.W., Ezzeddine, M.A., Haymore, J., Lynch, J.K., Davis, L., Warach, S.:
Comparison of MRI and CT for Detection of Acute Intracerebral Hemorrhage. The Journal
of the American Medical Association 292(15) (2004)
3. Chawla, M., Sharma, S., Sivaswamy, J., Kishore, L.: A method for automatic detection
and classification of stroke from brain CT images. In: Proceedings of IEEE Engineering in
Medicine and Biology Society, pp. 3581–3584 (2009)
4. Perez, N., Valdes, J., Guevara, M., Silva, A.: Spontaneous intracerebral hemorrhage image
analysis methods: A survey. Advances in Computational Vision and Medical Image Pro-
cessing Computational Methods in Applied Sciences 13, 235–251 (2009)
5. Smith, E.E., Rosand, J., Greenberg, S.M.: Imaging of Hemorrhagic Stroke. Magnetic Res-
onance Imaging Clinics of North America 14(2), 127–140 (2006)
6. Merino, J.G., Warach, S.: Imaging of acute stroke. Nature reviews. Neurology 6, 560–571
(2010)
7. Online, http://www.med.harvard.edu/AANLiB/cases/
8. Roy, S., Nag, S., Maitra, I.K., Bandyopadhyay, S.K.: A Review on Automated Brain Tu-
mor Detection and Segmentation from MRI of Brain. International Journal of Advanced
Research in Computer Science and Software Engineering 3(6), 1706–1746 (2013)
9. Roy, S., Dey, A., Chatterjee, K., Bandyopadhyay, S.K.: An Efficient Binarization Method
for MRI of Brain Image. Signal & Image Processing: An International Journal
(SIPIJ) 3(6), 35–51 (2012)
10. Barber, C.B., Dobkin, D.P., Huhdanpaa, H.: The Quickhull Algorithm for Convex Hulls.
ACM Transactions on Mathematical Software 22(4), 469–483 (1996)
11. Daubechies, I.: Ten lectures on wavelets. CBMS-NSF conference series in applied mathe-
matics. SIAM Ed. (1992)
12. Mallat, S.: A theory for multiresolution signal decomposition: the wavelet representation.
IEEE Transactions on Pattern Analysis and Machine Intelligence 11(7), 674–693 (1989)
13. Meyer, Y.: Ondelettesetopérateurs, Tome 1, Hermann Ed. (1990); English translation:
Wavelets and operators. Cambridge Univ. Press (1993)
14. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 2nd edn. (2002)
15. Khotanlou, H.: 3D brain tumors and internal brain structures segmentation in MR images.
Thesis (2008)
A Hough Transform Based Feature Extraction Algorithm
for Finger Knuckle Biometric Recognition System
1 Introduction
The finger knuckle print of a person has inherent structural patterns on the outer surface
of the finger-back region, which are found to be a highly unique and stable birth feature
(invariant throughout a lifetime). These patterns have great potential for the unique
identification of individuals, which in turn contributes to a high-precision and rapid
personal authentication system [1].
Other hand-based biometric modalities include fingerprints, palmprints, finger geometry,
hand geometry and hand vein structures. Fingerprints are vulnerable to copying of the
finger patterns left on the surface of the image acquisition device. On the other hand, the
feature information extracted from finger geometry and hand geometry has less
discriminatory power as the size of the data set grows [2]. In palm prints, the region of
interest captured for feature extraction is large, which may result in computational
overhead. The hand vein modality has the main drawback of requiring a complex capturing
system to acquire the vein patterns of the hand dorsum surface.
Unlike fingerprints, finger knuckle prints are very difficult to copy, because they are
captured by contactless devices and the patterns lie on the outer surface of the finger
knuckle. Moreover, the phalangeal joint in the finger knuckle surface creates flexion creases
that carry rich texture patterns such as lines and wrinkles, and the captured finger knuckle
print is very small compared with palm prints, which reduces the computational overhead.
In general, recognition methods for finger knuckle print matching fall into two broad
categories: geometric methods and texture-based methods [3]. In texture analysis methods,
the feature information is extracted by analyzing the spatial variations present in the image;
mathematical models are used to characterize these spatial variations in terms of feature
information. The spatial quantifiers of an image can be derived by analyzing the different
spectral values that are regularly repeated over a region of large spectral scale. Texture
analysis of a digital image thus results in a quantification of the image's texture properties.
Kumar and Ravikanth [4] were the first to explore texture analysis methods for finger
knuckle biometrics. In their work, the feature information of the captured finger knuckle
print is extracted by statistical algorithms, viz. principal component analysis, linear
discriminant analysis and independent component analysis. Zhang et al. [5] derived a
band-limited phase-only correlation (BLPOC) method for matching finger knuckle prints
based on texture analysis. Shen et al. [6] used two-dimensional Gabor filters to extract
feature information from the finger knuckle print texture and a Hamming distance metric to
measure the similarity between the reference and input finger knuckle print images.
Furthermore, Lan et al. [7] proposed a new model-based texture analysis method, the
complex locality preserving projection approach, for discriminating finger knuckle surface
images. Zhang et al. [8] used a ridgelet transform method to extract feature information
from the captured FKP images. Meraoumia et al. [9] implemented Fourier transform
functions for deriving feature information from the finger knuckle region and the palm region.
The study shows that texture methods provide a high accuracy rate when the captured
image is of high quality. However, when information is missing from the input image, or
when the image is of low quality (e.g. noisy), these methods fail. This paper addresses this
problem by proposing a new methodology for feature extraction from the finger knuckle
print based on the Elliptical Hough Transform (EHT). The feature extraction process is
robust to missing information when FKP images are captured partially, and tolerant to
noise when they are captured with low-quality sensors.
Fig. 1 illustrates the design of the proposed biometric recognition system based on the
finger knuckle print.
Fig. 1. Block diagram of the proposed biometric authentication system based on the finger
knuckle print: image acquisition, preprocessing and ROI extraction, Elliptical Hough
Transform based feature extraction, and authentication against the FBKS database
Images of the right index finger knuckle, left index finger knuckle, right middle finger
knuckle and left middle finger knuckle are given as input to the proposed model.
Initially, preprocessing and extraction of the region of interest of the finger knuckle print
is done based on an edge detection method. Secondly, Elliptical Hough Transform based
feature extraction is performed to derive the feature information from the finger knuckle
print. Thirdly, the extracted feature information is represented as a feature vector. This
vector is passed to the matching module, which is implemented using the correlation
coefficient. Finally, the matching scores generated from the different fingers are fused at
the matching score level to make the final identification decision.
Preprocessing and ROI extraction are performed on the captured FKP in order to extract a
portion of the image that is rich in texture patterns. These steps enable highly
discriminative feature information to be extracted from the captured FKP image even when
it varies in scale and rotation. The preprocessing of the captured FKP image is done by
incorporating a coordinate system [10], which is achieved by defining an x-axis and a
y-axis for the captured image. The base line of the finger is taken as the x-axis. The y-axis
for the knuckle patterns is defined by means of convex curves determined from the edge
record of the Canny edge detection algorithm [11]. The curvature convexities of the
obtained convex curves are determined for the finger knuckle patterns, and the y-axis of the
finger knuckle print is placed where the curvature convexity is nearly equal to zero at the
center point. By this method, an ROI of 110 × 180 pixels is extracted from the FKP.
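A minimal sketch of this step, assuming OpenCV is available; the Canny thresholds and the
centring of the 110 × 180 window on the edge mass are simplifying assumptions rather than
the convex-curve based coordinate construction described above.

```python
import cv2
import numpy as np

def extract_fkp_roi(gray_image, roi_h=110, roi_w=180):
    """Rough ROI extraction for a finger knuckle print (FKP) image.

    Simplified sketch: Canny edges stand in for the knuckle texture, and a
    fixed-size window is centred on the edge mass instead of the paper's
    convex-curve based coordinate system.
    """
    edges = cv2.Canny(gray_image, 50, 150)          # edge record of the FKP
    ys, xs = np.nonzero(edges)
    if ys.size == 0:                                # no edges: fall back to a centre crop
        cy, cx = gray_image.shape[0] // 2, gray_image.shape[1] // 2
    else:
        cy, cx = int(ys.mean()), int(xs.mean())     # centre of edge mass
    top = max(0, min(cy - roi_h // 2, gray_image.shape[0] - roi_h))
    left = max(0, min(cx - roi_w // 2, gray_image.shape[1] - roi_w))
    return gray_image[top:top + roi_h, left:left + roi_w]
```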
Fig. 2. (a) Captured finger knuckle print, (b) FKP image subjected to the Canny edge
detection algorithm, (c) convex curves of the FKP image, (d) coordinate system for the FKP image
Fig. 2(a)-(d) illustrate the captured finger knuckle print, the edge image of the FKP
detected using the Canny edge detection algorithm, the convex curves detected from the FKP
image, and the coordinate system constructed for the FKP image.
In Cartesian coordinates the elliptical structure can be described through its two foci and a
curve base point using Eq. 1,

d = \sqrt{(x - x_{f1})^2 + (y - y_{f1})^2} + \sqrt{(x - x_{f2})^2 + (y - y_{f2})^2}   (1)

where d is the sum of the distances from the two focus points (x_{f1}, y_{f1}) and
(x_{f2}, y_{f2}) to the base point (x, y) of the curve.
The set of curve points can be defined by the relation shown in Eq. 2,

E(x_{f1}, y_{f1}, x_{f2}, y_{f2}, x, y, d) = \sqrt{(x - x_{f1})^2 + (y - y_{f1})^2} + \sqrt{(x - x_{f2})^2 + (y - y_{f2})^2}   (2)

The set of curve points forming the ellipse is traced with the curve tracing algorithm [15],
following Eq. 3,

S(x_{f1}, y_{f1}, x_{f2}, y_{f2}, d) = \{ (x, y) : \sqrt{(x - x_{f1})^2 + (y - y_{f1})^2} + \sqrt{(x - x_{f2})^2 + (y - y_{f2})^2} = d \}   (3)
Hence the knuckle pixel points identified from the pattern of the captured finger knuckle
print are transformed into a parametric representation in the form of an elliptical structure.
The knuckle feature information can then be obtained by analyzing this elliptical structure.
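A minimal sketch of the voting idea behind the Elliptical Hough Transform under the
two-foci parameterisation of Eqs. 1-3 as reconstructed above; the candidate focus pairs and
the quantisation step d_step are assumptions, and a practical implementation would restrict
this parameter space far more aggressively.

```python
import numpy as np
from collections import defaultdict
from itertools import combinations

def elliptical_hough_votes(edge_points, candidate_foci, d_step=2.0):
    """Accumulate votes in the (focus1, focus2, d) parameter space.

    edge_points    : (N, 2) array of knuckle edge pixel coordinates
    candidate_foci : iterable of (x, y) candidate focus positions
    Returns a sparse accumulator {(f1, f2, d_bin): votes}.
    """
    acc = defaultdict(int)
    pts = np.asarray(edge_points, dtype=float)
    for f1, f2 in combinations(list(candidate_foci), 2):
        # Eq. 1: distance sum from the two foci to every edge point
        d = (np.linalg.norm(pts - np.asarray(f1, dtype=float), axis=1) +
             np.linalg.norm(pts - np.asarray(f2, dtype=float), axis=1))
        for d_bin in np.round(d / d_step).astype(int):
            acc[(f1, f2, int(d_bin))] += 1          # one vote per edge point (Eq. 3)
    return acc

# The most voted (f1, f2, d) hypothesis approximates the knuckle ellipse:
# best_params = max(acc, key=acc.get)
```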
The distance from the center to a focus point, defined as the knuckle foci distance c_k, is
given by Eq. 4,

c_k = \frac{\sqrt{(x_{f2} - x_{f1})^2 + (y_{f2} - y_{f1})^2}}{2}   (4)

The primary and secondary axes of the knuckle elliptical structure, defined as the primary
knuckle axis P_k and the secondary knuckle axis S_k, are given by Eq. 5 and Eq. 6,

P_k = \frac{d}{2}   (5)

S_k = \frac{\sqrt{d^2 - 4c_k^2}}{2}   (6)

Further, the knuckle base point angle \theta_k can be derived from the knuckle elliptical
structure as given by Eq. 7,

\theta_k = \tan^{-1}\left(\frac{y_{f2} - y_{f1}}{x_{f2} - x_{f1}}\right)   (7)
The obtained finger knuckle feature information is stored in a feature vector; one such
vector is obtained from the registered finger knuckle image and another from the input
finger knuckle image.
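A minimal sketch of the derived knuckle features, following standard ellipse geometry
consistent with Eqs. 4-7 as reconstructed above.

```python
import math

def knuckle_features(f1, f2, d):
    """Derive the knuckle features from an ellipse given by its two foci
    (f1, f2) and the distance sum d."""
    x1, y1 = f1
    x2, y2 = f2
    c = math.hypot(x2 - x1, y2 - y1) / 2.0              # knuckle foci distance (Eq. 4)
    primary = d / 2.0                                   # primary knuckle axis (Eq. 5)
    secondary = math.sqrt(max(primary ** 2 - c ** 2, 0.0))  # secondary knuckle axis (Eq. 6)
    angle = math.atan2(y2 - y1, x2 - x1)                # knuckle base point angle (Eq. 7)
    return [c, primary, secondary, angle]               # entries of the feature vector
```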
2.4 Classification
The feature information of the finger knuckle print images is derived for the four different
fingers belonging to the same individual and stored in the corresponding feature vectors.
The matching of finger knuckle prints is done by finding the correlation between the
reference and input vectors. The correlation coefficient [13] used for classification is given
by Eq. 8,

r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}   (8)

where x_i and y_i are values taken from the reference and input vectors respectively. A
value of r close to 1 indicates a high degree of similarity, while a value approaching 0
indicates dissimilarity.
Several fusion rules can be used to combine the scores obtained from each finger knuckle.
In this paper, the sum rule is used. Let s_{LI}, s_{RI}, s_{LM} and s_{RM} represent the
score values obtained from the knuckle surfaces of the left index finger, right index finger,
left middle finger and right middle finger respectively. The final score S is computed using
Eq. 9,

S = s_{LI} + s_{RI} + s_{LM} + s_{RM}   (9)

The authentication decision is then taken from the obtained final score.
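A minimal sketch of the matching and fusion steps (Eqs. 8-9); the acceptance threshold is an
illustrative assumption.

```python
import numpy as np

def correlation_score(v_ref, v_in):
    """Pearson correlation coefficient between reference and input feature
    vectors (Eq. 8); a value close to 1 indicates a likely match."""
    v_ref = np.asarray(v_ref, dtype=float)
    v_in = np.asarray(v_in, dtype=float)
    return float(np.corrcoef(v_ref, v_in)[0, 1])

def fuse_and_decide(knuckle_scores, threshold=3.2):
    """Sum-rule fusion of the four per-finger scores (Eq. 9) followed by a
    thresholded accept/reject decision; the threshold value is illustrative."""
    final_score = float(sum(knuckle_scores))      # s_LI + s_RI + s_LM + s_RM
    return final_score, final_score >= threshold
```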
The evaluation of the finger knuckle biometric recognition system based on the Hough
transform is done using the PolyU database [15]. In this database, finger knuckle prints are
captured with an automated, low-cost, contactless method using a low-resolution camera in
a peg-free environment. The knuckle images were collected from 165 persons. Extensive
experiments were conducted and the performance analysis was based on the genuine
acceptance rate and the equal error rate.
The genuine acceptance rate is the rate at which genuine users are correctly accepted by the
system. The equal error rate is the point at which the false acceptance rate and the false
rejection rate become equal. Table 1 shows the genuine acceptance rates for the different
fusion combinations.
From the tabulated results, it is evident that the finger knuckle print can be considered a
reliable biometric identifier. The graphical illustration in Fig. 4 depicts some of the
tabulated results. From the results obtained, it is clear that the accuracy of the system is
already considerable when a single finger knuckle surface is used, increases further when
two or three finger knuckle regions are fused, and is best when all four finger knuckle
regions are fused.
Fig. 4. Genuine Acceptance Rate (%) versus False Acceptance Rate (%) for the fusion
combinations LI+LM+RI, LI+LM+RM, LM+RI+RM and LI+RI+RM
The proposed Hough transform based feature extraction method for the finger knuckle
biometric recognition system is compared with other geometric and texture based feature
extraction methods. Table 2 presents a comparative analysis of the existing methods and
the proposed methodology.
From this comparison, it is evident that the proposed Elliptical Hough Transform based
feature extraction method outperforms the existing methods, with a high recognition rate of
98.26% and the lowest equal error rate of EER = 0.78%.
Table 2. Comparative analysis of the existing methods with the proposed methodology

References | Dataset | Recognition Methods | Results
A. Kumar et al. [4] | Newly created database: 105 users with 630 images | Principal Component Analysis, Linear Discriminant Analysis, Independent Component Analysis | EER – 1.39%
Zhang et al. [5] | PolyU knuckle database | Band-Limited Phase-Only Correlation method | EER – 1.68%
Chao Lan et al. [7] | PolyU knuckle database | Complex locality preserving approach | EER – 4.28%
Lin Zhang et al. [8] | PolyU knuckle database | Ridgelet Transform | EER – 1.26%
4 Conclusion
This paper presents a novel feature extraction method based on the Elliptical Hough
Transform for a personal recognition system using finger knuckle prints. The EHT is
formulated to isolate the elliptical structures present in the finger knuckle print, and the
feature information obtained from those structures is used to identify individuals. Rigorous
experiments were conducted and promising results were achieved.
References
1. Hand-based Biometrics. Biometric Technology Today 11(7), 9–11 (2000)
2. Kumar, A., Zhang, D.: Combining fingerprint, palm print and hand shape for user authen-
tication. In: Proceedings of International Conference on Pattern Recognition, pp. 549–552
(2006)
3. Rowe, R.K., Uludag, U., Demirkus, M., Parthasaradhi, S., Jain, A.K.: A multispectral
whole-hand biometric authentication system. In: Biometric Symposium (2007)
4. Kumar, A., Ravikanth, C.: Personal Authentication Using Finger Knuckle Surface. IEEE
Transactions on Information Forensics and Security 4(1), 98–110 (2009)
5. Zhang, L., Zhang, L., Zhang, D.: Finger-Knuckle-Print Verification Based on Band-
Limited Phase-Only Correlation. In: Jiang, X., Petkov, N. (eds.) CAIP 2009. LNCS,
vol. 5702, pp. 141–148. Springer, Heidelberg (2009)
6. Shen, L., Bai, L., Zhen, J.: Hand-based biometrics fusing palm print and finger-knuckle-
print. In: 2010 International Workshop on Emerging Techniques and Challenges for Hand-
Based Biometrics (2010)
7. Jing, X., Li, W., Lan, C., Yao, Y., Cheng, X., Han, L.: Orthogonal Complex Locality Pre-
serving Projections Based on Image Space Metric for Finger-Knuckle-Print Recognition.
In: 2011 International Conference on Hand-based Biometrics (ICHB), pp. 1–6 (2011)
8. Zhang, L., Li, H., Shen, Y.: A Novel Riesz Transforms based Coding Scheme for Finger-
Knuckle-Print Recognition. In: 2011 International Conference on Hand-based Biometrics
(2011)
9. Meraoumia, A., Chitroub, S., Bouridane, A.: Fusion of Finger-Knuckle-Print and
Palmprint for an Efficient Multi-biometric System of Person Recognition. In: 2011 IEEE
International Conference on Communications (2011)
10. Bao, P., Zhang, L., Wu, X.: Canny Edge Detection Enhancement by Scale Multiplication.
IEEE Transactions on Pattern Analysis and Machine Intelligence 27(9), 1485–1490 (2005)
11. Zhang, L., Zhang, L., Zhang, D., Zhu, H.: Online Finger-Knuckle-Print Verification
for Personal Authentication, vol. 1, pp. 67–78 (2009)
12. Olson, C.F.: Constrained Hough Transforms for Curve Detection. Computer Vision and
Image Understanding 73(3), 329–345 (1999)
13. Shen, D., Lu, Z.: Computation of Correlation Coefficient and Its Confidence Interval in
SAS, vol. 31, pp. 170–131 (2005)
14. Hanmandlu, M., Grover, J., Krishanaadsu, V., Vasirkala, S.: Score level fusion of hand
based Biometrics using T-Norms. In: 2010 IEEE International Conference on Technolo-
gies for Homeland Security, pp. 70–76 (2010)
15. Alen, J.V., Novak, M.: Curve Drawing Algorithms for Raster Displays. ACM Transactions
on Graphics 4(2), 147–169 (1985)
An Efficient Multiple Classifier Based on Fast RBFN
for Biometric Identification
1 Introduction
Biometric technologies are of immense importance in various security, access control and
monitoring applications. Single-modal biometric systems repeatedly face significant
restrictions due to noise in the sensed data, spoof attacks, data quality, non-universality
and other factors. Multimodal biometric authentication, or multimodal biometrics, is the
approach of using multiple biometric traits from a single user in an effort to improve the
results of the authentication process.
Various multimodal biometric systems can be found that propose traditional fusion
techniques and use two biometric features for identification.
In [10] a fingerprint-iris fusion based identification system was proposed. The framework
developed a fingerprint and iris fusion system which utilized a single Hamming distance
based matcher for identification. Iris feature extraction was based on Daugman's approach
as implemented by Libor Masek, and fingerprint feature extraction was based on chain
codes to detect minutiae. A simple accumulator based fusion approach was employed. The
fingerprint, iris and fused accuracies are 60%, 65% and 72% respectively.
Paper [11] proposed an iris and fingerprint fusion technique based on a Euclidean distance
matching algorithm. The preprocessed and normalized data is given to Gabor filters and the
extracted features are then used for matching. The accuracy of the multimodal system is
99.5% for a threshold of 1, compared with 99.1% and 99.3% for thresholds of 0.1 and 0.5
respectively.
In [13] a fingerprint and iris fusion based recognition technique was proposed using a
conventional RBF neural network. The fingerprint features were extracted by a Haar
wavelet based method and the iris features by the block sum method. The testing times for
the fingerprint, iris and fusion modalities are 0.35, 0.19 and 0.12 seconds respectively.
In our previous works [8], [9], [12], we developed unimodal systems based on an RBFN
with the optimal clustering algorithm (OCA) and a BP network for rotation-invariant, clear
as well as occluded fingerprint and face identification and location-invariant localization.
In this paper, we propose a multiple classifier for person identification in which each
classifier operates on a different aspect of the input. There are three individual classifiers,
for fingerprint, iris and face identification, and a super classifier which gives the final
identification of a person based on voting logic over the results of the three individual
classifiers. The ANN model adopted here is the Radial Basis Function (RBF) network with
the Optimal Clustering Algorithm (OCA) for training the hidden units and Back
Propagation (BP) learning for classification.
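A minimal sketch of the super classifier's voting logic, under the assumption that each
individual classifier returns a single identity and that a simple majority decides; the
tie-handling rule is an assumption.

```python
from collections import Counter

def super_classify(fingerprint_id, iris_id, face_id):
    """Decision of the super classifier by majority voting over the
    identities returned by the three individual classifiers."""
    votes = Counter([fingerprint_id, iris_id, face_id])
    identity, count = votes.most_common(1)[0]
    if count >= 2:                 # at least two classifiers agree
        return identity
    return None                    # no majority: reject as unknown

# super_classify("person_3", "person_3", "person_7")  ->  "person_3"
```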
Holdout Method
In order to obtain a good measure of the performance of a classifier, it is necessary that
the test dataset has approximately the same class distribution as the training dataset. The
training dataset is the portion of the available labeled examples that is used to build the
classifier, and the test dataset is the portion used to test its performance. In the holdout
method [3], some of the labeled examples are kept aside as a test dataset in order to test
the performance of the constructed classifier.
Accuracy
The accuracy [3] of a classifier is the probability of correctly classifying records in the
test dataset. In practice, accuracy is measured as the percentage of records in the test
dataset that are correctly classified. If there are only two classes (say C and not C), then
the accuracy is computed as

Accuracy = \frac{a + d}{a + b + c + d}   (1)

where a, b, c and d are defined in the matrix in Fig. 1.
class or category. This measure is appropriate for evaluating the system through a single
numeric value.
2.1 Preprocessing
Fingerprint Images
The fingerprint image has to be preprocessed before learning as well as recognition; there
are several preprocessing steps.
Iris Images
Preprocessing of the iris image is also required before feature extraction; it comprises
several steps.
• Conversion of RGB iris images to grayscale images: The first step of preprocessing is
to convert the training and test database images into grayscale images.
• Iris boundary localization: The radial-suppression edge detection algorithm [1], which
is similar to the Canny edge detection technique, is used to detect the boundary of the
iris. In radial-suppression edge detection, a non-separable wavelet transform is used to
extract the wavelet transform modulus of the iris image, and radial non-maxima
suppression is then used to retain the annular edges while removing the radial edges.
Finally, edge thresholding is applied to remove isolated edges and determine the final
binary edge map.
The circular Hough transform [2] is then used to detect the final iris boundaries and
deduce their radius and center. It is defined, as in (2), for the circular boundary and a set
of recovered edge points (x_j, y_j), j = 1, ..., n.
H(x_c, y_c, r) = \sum_{j=1}^{n} h(x_j, y_j, x_c, y_c, r)   (2)

h(x_j, y_j, x_c, y_c, r) = \begin{cases} 1, & g(x_j, y_j, x_c, y_c, r) = 0 \\ 0, & \text{otherwise} \end{cases}   (3)

g(x_j, y_j, x_c, y_c, r) = (x_j - x_c)^2 + (y_j - y_c)^2 - r^2   (4)
For each edge point (x_j, y_j), g(x_j, y_j, x_c, y_c, r) = 0 for every parameter triplet
(x_c, y_c, r) that represents a circle through that point. The triplet maximizing H
corresponds to the largest number of edge points and represents the contour of interest (a
minimal sketch of this voting step is given after the list below).
• Extract the iris: In this step, we remove the other parts of the eye image, such as the
eyelids, eyelashes and eyebrows, and extract the iris.
• Conversion into binary images: This process converts the image into a binary image
(a 2D matrix file).
• Image normalization: This process normalizes all patterns to a lower dimension and to
the same size.
• Conversion of binary images into 1D matrices: In the last step of preprocessing, the
2D matrix iris files are converted into 1D matrix files. This set is the input to the
Optimal Clustering Algorithm (OCA).
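A minimal sketch of the circular Hough voting of Eqs. 2-4, assuming the edge points come
from the binary edge map above; the tolerance tol relaxes the exact condition g = 0 to
account for pixel quantization, and the candidate centre and radius grids are assumptions.

```python
import numpy as np

def circular_hough(edge_points, centre_candidates, radii, tol=1.0):
    """Circular Hough transform over the recovered edge points (Eqs. 2-4):
    every (xc, yc, r) hypothesis gets one vote per edge point lying on that
    circle (|g| < tol pixels). Returns the best triplet and its vote count."""
    pts = np.asarray(edge_points, dtype=float)
    best, best_votes = None, -1
    for xc, yc in centre_candidates:
        dist = np.hypot(pts[:, 0] - xc, pts[:, 1] - yc)
        for r in radii:
            votes = int(np.sum(np.abs(dist - r) < tol))   # sum of h(xj, yj, xc, yc, r)
            if votes > best_votes:
                best, best_votes = (xc, yc, r), votes
    return best, best_votes        # triplet maximizing H, i.e. the iris boundary
```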
Face Images
The face images also have to be preprocessed before learning as well as recognition. The
preprocessing steps for face images are almost the same as for fingerprints, i.e. noise
removal, deblurring, background removal, conversion of RGB images to grayscale,
conversion of grayscale images to binary images, image normalization and finally
conversion of the binary images into 1D matrix files. The difference is in the background
elimination step: here the background of the face patterns is removed using a Gaussian
model. A sum-of-Gaussians model (two Gaussians) for the color at each pixel gives a
simple way of separating the background.
cluster mean), which is a measure of the degree of similarity among the members of the
same cluster and of dissimilarity among members of different clusters.
Back Propagation (BP) Network
The training of a network by Back Propagation (BP) [4], [5], [6], [8], [9], [12] involves
three stages: the feed-forward of the input training pattern, the calculation and back
propagation of the associated error, and the adjustment of the weights. After training,
application of the net involves only the computations of the feed-forward phase; the output
is calculated using the current setting of the weights in all the layers. The weight update
is given by

w(t+1) = w(t) + \Delta w(t)   (5)

where t indexes the presentation step for the training pattern (the iteration number) and
\eta is the learning rate. In the BP learning algorithm [9], [12] of our developed system,
the initialization of \eta differs for the three individual classifiers.
The optimum weights are obtained when

w(t+1) \approx w(t)   (7)

The response of the RBF network is a weighted sum of radial basis functions centred at the
cluster means \mu_i,

R = w_1\,\varphi(\|x - \mu_1\|) + \dots + w_m\,\varphi(\|x - \mu_m\|)   (8)

A hidden neuron is more sensitive to data points near its center. For a Gaussian RBF this
sensitivity may be tuned by adjusting the spread \sigma, where a larger spread implies less
sensitivity. The output of the hidden units is therefore

\varphi_i(x) = \exp\left(-\frac{\|x - \mu_i\|^2}{2\sigma_i^2}\right)   (9)
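A minimal sketch of the hidden-layer computation, following Eqs. 8-9 as reconstructed above;
the centres are assumed to be the OCA cluster means, and a single linear output unit is used
for illustration.

```python
import numpy as np

def rbf_hidden_outputs(x, centres, spreads):
    """Gaussian RBF hidden-unit activations (Eq. 9): each unit responds most
    strongly to inputs near its centre mu_i, with sensitivity controlled by
    the spread sigma_i (larger spread = less sensitivity)."""
    x = np.asarray(x, dtype=float)
    centres = np.asarray(centres, dtype=float)   # one row per hidden unit (e.g. OCA cluster means)
    spreads = np.asarray(spreads, dtype=float)
    dist_sq = np.sum((centres - x) ** 2, axis=1)
    return np.exp(-dist_sq / (2.0 * spreads ** 2))

def rbf_output(x, centres, spreads, weights):
    """Weighted sum of the hidden activations, as in Eq. 8."""
    return float(np.dot(weights, rbf_hidden_outputs(x, centres, spreads)))
```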
Identification Learning
We use three training databases for the three individual classifiers. Each database consists
of a different image pattern, i.e. fingerprint, iris or face. After preprocessing, the patterns
are fed individually as input to the RBFN of the corresponding classifier. When the
networks have learned all the different patterns (fingerprint, iris, face) of different
qualities or expressions and of different angles or views for all the different people, the
network is ready for identification of the learned image patterns (Fig. 3).
We have used three different training databases for the three classifiers, containing three
different people's fingerprint, iris and face images. In the fingerprint database there are,
for each person, fingerprints of three different qualities and at three different angles (0,
90 and 180 degrees). In the iris database, left and right eye images of three different
qualities have been taken for each person. Finally, in the face database there are, for each
person, three different expressions and three different angular views, i.e. frontal view, 90
degree left side view and 90 degree right side view (refer to Fig. 5, Fig. 6, Fig. 7).
Fig. 6. Samples of a few training iris images
Fig. 7. Samples of a few training face images
The test set for labeled testing, used to evaluate the performance of the individual
classifiers, and the test set for unlabeled testing contain the fingerprint, iris and face
patterns of the same three people as the training set, in various qualities or expressions.
These patterns are completely different from the training set. Additional unknown persons'
patterns, which were not used for training, have also been included in the test database.
Fig. 8. A set of a few test fingerprint images
Fig. 9. A set of a few test iris images
The test sets for labeled testing, used to evaluate the performance of the super classifier,
and for unlabeled testing contain pattern sets (one fingerprint, one iris and one face image)
for the same three people as the training set, in various qualities or expressions. These
pattern sets, as well as some additional unknown people's pattern sets, are completely
different from the training set (refer to Fig. 8, Fig. 9, Fig. 10, Fig. 11).
Fig. 10. A set of a few test face images
Fig. 11. A test set for super classification
Classifiers | Accuracy
First Classifier (Fingerprint) | 91.67%
Second Classifier (Iris) | 94.44%
Third Classifier (Face) | 91.67%
Super Classifier | 97.22%
4 Conclusion
A multiple classification system has been designed and developed using a modified Radial
Basis Function Network (RBFN) with the Optimal Clustering Algorithm (OCA) and BP
learning. The multiple classifier consists of three individual classifiers for fingerprint, iris
and face identification. The super classifier makes the final identification decision by
applying voting logic to the results of these classifiers. The system can identify an
authenticated person. Because of the OCA-based modified RBFN, the individual learning
systems are fast. The accuracy of each classifier, measured with the holdout method, is
moderately high, and the training and testing times are moderately low for the different
types of fingerprint, iris and face
patterns. The proposed RBFN multiple classifier is efficient, effective and faster compared
with other conventional multimodal identification techniques.
References
1. Huang, J., You, X., Tang, Y.Y., Du, L., Yuan, Y.: A novel iris segmentation using radial-
suppression edge detection. Signal Processing 89, 2630–2643 (2009)
2. Conti, Y., Militello, C., Sorbello, F.: A Frequency-based Approach for Features Fusion in
Fingerprint and Iris Multimodal Biometric Identification Systems. IEEE Transactions on
Systems, Man, and Cybernetics—Part C: Applications and Reviews 40(4), 384–395 (2010)
3. Pudi, V.: Data Mining. Oxford University Press, India (2009)
4. Sarker, G.: A Multilayer Network for Face Detection and Localization. IJCITAE 5(2), 35–
39 (2011)
5. Revathy, N., Guhan, T.: Face Recognition System Using Back propagation Artificial
Neural Networks. IJAET III(I), 321–324 (2012)
6. Sarker, G.: A Back propagation Network for Face Identification and Localization. Ac-
cepted for publication in ACTA Press Journals 202(2) (2013)
7. Aziz, K.A.A., Ramlee, R.A., Abdullah, S.S., Jahari, A.N.: Face Detection using Radial Ba-
sis Function Neural Networks with Variance Spread Value. In: International Conference of
Soft Computing and Pattern Recognition, pp. 399–403 (2009)
8. Sarker, G., Kundu, S.: A Modified Radial Basis Function Network for Fingerprint Identifi-
cation and Localization. In: International Conference on Advanced Engineering and Tech-
nology, pp. 26–31 (2013)
9. Kundu, S., Sarker, G.: A Modified Radial Basis Function Network for Occluded Finger-
print Identification and Localization. IJCITAE 7(2), 103–109 (2013)
10. Baig, A., Bouridane, A., Kurugollu, F., Qu, G.: Fingerprint – Iris Fusion based Identifica-
tion System using a Single Hamming Distance Matcher. In: Symposium on Bio-Inspired
Learning and Intelligent Systems for Security, pp. 9–12 (2009)
11. Lahane, P.U., Ganorkar, S.R.: Fusion of Iris & Fingerprint Biometric for Security Purpose.
International Journal of Scientific & Engineering Research 3(8), 1–5 (2012)
12. Bhakta, D., Sarker, G.: A Rotation and Location Invariant Face Identification and Locali-
zation with or Without Occlusion using Modified RBFN. In: Proceedings of the 2013
IEEE International Conference on Image Information Processing (ICIIP 2013), pp. 533–
538 (2013)
13. Gawande, U., Zaveri, M., Kapur, A.: Fingerprint and Iris Fusion Based Recognition using
RBF Neural Network. Journal of Signal and Image Processing 4(1), 142–148 (2013)
Automatic Tortuosity Detection and Measurement
of Retinal Blood Vessel Network
Sk. Latib1, Madhumita Mukherjee2, Dipak Kumar Kole1, and Chandan Giri3
1 St. Thomas’ College of Engineering & Technology, Kolkata, West Bengal, India
2 Purabi Das School of Information Technology, BESUS, Shibpur, West Bengal, India
3 Department of Information Technology, IIEST, Shibpur, West Bengal, India
{sklatib,madhumita07,dipak.kole,chandangiri}@gmail.com
Abstract. Increased dilation and tortuosity of the retinal blood vessels causes the infant
disease retinopathy of prematurity (ROP). Automatic tortuosity evaluation is a very useful
technique for preventing childhood blindness and helps the ophthalmologist in ROP
screening. This work describes a method for automatically classifying a retinal image as
low, medium or highly tortuous. The proposed method first extracts a skeleton image from
the original retinal image, based on morphological operations, to obtain the overall
structure of all the terminal and branching nodes of the blood vessels. It then separates
each branch and rotates it so that the recursive partitioning process becomes easier.
Finally, from the partitioned vessel segments the tortuosity is calculated and the tortuous
symptom of the image is detected. The results have been compared with an eye specialist's
analysis of twenty five images, with good agreement.
1 Introduction
Normal retinal blood vessels are straight or gently curved. In some diseases the blood
vessels become tortuous, i.e. they become dilated and take on a serpentine path. The
dilation is caused by radial stretching of the blood vessel and the serpentine path by
longitudinal stretching. The tortuosity may be local, occurring only in a small region of
the retinal blood vessels, or it may involve the entire retinal vascular tree. Many
conditions, such as high blood flow, angiogenesis and blood vessel congestion, produce
tortuosity. In retinopathy of prematurity (ROP), the growth of the retinal vessels does not
reach the periphery of the retina, so the retina is not fully vascularized and preterm birth
carries many complications. The early indicator of ROP is a whitish-gray demarcation line
between the normal retina and the anteriorly undeveloped, avascular retina. Hence infants
should be screened for ROP so that they receive appropriate treatment to prevent
permanent loss of vision.
Tortuosity measurement is therefore a problem with clinical relevance, and there are many
methods to measure it. Lotmar et al. [4] measured tortuosity as the ratio between the arc
length and the chord length, i.e. using relative length variation. Seven integral estimates of
tortuosity based on the curvature of the vessels were presented by Hart et al. [6]. For
better accuracy of the tortuosity measurement, Bullitt et al. [8] generalized Hart's
estimates to 3D images obtained by
MRA. Grisan et al. [3] proposed another method satisfying all the tortuosity properties.
Sukkaew et al. [5] presented a method based on partitioning, using the tortuosity
calculation formula proposed by Grisan et al. [3].
However, none of the above methods separates the individual branches of the retinal image;
they calculate the tortuosity of branches directly in the skeleton image. These methods are
therefore complicated and cannot isolate a single branch of a vessel. In the method
proposed by Sukkaew et al. [5], the calculation of the maximum allowable interpolation
point is slow. Moreover, these methods may not always give a correct result, because a
circular arc with a large radius is not tortuous even though the ratio between its arc length
and chord length can be large.
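For reference, a minimal sketch of the classical arc-length over chord-length measure [4]
discussed above, for a vessel branch given as a sampled (N, 2) polyline of points.

```python
import numpy as np

def arc_chord_ratio(curve):
    """Arc length divided by chord length of a sampled vessel branch."""
    curve = np.asarray(curve, dtype=float)
    arc = float(np.sum(np.hypot(np.diff(curve[:, 0]), np.diff(curve[:, 1]))))
    chord = float(np.hypot(*(curve[-1] - curve[0])))
    return arc / chord if chord > 0 else np.inf

# A straight branch gives a ratio of 1; more serpentine branches give larger values.
```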
In this work, we propose a method to measure the tortuosity of the retinal blood vessels by
breaking the image into branches and calculating the tortuosity of every branch curve
separately. Every curve is rotated so that the maximum distance between curve and chord
can be calculated easily, and a new formula for tortuosity measurement is produced which
satisfies all the properties of tortuosity. The proposed method is also better than the
existing work, as those methods could not separate individual branch curves.
The rest of the paper is organized as follows. Section 2 explains some properties of blood
vessels that are used in the proposed method. Section 3 deals with the proposed method for
tortuosity detection. Section 4 presents the results and explains them by comparison with
the eye specialist's analysis. Section 5 gives an overall conclusion of the whole process.
2 Background
Blood vessels have many branches and bifurcations, and it is necessary to find them.
Affine transformations of a vessel are translation, rotation and scaling. These
transformations relate to the position and orientation of the vessels in the retina and do
not alter the clinical perception of tortuosity.
2.3.2. Composition
Composition properties deal with the case when two vessel curves are merged into a single
one, or when various segments of the same vessel, with different tortuosity measures, build
up to give the total vessel tortuosity.
For two adjacent continuous curves s_1 and s_2, their combination is defined as

s_3 = s_1 \oplus s_2   (2)

Since the two composing curves belong to the same vessel, the continuity of s_3 can be
assumed without loss of generality. Again, a new composition property is proposed, such
that a vessel s, the combination of various segments s_i, has a tortuosity measure greater
than or equal to that of any of its composing parts:

T(s_i) \le T(s_1 \oplus s_2 \oplus \dots \oplus s_n), \quad \forall i = 1, \dots, n,\; s_i \subseteq s   (3)
2.3.3. Modulation
Tortuosity has a monotonic relationship with two other properties: frequency modulation at
constant amplitude and amplitude modulation at constant frequency. Frequency modulation
expresses the rule that the greater the number of changes in the curvature sign (twists), the
greater the tortuosity of the vessel. Similarly, amplitude modulation states that the greater
the amplitude of a twist (the maximum distance of the curve from the chord), the greater
the tortuosity.
For two vessels having twists with the same amplitude, the tortuosity therefore changes
monotonically with the number of twists Ω.
3 Proposed Method
The proposed method is described in five subsections as follows. It works on binary images
because they take less time and space to process, and morphological image processing,
which is efficient and fast in extracting image features whose shape is known a priori
(such as the vascular structure), can be used easily. Morphological processing is also
resistant to noise. The captured image is binarised to obtain the blood vessel structure
clearly, and the binary image is then used as input to the algorithm. Such algorithms can
handle tasks ranging from very simple ones to much more complex recognition,
localization and inspection tasks.
stored in a matrix, and in this recursive way all branches are stored. Figure 4 shows some
original images and the vessel skeletons extracted from them; Figure 5 shows branches
extracted from the skeleton images; and Figure 6 shows an example after execution of the
steps for one branch.
Fig. 3. Process flow diagram of the proposed sequence of processes to extract the blood
vessel skeleton
Fig. 4. (a)-(e) The original images; (f)-(j) the corresponding detected vessel skeletons
Fig. 5. (a)-(d) The skeleton images; (e)-(h) the branches extracted from the corresponding
skeleton images
where the chord in the denominator is that of the main vessel branch, k is the segment
number out of the total number of segments, and the kth curve and kth chord are those of
segment k. The tortuosity measure has a dimension of 1/length and is therefore interpreted
as a tortuosity density. Tortuosity is measured by the relative length variation of the
vessel curve and its chord, where the vessel curve is partitioned into many small segments
so that there is only a small length variation between each individual chord and segment.
The formula of Eq. 13 is applied to every branch curve, where length(·) denotes the length
of a segment of the branch curve.
4 Experimental Results
The proposed method has been tested on images obtained from the Eye Department of
Ramakrishna Mission Seva Pratisthan, Kolkata, West Bengal, India. The database contains
25 color retinal images of 384 × 512 pixels. Our results match the visual detection, and
also the comments delivered by the eye specialists of that hospital after analyzing those
images. Figure 9 shows the images used together with their tortuosity values. The
tortuosity of every branch is measured by the proposed formula, and the (at most) five
most tortuous branches are taken into account to decide how tortuous the image is; from
these five branches the mean m is derived.
Observations of tortuosity measurement by different methods, such as normal visualization
and the eye specialist's analysis, suggest that images which are highly tortuous have a
tortuosity value greater than 0.045, while images with low tortuosity have a value less than
0.035; images with medium tortuosity lie between 0.035 and 0.045. For this reason, the two
thresholds 0.035 and 0.045 have been chosen for tortuosity detection.
The mean m is therefore checked against the following conditions:
m < 0.035: low tortuosity
0.035 < m < 0.045: medium tortuosity
m > 0.045: high tortuosity
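A minimal sketch of this image-level decision rule, assuming the per-branch tortuosity
values have already been computed with the proposed formula; the thresholds 0.035 and
0.045 and the use of the five most tortuous branches follow the text.

```python
import numpy as np

def tortuosity_level(branch_tortuosities, n_top=5):
    """Image-level decision: the mean m of the (at most) five most tortuous
    branches is compared against the thresholds 0.035 and 0.045."""
    top = sorted(branch_tortuosities, reverse=True)[:n_top]
    m = float(np.mean(top))
    if m < 0.035:
        return m, "low tortuosity"
    if m < 0.045:
        return m, "medium tortuosity"
    return m, "high tortuosity"

# tortuosity_level([0.051, 0.048, 0.047, 0.046, 0.044, 0.02]) -> (0.0472, "high tortuosity")
```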
The experimental results of tortuosity measurement and detection are presented in Table 1.
Throughout the simulation a threshold value of T = 6 has been considered for every image.
Table 1 reports the tortuosity symptom of the images as high, medium or low. According to
the eye specialists' view, high tortuosity generally occurs due to obstruction of the blood
flow through the veins, which in turn may be caused by various diseases. Tortuosity, being
merely a symptom, can only be cured if the causing disease is treated; failure to do so
ultimately leads to rupture of the veins and haemorrhage. Tortuosity may also occur in
some people only as a normal variant, i.e. an anomaly without any particular cause, and
may also be congenital. In Table 1, such tortuous images appear in the medium or low
range.
5 Conclusion
In this work, a method to measure tortuosity, together with a new tortuosity formula, is
proposed. The proposed formula meets all the features of tortuosity: the composition
property is satisfied by summation, amplitude modulation by the ratio of the arc length
over the chord length for every turn of the curve, and frequency modulation by means of
the curve splitting. The rotation operation rotates each branch and measures the maximum
distance according to a threshold in order to divide the curve into segments. The method is
threshold based and the partitioning process gives the best results. The detection process
depends on the mean of the most tortuous segments of the blood vessel network. The main
advantage of this method is that it breaks the image into branches so that every branch
curve can be used easily; from this aspect it is better than other existing methods. In
future, the tortuosity detection process will be further improved and more clinical cases
will be included in this study for early detection of disease. The techniques employed here
will help in diagnostic accuracy as well as in reducing the workload of eye specialists.
Acknowledgement. The authors would like to thank Dr. Pradip Kumar Pal, Senior
visiting surgeon, Eye Department, Ramakrishna Mission Seva Pratisthan, Kolkata for
providing images and clinical advice.
References
1. Jerald Jeba Kumar, S., Madheswaran, M.: Automated Thickness Measurement of Retinal
Blood Vessels for Implementation of Clinical Decision Support Systems in Diagnostic Di-
abetic Retinopathy. World Academy of Science, Engineering and Technology 64 (2010)
2. Hatanaka, Y., Nakagawa, T., Aoyama, A., Zhou, X., Hara, T., Fujita, H., Kakogawa, M.,
Hayashi, Y., Mizukusa, Y., Fujita, A.: Automated detection algorithm for arteriolar narrow-
ing on fundus images. In: Proc. 27th Annual Conference of the IEEE Engineering in Medi-
cine and Biology, Shanghai, China, September 1-4 (2005)
3. Grisan, E., Foracchia, M., Ruggeri, A.: A novel method for the automatic evaluation of re-
tinal vessel tortuosity. In: Proc. 25th Annual International Conference of the IEEE Engi-
neering in Medicine and Biology, Cancun, Mexico (September 2003)
4. Lotmar, W., Freiburghaus, A., Bracher, D.: Measurement of vessel tortuosity on fundus
photographs. Graefe’s Archive for Clinical and Experimental Ophthalmology 211, 49–57
(1979)
5. Sukkaew, L., Uyyanonvara, B., Makhanov, S.S., Barman, S., Pangputhipong, P.: Automatic
Tortuosity-Based Retinopathy of Prematurity Screening System. IEICE Trans. Inf. &
Syst. E91-D(12) (December 2008)
6. Hart, W., Golbaum, M., Cote, B., Kube, P., Nelson, M.: Measurement and classification of
retinal vascular tortuosity. Int. J. Medical Informatics 53, 239–252 (1999)
7. Sukkaew, L., Uyyanonvara, B., Barman, S., Fielder, A., Cocker, K.: Automatic extraction
of the structure of the retinal blood vessel network of premature infants. J. Medical Associa-
tion of Thailand 90(9), 1780–1792 (2007)
8. Bullitt, E., Gerig, G., Pizer, S., Lin, W., Aylward, S.: Measuring tortuosity of the intracere-
bral vasculature from MRA images. IEEE Trans. Med. Imaging 22(9), 1163–1171 (2003)
A New Indexing Method for Biometric Databases
Using Match Scores and Decision Level Fusion
1 Introduction
The partitioning of the database into groups can be done either by classification [3], [4] or
by clustering [9]. The authors in [3], [4] partition the database into predefined groups or
classes; the class of the query identity is first determined and the query is compared only
with the entries of that class during the search. However, classification methods suffer
from an uneven distribution of images over the predefined classes and from image rejection
[8]. Clustering, on the other hand, organizes the database into natural groups (clusters) in
the feature space such that images in the same cluster are similar to each other and share
certain properties, whereas images in different clusters are dissimilar [9]. Clustering does
not use category labels that tag objects with prior identifiers, i.e. class labels.
The other approach to reducing the number of matchings for identification is indexing. In
traditional databases, records are indexed in alphabetical or numeric order for efficient
retrieval, but biometric data have no natural sorting order [9], so traditional indexing
approaches are not suitable for biometric databases. Biometric indexing techniques are
broadly categorized into a) point based [5], [6], b) triplet-of-points based [11], [12], [13],
and c) match score based [14], [15], [16] approaches. In [5], [6], the authors extracted key
feature points of the biometric images and mapped them into a hash table using geometric
hashing. The authors in [11], [12], [13] computed triplets of the key feature points and
mapped them into a hash table using some additional information. However, the limitation
of these indexing methods is that they all deal with variable-length feature sets, which
makes the identification system statistically unreliable.
In recent years, indexing techniques based on fixed-length match scores have also been
investigated for biometric identification. Maeda et al. [14] compute a match score vector
for each image by comparing it against all database images and store these vectors
permanently as a matrix. Although the approach achieves a quicker response time, it takes
linear time in the worst case, and storing the match score matrix increases the space
complexity. Gyaourova et al. [15] improved the match score approach by choosing a small
set of sample images from the database: for every image in the database a match score
vector (index code) was computed by matching it against the sample set, and this vector
was stored as a row in an index table. However, a sequential search over the index space is
needed to identify the best matches, which takes linear time and is prohibitive for a
database containing millions of images. Further, the authors in [16] used a Vector
Approximation (VA+) file to store the match score vectors and a k-NN search on palm print
texture to retrieve the best matches; however, the performance of the VA+ file method
generally degrades as dimensionality increases [17]. To address these problems, this paper
proposes an efficient clustering-based indexing technique using match scores. We compute
a fixed-length index code for each input image based on match scores, and further propose
an efficient storage and retrieval mechanism using these indices.
The rest of the paper is organized as follows. The proposed indexing technique is discussed
in Section 2, and Section 3 describes the proposed retrieval technique. Section 4 presents
the experimental results and compares the performance of the proposed system with other
indexing methods in the literature. Conclusions are given in Section 5.
This section discusses our proposed methodology for indexing biometric databases. Let
S = {s1, s2, ..., sk} be the sample image set, and let Mx = {m(x,s1), m(x,s2), ..., m(x,sk)} be
the set of match scores obtained for an input image x against each sample image in S. We
call Mx the index code of image x, i.e. the index code of an image is the set of its match
scores against the sample set. The match score between two images is computed by
comparing their key features in Euclidean space; the match scores obtained are usually in
the range 0-100.
Further, we store the index code of each individual in a 2D index table A (Fig. 1). Each
column of the table corresponds to one sample image of the sample set: if image x has a
match score m(x,si) with sample image si, its identity x is put in location A(m(x,si), si).
As can be seen from Fig. 1, each entry A(m(x,si), si) of the table contains a list of image
identities (IidList) from the database whose match score against sample image si is
m(x,si).
The motivation behind this concept is that images belonging to the same user will have
approximately similar match scores against a third image (say, sample image si). For a
query image q, we can therefore determine all similar images from this index table by
computing its match score against each sample image and selecting all images (the IidList)
that have approximately similar match scores against the same sample image.
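A minimal sketch of the index-table construction, with `match` standing in for the
SIFT-based matcher (an assumption) and integer match scores in 0-100 as stated above; the
table is kept as a sparse dictionary rather than a dense 2D array.

```python
from collections import defaultdict

def build_index_table(database, sample_set, match):
    """Build the 2D index table A as a sparse dictionary: table[(score, j)]
    is the list of enrolled identities whose match score against the j-th
    sample image equals `score`."""
    table = defaultdict(list)
    index_codes = {}
    for identity, image in database.items():
        code = [int(round(match(image, s))) for s in sample_set]   # index code Mx
        index_codes[identity] = code
        for j, score in enumerate(code):
            table[(score, j)].append(identity)                     # IidList entry
    return table, index_codes
```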
leader. At any step, the algorithm assigns the current image to the most similar cluster
(leader), or the image itself is added as a new leader if its match score similarity with the
current set of leaders does not qualify under a user-specified threshold. Finally, each
cluster holds a list of similar biometric identities and is represented by an image called the
leader. The resulting set of leaders acts as the sample set of the database. The major
advantage of dynamic clustering (such as the leader algorithm) is that new enrollments can
be handled with a single database scan and without affecting the existing clusters, which is
useful for clustering and indexing large databases.
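A minimal sketch of the single-scan leader clustering described above; `match` and
`similarity_threshold` are the same assumed matcher and user-specified threshold.

```python
def leader_clustering(database, match, similarity_threshold):
    """Leader clustering: an image joins the most similar existing leader if
    the match score clears the threshold, otherwise it becomes a new leader."""
    leaders = []                      # list of (leader_identity, leader_image)
    clusters = {}                     # leader_identity -> list of member identities
    for identity, image in database.items():
        scores = [(match(image, l_img), l_id) for l_id, l_img in leaders]
        best_score, best_leader = max(scores, default=(float("-inf"), None))
        if best_score >= similarity_threshold:
            clusters[best_leader].append(identity)
        else:
            leaders.append((identity, image))     # new leader / cluster representative
            clusters[identity] = [identity]
    return leaders, clusters
```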
This section proposes an efficient retrieval system to identify a query image. Fig. 2 shows
the proposed identification method. When a query image is presented to the identification
system, the technique retrieves the candidate identities that are similar to the query from
the clusters as well as from the 2D index table. Finally, the proposed system fuses the
candidate identities (evidences) of both strategies to achieve better performance.
Although there are other strategies, such as multi-biometrics [18], [19] (multi-sensor,
multi-algorithm, multi-sample, etc.), to retrieve multiple evidences for personal
identification, we want to make full use of the intermediate results produced while
computing the index code in order to reduce the computational cost. As Fig. 2 shows, when
computing the index code of a query image to identify the possible matches from the index
table (Candidate list 2), we obtain the set of match scores against the cluster
representatives. Using these scores we can also retrieve candidate identities as additional
evidence (Candidate list 1) from the selected clusters, namely those whose representative's
match score is greater than a threshold.
Let G = {g1, g2, ..., gk} be the set of clusters, and let Mq = {m(q,s1), m(q,s2), ..., m(q,sk)}
be the index code of a query image q, where m(q,si) is the match score of q against sample
image si. The retrieval algorithm uses (m(q,si), si) as an index into the index table and
retrieves all images (IidList) found at that location into a temporary list as images similar
to the query; in other words, we retrieve all images whose match score against the sample
image equals that of the query image. We also retrieve images from a predefined
neighborhood of the selected location into the temporary list, and give a vote to each
retrieved image. Further, we retrieve the images of cluster gi as similar to the query if
m(q,si) ≥ the similarity threshold, i.e. clusters whose representative is similar to the
query image are selected; the retrieved cluster images are stored in Candidate list 1. This
process is repeated for each match score value of the query index code. Next, we
accumulate and count the number of votes of each identity in the temporary list. Finally,
we sort all the individuals in descending order of the number of votes received and select
those whose vote score is greater than a predefined threshold into Candidate list 2.
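A minimal sketch of this retrieval step, reusing the structures from the previous sketches;
the neighbourhood size, similarity threshold and vote threshold values are illustrative
assumptions.

```python
from collections import Counter

def retrieve_candidates(query_code, table, clusters, leaders,
                        neighbourhood=2, sim_threshold=60, vote_threshold=2):
    """For each (score, sample) pair of the query's index code, vote for the
    identities stored in and around that cell of the index table (Candidate
    list 2); if the score also clears the similarity threshold, pull in the
    members of the corresponding cluster (Candidate list 1)."""
    votes = Counter()
    candidate_list1 = set()
    for j, score in enumerate(query_code):
        for s in range(score - neighbourhood, score + neighbourhood + 1):
            votes.update(table.get((s, j), []))            # temporary list + voting
        if score >= sim_threshold:
            leader_id = leaders[j][0]
            candidate_list1.update(clusters.get(leader_id, []))
    candidate_list2 = {idt for idt, v in votes.items() if v >= vote_threshold}
    return candidate_list1, candidate_list2
```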
The performance of uni-modal biometric systems may suffer due to issues such as limited
population coverage, lower accuracy, noisy data and matcher limitations [18]. To overcome
these limitations and improve performance, fusion of multiple pieces of biometric
information has been proposed. Fusion can be performed at different levels, such as the
data, feature, match score and decision levels. In this paper we use decision-level fusion:
the decision outputs (candidate identities) obtained from the cluster space and from the
index table are combined using a) the union of the candidate lists or b) the intersection of
the candidate lists.
The union fusion scheme combines the candidate lists of the individual techniques. This
scheme increases the chance of finding the correct identity even if it is not retrieved by
some of the techniques, i.e. the poor retrieval performance of one technique does not affect
the overall performance; however, it often increases the search space. With the intersection
fusion scheme, the final decision output is the intersection of the candidate lists of the
individual techniques. This type of fusion can further reduce the size of the search space;
however, the poor retrieval performance of one technique will affect the overall
performance of the system.
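A minimal sketch of the decision-level fusion of the two candidate lists.

```python
def fuse_decisions(candidate_list1, candidate_list2, scheme="union"):
    """Decision-level fusion: union keeps an identity retrieved by either
    strategy, intersection keeps only the identities retrieved by both."""
    set1, set2 = set(candidate_list1), set(candidate_list2)
    return set1 | set2 if scheme == "union" else set1 & set2
```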
4 Experimental Results
We evaluated our approach on the benchmark PolyU palm print database [10], which
consists of 7,752 gray-scale images, approximately 20 prints each of 386 different palms.
The first ten images of each palm are used to construct the database, while the other ten
are used to test the indexing performance. We segment the palm images to 151 × 151 pixels
and use Scale Invariant Feature Transform (SIFT) features [2] to compute the match scores
between images. Samples of segmented images from the PolyU database are shown in Fig. 3.
We evaluated the performance of the system with different sample set sizes and chose the
optimum value as one third of the database.
We analyze the retrieval time of our algorithm using big-O notation. Let q be the query
image, k the number of sample images chosen, and N the number of enrolled users in the
database. To retrieve the best matches for a query image, our algorithm
computes the match score of the query against each sample image and retrieves images a)
from the index table whose match scores against that sample image are near the query's,
and b) from the respective cluster of the sample image if its match score is greater than
the similarity threshold. This process takes O(1) time per sample image. Since there are k
sample images, the time complexity of our algorithm is O(k), whereas linear search
methods require O(N). Thus our approach takes less time than linear search, as k << N.
The proposed technique has been compared with the existing match score based indexing
techniques [15], [16]. The performance of the proposed technique against the technique in
[15] can be seen in Fig. 4(b); our algorithm operates at a lower PR than [15]. Furthermore,
the authors in [15] perform a linear search over the index space to retrieve the best
matches, which takes a considerable amount of time, i.e. O(N). Finally, the system in [16]
achieves only a maximum HR of 98.28% on the PolyU database. It can be inferred that the
proposed system enhances the indexing performance.
Fig. 4. HR vs. PR of different techniques on the PolyU database: (a) the proposed methods,
(b) comparison with Ref. [15]
5 Conclusions
In this paper, we propose a new clustering based indexing technique for identification in
large biometric databases. We compute a fixed-length index code for each biometric image
using the sample images, and propose an efficient storage and search method for the
biometric database using these index codes. We efficiently reuse the intermediate results of
the index code computation to retrieve multiple evidences, which improves the
identification performance without increasing the computational cost. Finally, the results
show the efficacy of our approach against state-of-the-art indexing methods. Our technique
is easy to implement and can be applied to any large biometric database.
References
1. Jain, A.K., Pankanti, S.: Automated fingerprint identification and imaging systems. In:
Advances in Fingerprint Technology, 2nd edn. Elsevier Science (2001)
2. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Jour-
nal on Computer Vision 60, 91–110 (2004)
3. Henry Classification System International Biometric Group (2003), http://
www.biometricgroup.com/HenryFingerprintClassification.pdf
4. Wu, X., Zhang, D., Wang, K., Huang, B.: Palmprint classification using principal lines.
Pattern Recognition 37, 1987–1998 (2004)
5. Boro, R., Roy, S.D.: Fast and Robust Projective Matching for Finger prints using Geome-
tric Hashing. In: Indian Conference on Computer Vision, Graphics and Image Processing,
pp. 681–686 (2004)
6. Mehrotra, H., Majhi, B., Gupta, P.: Robust iris indexing scheme using geometric hashing
of SIFT keypoints. Journal of Network and Computer Applications 33, 300–313 (2010)
7. Jain, A.K., Murty, M.N., Flynn, P.J.: Data Clustering: A Review. ACM Computing Sur-
veys 31(3) (1999)
8. Maltoni, D., Maio, D., Jain, A.K., Prabhakar, S.: Handbook of Fingerprint Recognition.
Springer (2003)
9. Mhatre, A., Palla, S., Chikkerur, S., Govindaraju, V.: Efficient search and retrieval in bio-
metric databases. Biometric Technology for Human Identification II 5779, 265–273 (2005)
10. The PolyU palmprint database, http://www.comp.polyu.edu.hk/biometrics
11. Bhanu, B., Tan, X.: Fingerprint indexing based on novel features of minutiae triplets. IEEE
Transactions on Pattern Analysis and Machine Intelligence 25, 616–622 (2003)
12. Kavati, I., Prasad, M.V.N.K., Bhagvati, C.: Vein Pattern Indexing Using Texture and Hie-
rarchical Decomposition of Delaunay Triangulation. In: Thampi, S.M., Atrey, P.K., Fan,
C.-I., Perez, G.M. (eds.) SSCC 2013. CCIS, vol. 377, pp. 213–222. Springer, Heidelberg
(2013)
13. Jayaraman, U., Prakash, S., Gupta, P.: Use of geometric features of principal components
for indexing a biometric database. Mathematical and Computer Modelling 58, 147–164
(2013)
14. Maeda, T., Matsushita, M., Sasakawa, K.: Identification algorithm using a matching score
matrix. IEICE Transactions in Information and Systems 84, 819–824 (2001)
15. Gyaourova, A., Ross, A.: Index Codes for Multibiometric Pattern Retrieval. IEEE Transac-
tions on Information Forensics and Security 7, 518–529 (2012)
16. Paliwal, A., Jayaraman, U., Gupta, P.: A score based indexing scheme for palmprint data-
bases. In: International Conference on Image Processing, pp. 2377–2380 (2010)
17. Weber, R., Schek, H., Blott, S.: A quantitative analysis and performance study for similarity
search methods in high-dimensional spaces. In: Proceedings of the 24th Very Large Data-
base Conference, pp. 194–205 (1998)
18. Ross, A., Nandakumar, K., Jain, A.K.: Handbook of Multibiometrics. Springer, New York
(2006)
19. Kumar, A., Shekhar, S.: Personal identification using multibiometrics rank-level fusion.
IEEE Transactions on Systems, Man and Cybernetics 41, 743–752 (2011)
Split-Encoding: The Next Frontier Tool for Big Data
1 Motivation
Immense amounts of data are generated by our day-to-day operations. As billions of devices are
connected to networks, the possibility of congestion increases substantially, requiring
new ways to reduce network traffic and enhance data storage capacity.
Gaming and medical images generate major peer-to-peer traffic on today's
Internet; in North America, 53.3% of network traffic is contributed by P2P
communication [13, 14]. Although countless techniques are available today to compress
and decompress data before transmission on the Internet, implementing various security
protocols and techniques adds additional traffic to the network, which affects its overall
functioning. In this paper, we propose a simple data encoding technique that resides on
top of existing data compression techniques and security protocols and is implemented
with a Split-protocol.
Critical medical conditions such as cardiac and neurological traumas require
complete information in the minimum amount of time so that a patient's life can be
saved [6]. As shown in Fig. 1, any medical event triggers the transmission of medical
data. The data servers (DSs) send the relevant data and images simultaneously to the
respective team members. The cardiologist will have all necessary information related
to cardiology, while the radiologist will receive all imaging data such as X-ray images, MRI,
CT scans, etc.
In this paper, we propose the implementation of the Split-encoding approach on
top of existing data compression techniques for faster transmission of medical images
and for securing database servers in the health information system. Physicians using
web browsers with no direct access to the PACS information system encounter
slower network speeds [15].
2 Introduction
To demonstrate the split-protocol concept, we describe web server and client
applications that are designed and implemented using the bare machine computing
(BMC) paradigm [18], also known as dispersed operating system computing (DOSC).
Bare PC applications are easier and friendlier to use for this demonstration. In the BMC
model, the operating system (OS) or kernel is completely eliminated, and application
objects (AOs) [3] carry their own network stack and drivers so that they can run independently on
a bare PC. All BMC applications are self-contained, self-managed and self-
executable, giving the application programmer sole control over the
applications and their execution environments. Prior work on the BMC concept has been
presented in [1-7] and [12, 17, 19].
The rest of the paper is organized as follows. Section 3 discusses related work.
Section 4 describes design and implementation. Section 5 outlines the Split-encoding
schemes. Section 6 presents experimental results, and Section 7 contains the
conclusion.
3 Related Work
The TCB table is the key system component in the client and server designs. A
given entry in this table maintains the complete state and data information for a given
request. The inter-server packet (ISP) is built from this entry and shipped to a DS
when a GET message arrives from the client. We use only 168-byte ISP packets to
transfer the state of a given request [4].
As a proof of concept, we consider a partial encoding technique for a 4.71 GB JPG file,
which is stored in memory as a collection of binary 0s and 1s.
In the first step, we divide the whole chunk of data into 64-bit strings,
giving 2^64 possible string combinations.
We selectively assign string numbers: S1 is assigned to 1111111111111…11 (all 64 ones),
S2 to 1111111111111…10 (63 ones followed by a zero), S3 to 1111111111111…100
(62 ones followed by two zeros), and so on up to #16777216 (the number of possible
combinations of a 24-bit number). Considering that our data are generated
using the 128-character ASCII set, the total number of combinations of an 8-byte string is
128^8 = 72057594037927936. It is worth mentioning that these strings can be formed with
repeated characters (e.g., aaaaaaaa). However, if non-printable and rarely
used characters are reasonably removed from the 128-character ASCII set, the total
number of possible 8-byte string combinations can be reduced to 96^8. We limit
the number assigned to any string to a maximum of 16777216, while the total
number of combinations of any 64-bit data block is approximately 96^8 = 7213895789838336.
If a 64-bit combination does not match any of our selected 16777216 string
combinations, we transmit those 64 bits as-is, without encoding. Since real-life
communication uses certain fixed patterns based on language and numbering systems,
the probability of non-occurrence decreases and the probability of occurrence increases
for a particular pattern. Whether the information is encoded or ordinary is indicated by a
special bit in the DM1 packet (the DM1 packet is 168 bytes long) [4]. Regular
(non-encoded) data is supplied by the CS along with the sequence number of the
particular piece of 64-bit data. However, this situation occurs very rarely.
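A minimal sketch of this partial encoding idea, assuming the selected 16777216 patterns are held in a codebook that maps each 8-byte pattern to a 24-bit string number; the codebook contents below are placeholders, not the actual assignment used in the paper.

def encode_chunks(data, codebook):
    # split the data into 8-byte (64-bit) chunks and encode known patterns
    out = []
    for i in range(0, len(data), 8):
        chunk = data[i:i + 8]
        code = codebook.get(chunk)
        if code is not None:
            out.append(("E", code.to_bytes(3, "big")))   # 24-bit string number
        else:
            out.append(("R", chunk))                     # transmitted as-is
    return out

# placeholder codebook with a handful of the selected patterns
codebook = {b"\xff" * 8: 1, b"aaaaaaaa": 2, b"        ": 3}
print(encode_chunks(b"aaaaaaaa" + b"zzzzzzzz", codebook))
# [('E', b'\x00\x00\x02'), ('R', b'zzzzzzzz')]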
The Ethernet MTU is 1500 bytes and can carry a payload of up to 1460 bytes.
Transferring the 4.71 GB file therefore requires 3463921 Ethernet frames (packets). Each frame
number corresponds to a chronological sequence of string numbers. We assign a unique 24-bit
string number to each 64-bit data string of 1s and 0s (keeping in mind that we are
only encoding part of the data).
In every individual Ethernet frame, we can send up to 486 unique string (pattern)
numbers (within the 1460-byte payload). We can make DS1 send the odd-numbered frames and DS2
the even-numbered frames (i.e., DS1 sends frames #1, 3, 5, … and DS2 sends
frames #2, 4, 6, …).
Based on the frame sequence number, the receiving end arranges all Ethernet
frames. Since each frame contains 486 strings of 8 bytes each, decoding a frame yields
486 × 8 = 3888 bytes of data, whereas a normal frame carries 1460 bytes of data
(excluding the 20-byte IP header and 20-byte TCP header). So each encoded data
transaction carries 2.66 times more data than a normal data transaction. In other words, we
send only 38% of the frames, which reduces network traffic by 62%.
In our experiments we typically encountered very few non-encoded data strings.
However, when the data size increases into the terabyte range, the probability of
encountering non-encoded strings is higher.
Let us compute the probabilities of occurrence of two strings #1 and #2.
Assume the probabilities of occurrence of strings #1 and #2 are P(1) and P(2)
respectively.
Neither the probability of string #1 nor that of string #2 is affected by the occurrence
of the other event, so the two events are independent.
Therefore, the conditional probabilities of occurrence for strings #1 and #2 are as
follows:
P(1|2) = P(1), which is the same as
P(2|1) = P(2)
Similarly, the joint probability of occurrence of strings #1 and #2 is:
P(1 ∩ 2) = P(1) × P(2). (1)
Based on the above, the occurrence of any string does not affect the
occurrence of any other string. The total number of possible values of a 64-bit
piece of data is 96^8 = 7213895789838336, so the probability that any one piece of
64-bit data is assigned a number is
P = (2^24) / (96^8) = 2.32568e-9. (2)
The probability that any one piece of 64-bit data is NOT assigned a number is therefore
NP = 1 - P = 1 - 2.32568e-9 = 0.999999998. (3)
Consider the given set of strings S1, S2, …, Sr (r = 16777216), each of length 64 bits;
the r strings occur independently of each other in a random fashion. Each 64-bit string is
made of m substrings of length k (m = 8, k = 8 bits), and the substrings also occur
independently of each other and in a random fashion. Under this double independence
model, we can estimate the probability that there is a string T of length k that is a
substring of each of S1, S2, …, Sr.
For the proof of concept we took the 4.71 GB JIF data file, which contains 632165499 strings (64
bits per string). The probability of occurrence of any one string is 2.32568e-9.
Most of the time, the data we use falls within those 16777216 string
numbers, and in our data block many strings appear repeatedly. Transferring the 4.71 GB
file requires 3463921 Ethernet frames if a frame carries 1460 bytes of data.
We estimate the numbers of encoded frames x and non-encoded frames y as follows:
3888·x + 1460·y = 4.71 × 1024 × 1024 × 1024
x/y = 124820/138575
So the number of encoded frames is x = 918029 and the number of non-encoded frames is y = 1019195.
In a typical experiment we found that 918370 frames were encoded and 1019400
were non-encoded, i.e., both encoded and non-encoded frames were present in the
received data. Transferring the entire file took only 875 seconds compared to 1400
seconds with our previous MC/MS results, a reduction of almost 35.6% in transmission
time and 60% in network traffic.
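The two equations above can be solved directly for x and y; a small sketch of that calculation, taking 4.71 GB literally as 4.71 × 1024³ bytes as in the text:

# estimate encoded (x) and non-encoded (y) frame counts from
#   3888*x + 1460*y = total_bytes   and   x/y = 124820/138575
total_bytes = 4.71 * 1024**3
ratio = 124820 / 138575              # x = ratio * y
y = total_bytes / (3888 * ratio + 1460)
x = ratio * y
print(round(x), round(y))            # about 918 thousand and 1.02 million frames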
// Binary/decimal conversion helpers using java.math.BigInteger;
// isNumber(s, radix) is assumed to be a validation helper defined elsewhere.
// convert from binary to decimal
public static String convertBinToDec( String s )
{
    // check that s is a valid base-2 number
    if ( isNumber(s, 2) == false ) return "-1";
    BigInteger bi = new BigInteger(s, 2);
    return bi.toString();
}
// convert from decimal to binary
public static String convertDecToBin( String s )
{
    // check that s is a valid base-10 number
    if ( isNumber(s, 10) == false ) return "-1";
    BigInteger dec = new BigInteger(s, 10);
    return dec.toString(2);
}
BigDec2Bin conversion method [20].
When we scan a 64-bit string, there are only two possibilities: either it exists in our
database with a corresponding string number or it does not. These events are
independent, and if one of the two outcomes is defined as a success with probability p,
then the probability of exactly x successes out of N trials (events) is given by the
binomial distribution:
P(x) = [N! / (x! (N − x)!)] · p^x · (1 − p)^(N − x)
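As an illustration only, a short sketch of evaluating this binomial probability numerically; the trial count and probabilities below are placeholders, not values from the experiment.

from math import comb

def binomial_pmf(x, n, p):
    # probability of exactly x successes in n independent trials
    return comb(n, x) * (p ** x) * ((1 - p) ** (n - x))

print(binomial_pmf(380, 1000, 0.4))   # placeholder values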
5 Experimental Results
The experimental setup involved a prototype server cluster consisting of Dell
OptiPlex 790 PCs with an Intel Core(TM) i5-2400 CPU @ 3.10 GHz, 8 GB
RAM, and an Intel 1G NIC on the motherboard. All systems were connected to a
Linksys 16-port 1 Gbps Ethernet switch. Bare PC clients were used to stress-test the
servers; the bare PC web clients, capable of generating 5700 requests/sec, were used
to create the workload.
Fig. 2 presents the comparison between legacy BitTorrent and regular server
systems. It can be seen that in this configuration, the TCP/HTTP transaction time of the
Split-encode system with five DS servers offers data transmission a factor of six faster
than legacy BitTorrent. This is a significant reduction in data transmission
time.
Fig. 3 illustrates the image retrieval time for a 7.1 MB CR chest file for a system
of 3 servers. We notice that there is little or no significant improvement over the
MC/MS system. This means that for smaller file sizes Split-encoding does not offer any
advantage over other techniques.
Fig. 3. Image retrieval times for the 7.1 MB CR chest file for a system of 3 servers [6]
As shown in Fig. 4, MC/MS or Split-encode does not add any advantage over the other
techniques and is not suitable for smaller data sizes.
Fig. 4. Comparison of query/retrieval times for the first image from the local grid node and web
PACS using ARAM [20]
6 Conclusion
References
1. Rawal, B., Karne, R., Wijesinha, A.L.: Splitting HTTP Requests on Two Servers. In: Third
International Conference on Communication Systems and Networks (2011)
2. Rawal, B., Karne, R., Wijesinha, A.L.: Insight into a Bare PC Web Server. In: 23rd
International Conference on Computer Applications in Industry and Engineering (2010)
3. Karne, R.K.: Application-oriented Object Architecture: A Revolutionary Approach. In:
6th International Conference, HPC Asia (2002)
4. Rawal, B., Karne, R., Wijesinha, A.L.: Split Protocol Client Server Architecture. In: 17th
IEEE Symposium on Computers and Communication (2012)
5. Rawal, B., Karne, R., Wijesinha, A.L.: A Split Protocol Technique for Web Server
Migration. In: 2012 International Workshop on Core Network Architecture and Protocols
for Internet (2012)
6. Rawal, B.S., Berman, L.I., Ramcharan, H.: Multi-Client/Multi-Server Split Architecture.
In: The International Conference on Information Networking (2013)
7. Rawal, B.S., Phoemphun, O., Ramcharn, H., Williams, L.: Architectural Reliability of
Split-protocol. In: IEEE International Conference on Computer Applications Technology
(2013)
8. Yang, C.-T., Lo, Y.-H., Chen, L.-T.: A Web-Based Parallel File Transferring System on
Grid and Cloud Environments. In: International Symposium on Parallel and Distributed
Processing with Applications (2010)
9. NVIDIA. Nvidia Cuda Compute Unified Device Architecture. Programming Guide v. 2.0
(2008)
10. Bhadoria, S.N., Aggarwal, P., Dethe, C.G., Vig, R.: Comparison of Segmentation Tools
for Multiple Modalities in Medical Imaging. Journal of Advances in Information
Technology 3(4), 197–205 (2012)
11. Alagendran, B., Manimurugan, S.: A Survey on Various Medical Image Compression
Techniques. International Journal of Soft Computing and Engineering 2(1) (2012)
12. Rawal, B., Ramcharan, H., Tsetse, A.: Emergent of DDoS Resistant Augmented Split
Architecture. In: IEEE 10th International Conference HONET- CNS (2013)
13. Lua, E.K., Crowcroft, J., Pias, M., Sharma, R., Lim, S.: A Survey and Comparison of
Peer-to-Peer Overlay Network Schemes. IEEE Journal of Communications Surveys &
Tutorials 7(2), 72–93 (2005)
14. http://torrentfreak.com/bittorrent-still-dominates-global-internet-traffic-101026/
15. Koutelakis, G.V., Lymperopoulos, D.K.: A grid PACS architecture: Providing data-
centric applications through a grid infrastructure. In: Proceedings of the 29th Annual
International Conference of the IEEE EMBS (2007)
16. https://code.google.com/p/snappy/
17. Karne, R.K.: Application-oriented Object Architecture: A Revolutionary Approach. In:
6th International Conference, HPC Asia (2002)
18. [Online] http://vassarstats.net/binomialX.html
19. Ford, G.H., Karne, R.K., Wijesinha, A.L., Appiah-Kubi, P.: The Performance of a Bare
Machine Email Server. In: Proceedings of 21st International Symposium on Computer
Architecture and High Performance Computing, pp. 143–150 (2009)
20. Ramesh, S.M., Shanmugam, A.: Medical image compression using wavelet
decomposition for prediction method. International Journal of Computer Science and
Information Security 7(1) (2010)
Identification of Lost or Deserted Written Texts
Using Zipf’s Law with NLTK
Devanshi Gupta, Priyank Singh Hada, Deepankar Mitra, and Niket Sharma
1 Introduction
Dealing with Natural Language Processing (NLP) is a fascinating and active research
field. With the help of the Natural Language Toolkit (NLTK) module
available in Python, graphical analysis of natural languages has also become possible.
The graphical user interface present in NLTK helps to plot and study graphs to
understand the results of natural language processing in a much better and more efficient
way. NLTK suits both linguists and researchers, as it offers enough theory as well as
practical examples in the NLTK book itself [1], [2]. NLTK is an extensive
collection of documentation, corpora and a few hundred exercises, making it a large
framework that provides a better understanding of natural languages and their
processing. NLTK is entirely self-contained, providing raw as well as annotated text
versions and simple functions to access them [3], [4].
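As a brief illustration of the kind of access NLTK provides, a minimal sketch using the Gutenberg corpus (it assumes the corpus data has already been fetched with nltk.download('gutenberg')):

import nltk
from nltk.corpus import gutenberg

# list the texts bundled in NLTK's Gutenberg corpus
print(gutenberg.fileids())

# word frequencies for one text, lower-cased and restricted to alphabetic tokens
words = [w.lower() for w in gutenberg.words('austen-sense.txt') if w.isalpha()]
fdist = nltk.FreqDist(words)
print(fdist.most_common(10))   # the most frequent words and their counts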
Moving to Zipf's law: it is based on a power-law distribution which states that, given a
corpus of natural language utterances, the frequency of any word in the corpus is
inversely proportional to its rank in the resulting frequency table. It implies that the
most frequently occurring word will occur
approximately twice as often as the second most frequently occurring word in
the same corpus, and so on. In its general form, Zipf's law studies how frequency is
influenced by rank, where rank plays the role of the independent variable
and frequency can be regarded as the dependent variable. It is observed that
randomly generated distributions of word frequencies in texts are quite similar to
Zipf's law: the occurrence frequency of any word follows an inverse power law of its
rank, and the exponent of the inverse power law is found to be very close to 1,
because the transformation from a word's length to its corresponding rank stretches
an exponential behavior into power-law behavior [5].
In this paper, Zipf's law has been applied to the corpora available in the
NLTK module in Python in order to generate plots. Further, these plots have been
analyzed to show how a particular style of writing influences a writer's
usage of words and grammar in the related context. The frequency of
words in various texts has been calculated and the results have been obtained as
graphical plots, which helps to show the similarity between
various texts by the same writer/author.
2 Background Theory
Many independent works have already been done using either NLTK or Zipf's law,
but the combination of the two has so far made no considerable contribution to
natural language processing.
Several works in NLP have fused two parsers together
[6]. In that paper the authors describe how, in the present scenario where natural
language processing applications and implementations are widely available,
programmers with no linguistic knowledge are able to build specific
NLP linguistic systems. Two parsers were fused together to achieve more accuracy
and generality, and were tested on a corpus, showing a higher level of accuracy in the
results.
Other previous works with NLTK focus on computational semantics
software and on how easy it is to do computational semantics with NLTK [7]. That paper
shows how Python gives an advantage to students who have not been exposed to any
programming language, when the merits of Prolog and Python are compared.
Zipf's law was introduced by George Kingsley Zipf, who proposed it in 1935 and
1949. It is an empirical law formulated using mathematical statistics, and many
kinds of data studied in physics as well as the social sciences can be aptly
approximated by a Zipfian distribution, which belongs to the power-law family of
discrete probability distributions.
On the other hand, Zipf-like distributions have been observed in various
real-time and Internet analyses; for example, an analysis has been done to check whether the revenue
of the top 500 firms of China, by rank and frequency, follows a Zipf-like
distribution with an inverse power law, where the slope is found to be close to 1 [8].
3 Motivation
Significant work has been done independently using NLTK or Zipf's law. This work
has tried to combine the two: NLTK has been used to obtain some of the
texts of Project Gutenberg for the analysis, and Zipf's law has helped to plot and
study those analyses.
Project Gutenberg is an electronic text archive which preserves cultural texts and
is freely available to all; a small selection of its texts has been included in NLTK,
creating a Gutenberg Corpus, which is a large body of text [9]. To see the distribution
of words and the frequency of words used, Zipf's law has been applied
to some of the texts of the Gutenberg Corpus in NLTK. The Gutenberg Corpus of
NLTK contains three texts by Chesterton, namely Chesterton-ball, Chesterton-brown
and Chesterton-Thursday, and three by Shakespeare, namely Shakespeare-hamlet,
Shakespeare-Macbeth and Shakespeare-Caesar, amongst other texts.
These are texts written and composed by the same author/writer. The
word types and their frequencies of occurrence in each text, singly and comparatively, have
been checked using the NLTK module, and using Zipf's law a comparative graph has been
plotted to analyze the results of the research.
Here it is attempted to see how much the writing is influenced by, and varies
according to, the speech and the types of words used by a particular author/writer.
Writing style, word usage, word frequency, adjectives, and the way phrases are
presented all differ and vary from person to person. Even one ordinary person writes
essays and texts differently from another.
It often happens that a text is kept or preserved anonymously;
sometimes this is done deliberately at the author's request, but sometimes it is not.
Identifying such valuable or cultural texts becomes important so that their
authorship can be kept intact. This can be done by employing Zipf's law on the text,
counting the words and their corresponding frequencies in the text, and then
plotting a comparative graph against some identified text that is
thought to be written by the same author as the unidentified one.
The results can then show whether the text matches the assumed author
or not.
A number of corpora from NLTK have been studied, and various comparison
results have been plotted to study Zipf's law and its scope and slope over natural
languages and words from daily usage of various languages, but mainly English. As a
first step, the frequency distribution is calculated and stored in a file for nine
corpora from NLTK's suite, namely the Reuters Corpus, Project
Gutenberg, Movie Reviews Corpus, State Union Corpus, Treebank, Inaugural
Corpus (USA), Webtext Corpus, NPS Chat Corpus, and CESS Corpus. The frequency
of occurrence of each word and its rank are also written to a file, and finally a
comparison plot is generated among the various corpora and texts of NLTK.
4 Getting Started
Zipf's law is explained as follows. Let f(w) be the frequency of a word w in a free text. It is
supposed that the most frequently occurring word is ranked one, i.e., all words in the
text are ranked according to their frequency of occurrence. According to
Zipf's law, the frequency of occurrence of a word of any type is inversely proportional to
its rank:
f ∝ 1/r
f = k/r
f · r = k
where k is a constant.
When the frequency of an event varies as a power of some
attribute of that event, the frequency is said to follow a power law. The power law is
significant here in the sense of the logarithmic scale, because the values involved are very
large. So, to plot these values on a scale, the power-law relationship is shown on
log-scaled x-y axes.
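On log-log axes the relation f = k/r becomes a straight line, log f = log k − log r, so the exponent can be checked with a simple linear fit; a minimal sketch with a hypothetical frequency list:

import numpy as np

# hypothetical word frequencies, already sorted from most to least frequent
freqs = np.array([1000, 510, 330, 248, 205, 168, 140, 122, 110, 101])
ranks = np.arange(1, len(freqs) + 1)

# fit log f = log k - s*log r; a Zipfian text gives a slope s close to 1
slope, log_k = np.polyfit(np.log(ranks), np.log(freqs), 1)
print(-slope)   # estimated Zipf exponent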
First, the words in the text are counted and the number of times
each word appears in the given text is calculated. Each word in each line is read, stripped
of leading and trailing whitespace, and converted to
lowercase to make things more manageable. A dictionary is maintained to
store all the words with their corresponding frequencies.
The next step is assigning a rank to each word. The rank of a value is one plus the
number of higher values; for example, if an item x is the fourth highest value in the
list, its rank is four, meaning there are three other items whose values are higher than x.
Therefore, to assign a rank to each word by its frequency, we need to know how many words
have a higher frequency than that word. Grouping is done first, i.e., the
words are grouped according to their frequency. Boxes are taken and each box is labeled
with a frequency, i.e., all words appearing three times in the corpus are put in
the box labeled 3, and so on. To check how many words have a
higher frequency than a given word, the label of the box to which the
word belongs is checked, and the numbers of words in the boxes with
larger labels are added up.
Having put all words with the same frequency into the box for that frequency, the
rank of a word is determined as follows: find the box it belongs to, identify all boxes
that hold words with higher frequencies, add up the number of words in those boxes,
and finally add one to the resulting sum. To put all words with the same frequency into
a box, another dictionary is used; the dictionary created earlier is used to look up
the frequency of each word, and a rank function is created with three parameters: the
word, the frequency of the word, and the box dictionary grouping words by frequency.
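A minimal sketch of the counting, boxing and ranking procedure described above (the input text is a placeholder):

from collections import defaultdict

text = "the cat sat on the mat the cat slept"      # placeholder text

# 1. count word frequencies (lower-cased, whitespace stripped)
freq = defaultdict(int)
for line in text.splitlines():
    for word in line.split():
        freq[word.strip().lower()] += 1

# 2. group words into boxes labeled by their frequency
boxes = defaultdict(list)
for word, f in freq.items():
    boxes[f].append(word)

# 3. rank = 1 + number of words sitting in boxes with a larger label
def rank(word, f, boxes):
    higher = sum(len(words) for label, words in boxes.items() if label > f)
    return higher + 1

for word, f in sorted(freq.items(), key=lambda item: -item[1]):
    print(word, f, rank(word, f, boxes))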
It is also observed that many real systems do not show true power-law behavior,
because they are either incomplete or inconsistent with the
conditions under which power laws are expected to emerge. In general, Zipf's law
does not hold for subsets of objects or events, nor for a union or combination of Zipfian sets.
Missing elements produce deviations from a pure Zipfian distribution in the
subset. The fitted line is sometimes poor for the highest-rank and lowest-rank
words, which are overestimated and underestimated respectively.
5 Result Analysis
First, Zipf's law is applied to text 2 (Sense and Sensibility by Jane Austen, 1811)
and text 4 (the Inaugural Address Corpus) of NLTK. It is found that Zipf's law is obeyed
with some deviations at the extremes, and there are quite a few
similarities between these two texts (Fig. 1(a)): the way in which the two texts are written
and the words used and their frequencies are quite similar, and in fact the two curves
intersect near the lower middle, signifying that some of the word usage is the same.
Similarly, applying Zipf's law to text 2 (Sense and Sensibility by Jane Austen,
1811) and text 8 (the Personals Corpus) of NLTK, it is found that there are no similarities
between the words of these two corpora (Fig. 1(b)): the word-usage frequencies are
not similar, and a considerable gap is maintained between the two texts.
From the result analyses and comparison of these three texts of NLTK, it is found
that text 2 is quite similar to text 4 while having
absolutely no similarity with text 8; all the texts deviate at the extremities due
to underestimated or overestimated values at the two extremes.
Fig. 1. (a) Comparison results of text 2: Sense and Sensibility by Jane Austen 1811 and text 4:
Inaugural Address Corpus of NLTK, (b) Comparison results of text 2: Sense and Sensibility by
Jane Austen 1811 and text 8: Personals Corpus of NLTK
We now analyze the results for the main objective, i.e., to see how well the approach works
for the sample texts by Shakespeare and Chesterton, two great authors, and how a
similarity in their existing texts can be used to test some valuable unidentified or
anonymous text; these texts and these two authors are taken simply to check the results.
When only the words of the two texts composed by Chesterton, i.e., Chesterton Ball
and Chesterton Brown, are compared (no frequency count is considered), a graph with an
almost similar and overlapping pattern is generated up to the lower middle, deviating a
little in the later part (Fig. 2(a)), which shows that the words used are quite similar.
When the comparison of these two texts is done according to the frequency of
occurrence of words, it is found that the graphs overlap completely
except for a few words at the lower extremity. This means that the frequency of word
usage is very similar in these two texts by the same writer/author (Fig. 2(b)).
Fig. 2. (a) Comparison of two different texts written by Chesterton (Brown and Ball), (b)
Comparison according to the frequency of words in the texts Ball and Brown by Chesterton
When all three texts by Chesterton available in Project Gutenberg, namely Ball,
Brown and Thursday, are compared, it is seen that in the
composition Thursday the frequency of the words used is a little lower, though the pattern
followed is similar; at the lower extremity, the word frequencies of Thursday
match those of Brown. Here, too, the frequency of occurrence of words is
compared (Fig. 3).
Fig. 3. Comparison according to the frequency of words in the texts Ball, Brown and Thursday
by Chesterton
In the figure below, when the frequencies of occurrence of words in the two compositions by
Shakespeare, i.e., Shakespeare_Macbeth and Shakespeare_Caesar, are compared, it is noticed
that the word usage is almost similar in both texts. The graph also does not touch the
axis at the lower end, which means that there are no words in the texts that are used only
negligibly or with very low frequency; hence the usage of words is distributed in a
certain pattern (Fig. 4(a)). The words and their frequency counts are used for the analysis.
When all three plays by Shakespeare available in NLTK's Project Gutenberg, namely
Caesar, Macbeth and Hamlet, are included to test the results on the basis
of the frequency of occurrence of words, it is noticed that the graph for Hamlet
touches the axis at the lower end, signifying that some words in Hamlet are
used very infrequently. It is also
noticed that the graph for Hamlet follows a similar pattern to that of the other two
plays, but the words used in Hamlet have a higher frequency (Fig. 4(b)). This means that the
pattern followed by a particular writer is similar, though the frequency of word usage
may vary.
Fig. 4. (a) Comparison according to the frequency of occurrence of words with their
corresponding ranks in the texts Caesar and Macbeth by Shakespeare, (b) Comparison according
to the frequency of occurrence of words with their corresponding ranks in the texts Caesar, Hamlet
and Macbeth by Shakespeare
When the texts by Chesterton (here Ball and Brown) and the compositions by
Shakespeare (here Caesar and Macbeth) are compared with each other, a considerable
gap is found in the usage of words and in the frequency with which they appear and are
used by the respective authors (Fig. 5). A consistent gap is maintained between the
writing patterns of different writers, and the frequency with which they tend to use
words in their texts and compositions follows a certain kind of pattern.
Fig. 5. Comparison according to the frequency of occurrence of words with their corresponding
ranks in the texts Caesar and Macbeth by Shakespeare and Ball and Brown by Chesterton
From the above results it is concluded that the writing pattern and the usage
of words differ from person to person, which can support significant findings and
further research. The result in Fig. 5 shows how the compositions of two
people vary and how the frequency of word usage differs, producing a considerable gap
in the graph.
These results can also help in understanding similar patterns that are followed in many
other settings, such as the populations of the largest cities in the world, the growth in
revenue of an organization over a period of time, or the progress of students over a year;
with Zipf's law, results can be generated and analyzed to see whether the expected pattern
is achieved or not. It may also help in real-world aspects around us where such results play
a significant role, such as tracking the progress of students in a school.
Similarly, it is sometimes very important to judge the authorship of an anonymous text or
composition, which can be done through this measure. Although it is also seen that Zipf's law
is not followed exactly here, and this could be called a failure of Zipf's law, the concept can
still be used to determine and identify the authorship of texts. Moreover, the same idea can be
applied to various other areas, such as intrusion detection, as a future application: a known
pattern of intrusion can be detected and saved, and if any deviation from the existing pattern
is found, it can be said that there might be an intrusion in the organization.
References
1. Bird, S., Klein, E., Loper, E.: Natural Language Processing with Python: Analyzing Text
with the Natural Language Toolkit. O’Reilly Media (2009)
2. Lobur, M., Romanyuk, A., Romanyshyn, M.: Using NLTK for Educational and Scientific
Purposes. In: 11th International Conference on The Experience of Designing and
Application of CAD Systems in Microelectronics, pp. 426–428 (2011)
3. Abney, S., Bird, S.: The Human Language Project: Building a Universal Corpus of the
World’s Languages. In: Proceedings of the 48th Annual Meeting of the Association for
Computational Linguistics, pp. 88–97 (2010)
4. Li, W.: Random Texts Exhibit Zipf’s-Law-Like Word Frequency Distribution. IEEE
Transactions on Information Theory 38(6), 1842–1845 (1992)
5. Rahman, A., Alam, H., Cheng, H., Llido, P., Tarnikova, Y., Kumar, A., Tjahjadi, T.,
Wilcox, C., Nakatsu, C., Hartono, R.: Fusion of Two Parsers for a Natural Language
Processing Toolkit. In: Proceedings of the Fifth International Conference on Information
Fusion, pp. 228–234 (2002)
6. Garrette, D., Klein, E.: An Extensible Toolkit for Computational Semantics. In: Proceedings
of the 8th International Conference on Computational Semantics, pp. 116–127 (2009)
7. Chen, Q., Zhang, J., Wang, Y.: The Zipf’s Law in the Revenue of Top 500 Chinese
Companies. In: 4th International Conference on Wireless Communications, Networking and
Mobile Computing, pp. 1–4 (2008)
8. Shan, G., Hui-xia, W., Jun, W.: Research and application of Web caching workload
characteristics model. In: 2nd IEEE International Conference on Information Management
and Engineering (ICIME), pp. 105–109 (2010)
9. Project Gutenberg Archive, https://archive.org/details/gutenberg
An Efficient Approach for Discovering
Closed Frequent Patterns in High Dimensional
Data Sets
1 Introduction
Progress in bar-code, retail and online shopping technology has made it very simple
for retailers to collect and store massive amounts of sales data. Such transaction
records typically consist of the date and the items bought by the customers. In recent
years, many multinational retail organizations have come to view such databases as
important assets for marketing. Researchers are keen to develop information-driven
marketing strategies with the help of these databases, which help marketers develop
customized marketing strategies [10]. As a recent example, electronic health records (EHRs)
have the potential to improve the delivery of healthcare services effectively and
efficiently, especially if they are implemented and used smoothly; this can be achieved
by maintaining an exact problem definition and up-to-date data [1]. As
technology advanced in the fields of bioinformatics and e-commerce, various
gene expression data sets and transactional databases have been produced. Data
sets of extremely high dimensionality pose challenges for efficient computation to
various state-of-the-art data mining algorithms. Examples of such
high dimensional data are microarray gene data, network traffic data, and shopping
transaction data. Such data have a very large number of attributes (genes)
compared to the number of instances (rows), and therefore require special and
innovative data mining techniques to discover interesting hidden patterns. For
example, classification and clustering [15] techniques have been applied to such data,
but with slower processing, lower accuracy, or outright failure to handle the data.
Association rule mining was proposed by P. Cheeseman et al. [9]. An example of an
association rule might be that 97% of patients suffering from dyspepsia and epigastric
pain also commonly experience heartburn [1], [3].
Data mining is the science of finding hidden patterns and useful information in raw
data gathered from different sources. It is a combination of different streams including
machine learning, data warehousing, data collection, etc., and it helps in the extraction
of interesting patterns or knowledge from huge amounts of data. Many techniques are
used for this purpose: association rule discovery, sequential pattern discovery, cluster
analysis, outlier detection, classifier building, data cube/data warehouse construction,
and visualization.
A set of items that appears in many of the transactions is said to be frequent.
If I = {i1, i2, ..., ik} is a set of items, then the support of I is the total number of
transactions in which I appears as a subset. Mining frequent item sets is a concept quite
different from similarity search: in similarity search one looks for items that have the
maximal number of matches in common, even if the total number of supporting
transactions is very small. This large difference leads to the development of new types
of techniques for retrieving the highly frequent item sets.
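A minimal sketch of computing the support of an item set over a transaction database, following the definition above; the transactions shown are placeholders.

def support(itemset, transactions):
    # number of transactions that contain the item set as a subset
    target = set(itemset)
    return sum(1 for t in transactions if target <= set(t))

# placeholder transaction database
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "beer"},
    {"milk", "bread", "beer"},
]
print(support({"milk", "bread"}, transactions))   # 3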
The most suitable areas for frequent pattern mining in high dimensional data are
customer relationship management (CRM) [11], web page search and analysis,
geographical data analysis, drug synthesis and disease prediction [1]. In
this paper, the terms attribute, feature and column are used interchangeably; in
general, they are the characteristics required to represent a particular instance of the
data set.
2 Related Work
ARM works on items Ii that belong to one or more transactions Ti [11]. In
the e-commerce setting the items might be products sold online from a store,
and the transactions may be the purchase records of that store. In this
characterization, each transaction is a set of items bought together. The standard
framework for association rule mining (Apriori) uses two measures, support
and confidence, as constraints that reduce the search problem [1], [11], [15].
Using this conception, an association rule A → B is a pair of items (or sets of
items) that are related to each other based on their frequency of co-occurrence in
transactional databases. As an example, a hypothetical association
rule might be insulin → diabetes with support 450 and confidence 85%. The rule
indicates that 450 patients had insulin and had diabetes on their
problem list, and that 85% of patients with insulin on their list also had diabetes
on their problem list [1]. This method can be used to discover disease-disease,
disease-finding and disease-drug co-occurrences in medical databases [7], [8].
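A small sketch of how the support and confidence of a rule A → B are computed from a transaction database under the standard definitions; the tiny record set below is a placeholder, not data from [1].

def rule_stats(antecedent, consequent, transactions):
    # support count of A∪B and confidence of the rule A -> B
    a = set(antecedent)
    ab = a | set(consequent)
    count_a = sum(1 for t in transactions if a <= set(t))
    count_ab = sum(1 for t in transactions if ab <= set(t))
    confidence = count_ab / count_a if count_a else 0.0
    return count_ab, confidence

# placeholder patient records
records = [{"insulin", "diabetes"}, {"insulin", "diabetes", "hypertension"},
           {"insulin"}, {"diabetes"}]
print(rule_stats({"insulin"}, {"diabetes"}, records))   # (2, 0.666...)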
Over the last twenty years, closed frequent pattern mining [1], [2], [4], [5], [11]
has emerged as a vast research area that attracts a significant amount of
interest from researchers. Due to the huge number of features, the existing
techniques produce many redundant frequent patterns, as discussed
above. To condense the frequent patterns to a compact size, mining frequent
closed patterns has been proposed. In the literature, we have found the following
advanced methods for mining closed frequent patterns. Earlier data mining
techniques could not tackle such data efficiently because of its high dimensionality.
Column enumeration (over features) and row enumeration (over instances) are
basically the two types of frequent pattern mining algorithms [1], [4], [5], [6]. Close
[14] and Pascal [12] are techniques which find closed patterns by performing
breadth-first column enumeration. The Apriori-based technique Close
[14] was proposed to determine closed item sets. The problem and limitation
of these two techniques is that, due to the level-wise approach, the number of feature
sets enumerated becomes extremely large when they are run on datasets
with a large number of features, such as biological datasets. CLOSET [3] is known
for its depth-first-search column enumeration technique for finding closed frequent
patterns. For a compressed representation it uses the FP-tree data structure, but it was
unable to handle long microarray datasets, firstly because of the FP-tree's inability to
compress long rows effectively and secondly because of too many possible combinations in
column enumeration. Recently CLOSET+ [6] was proposed as an enhancement
of the former. Another algorithm, CHARM, works similarly to CLOSET
but stores the data in a vertical format.
Consider bioinformatics data, which has long feature sets
and very few instances; the performance deteriorates due to the long feature
permutations. To tackle such difficulties, a new approach based on
row enumeration, Carpenter [2], has been developed as an alternative for handling
microarray data sets. Carpenter has its own limitation:
it considers a single enumeration only, which is unable to reduce the data
if the data set has both a high number of features and a high number of rows. To
handle such problems, a new algorithm, Cobbler [4], has been implemented, which
can proficiently mine datasets with a high number of features and a high number of rows.
Based on the properties of the data, Cobbler is designed to switch
automatically between feature enumeration and row enumeration [4].
Another algorithm based on a row enumeration tree, TD-Close [13], has been designed
and implemented for mining the complete set of frequent closed item sets
from high dimensional data.
The problem of mining highly frequent patterns has attracted the interest of
researchers in recent years, as it produces a large number of redundant
patterns. To reduce these redundant patterns, we prune certain frequent
patterns that are weakly associated with each other. We therefore represent
the whole set of frequent patterns with a reduced number of highly frequent
pattern sets and eliminate the redundancy.
In column enumeration techniques like [5], [6], various combinations of features are
tested to discover the highly frequent patterns. Since microarray data have very few
instances, row enumeration-based methods have proved more effective for such data
than column enumeration-based algorithms.
In this paper, we propose a modified Carpenter algorithm with a different
choice of data structures, which results in better running time compared to a
straightforward implementation of Carpenter. The Carpenter method was
designed to handle high dimensional data with a very large number of features
and a small number of rows, i.e., n << m. In the next section we discuss some
basic preliminaries and define our problem. After that we give our
proposed approach for highly frequent pattern mining in high dimensional
data.
3 Preliminaries
We denote our data set as an n × m table D. Let the set of all features (columns) be F
= {f1, f2, f3, ..., fm}. Our dataset also has n rows R = {r1, r2, r3,
..., rn}, with n << m, where every row ri is a set of features from F, i.e., ri ⊆ F.
For illustration, consider Table 1:
the dataset has 8 features represented by the set {f1, f2, f3, f4, f5, f6, f7, f8} and
there are 5 rows, {r1, r2, r3, r4, r5}. A row ri contains a feature fj if fj has the value
1 in row ri. For example, row r1 contains the feature set {f1, f4,
f5, f7, f8} and row r2 contains the features {f3, f5, f6} [2], [4]. We use
some already defined terms. The feature support set of a given F' ⊆ F, denoted
R(F') ⊆ R, is the maximal set of rows that contain F'. Similarly, the row
support set of a given R' ⊆ R, denoted F(R') ⊆ F, is the maximal set of features
common to all the rows in R'. As an example, consider Table
1: let F' = {f1, f5, f8}; then R(F') = {r1, r4, r5}, since these are the
rows that contain F'. Correspondingly, let R' = {r1, r5}; then F(R') = {f1, f5, f8}, since
it is the maximal set of features common to both r1 and r5 [2], [4]. Given a
dataset D over a feature set, our problem is to find all
highly frequent patterns with respect to a user-specified support constraint. For
this setting, we assume that the dataset satisfies the condition |R| << |F|.
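A minimal sketch of the feature support set R(F') and the row support set F(R') computed from a binary table, following the definitions above; the toy rows below are placeholders, not Table 1 of the paper.

# rows given as the sets of features that take the value 1 (placeholder data)
rows = {
    "r1": {"f1", "f4", "f5", "f7", "f8"},
    "r2": {"f3", "f5", "f6"},
    "r3": {"f1", "f5", "f8"},
}

def R(Fp):
    # feature support set: maximal set of rows containing every feature in Fp
    return {r for r, feats in rows.items() if set(Fp) <= feats}

def F(Rp):
    # row support set: maximal set of features common to all rows in Rp
    selected = [rows[r] for r in Rp]
    return set.intersection(*selected) if selected else set()

print(R({"f1", "f5", "f8"}))   # {'r1', 'r3'}
print(F({"r1", "r3"}))         # {'f1', 'f5', 'f8'}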
The Carpenter method was designed to handle high dimensional data with a
very large number of features and a small number of rows, i.e., n << m. However, our
analysis found that when the number of attributes becomes too large it
does not handle the data effectively, and its behavior depends on the data structures
used. Therefore, several versions of Carpenter have been implemented and tested
on benchmark datasets. The results show that, by using different data structures, we
can build an efficient Carpenter that performs better than a straightforward
implementation.
The previous discussion clearly shows that, since the data has a very large number
of rows and columns, the speed of finding frequent patterns will be slow with ordinary
algorithms and without efficient pruning techniques. For example, enumeration over
even a mere 25 features results in a set of nearly 2^25 patterns, where each pattern
has to be checked for its number of occurrences to classify it as frequent or infrequent;
we can imagine what happens when the number of features is far larger (e.g., around
10000, in the range of thousands). These algorithms thus fail to satisfy our current
speed needs. Our basic aim is therefore to solve the problems facing
high dimensional highly-frequent-pattern [1] mining algorithms and to give a
new algorithm for this task. For building and testing our implementation of high
dimensional association rule mining, the following approach was used.
First, a dataset from the high dimensional domain is selected for analysis and preprocessing
is applied for discretization and normalization so that noise can be removed; for
this we used the tool mentioned in the next section. After preprocessing, the number of
attributes increases, and we determine appropriate coding parameters for the
algorithm. The data structures and their functionality are explained in the
next section. Each version is then checked for the scope of a better implementation than
the previous one. We developed six versions to improve our
algorithm. After the implementation, we checked all the versions and found
that the last version has the fewest errors as well as better performance compared
with the others.
For testing our implementation we chose the Pima Indians Diabetes dataset (UCI)¹
and also the Ovarian Cancer dataset². After preprocessing³ steps such as normalization
and discretization, we tested the datasets. We used GCC 4.7.1 libraries
and the CodeBlocks IDE on a Linux platform (4 GB RAM, Core i5-3230) to develop
the code. Six versions of the algorithm have been developed, each differing in
complexity and time consumption.
¹ http://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes
² http://www.biogps.org/dataset/tag/ovarian-cancer
³ UCS-KDD CARM Discretisation/Normalisation (DN) software (Version 2 — 18/1/2005), from http://cgi.csc.liv.ac.uk/~frans/KDD/Software/
Fig. 1. An approach for closed frequent pattern mining with different versions
Results of all the versions are shown below for the Pima Indians Diabetes dataset.
Version 1 corresponds to Fig. 2, which shows the running time with respect to the number
of rows for different minimum support parameter (MSP) values. In Fig. 2 through Fig. 7,
the x-axis corresponds to the number of rows, and time consumption (sec.) is represented
on the y-axis. Fig. 3 corresponds to version 2, in which we use only arrays to
provide direct address-based memory access instead of the data-value-based access
done in version 1 using STL maps. This version also fixes bugs found
in the previous version. Extra arrays are used to minimize the time spent in copying
and in maintaining proper function calls; the transposed table and the FCPs are stored in 2D
global arrays. As our results show, this did not improve the overall performance of
the algorithm, so we moved on to the next version. Version 3 uses only arrays to
provide direct address-based memory access and removes the functions for finding the
feature set and the FCPs; a map is now used to store the FCPs. To minimize
memory allocation and de-allocation time we define global arrays, and some bugs from
the previous version are again removed. Fig. 4 shows that there is an improvement
in runtime over version 1, and that the runtime changes rapidly as the number of rows
increases. The arrays used for checking which rows are present in every column of the
TT (transposed table) have been removed in version 4; the graph
shows slightly better performance than the previous version as the runtime grows with
the data size. The next version (Fig. 5) uses a depth-based 2D array approach to minimize
the time wasted in memory allocation and de-allocation. The conditional pointer table (CPT) is
made global to avoid continuous memory allocation and copying between iterations
and function calls; its runtime still increases with the number of rows in
the dataset. Our last version uses a helper table which sacrifices space for speed,
helping in the quick formation of FCPs.
This version also uses a depth-based 2D array for storing the CPT, and the row positions
corresponding to a given column in the transposed table are stored in a 2D array (-1 if the row is
not present, otherwise the count of the position for that column), which helps in
direct lookup during the creation of a new CPT. The version 6 graph shows the same trends
as the previous graphs, but as the number of columns in the data increases, this version
easily beats the runtime of the previous versions. It is the one that has been
thoroughly checked for bugs, since no further versions were to be made, and thus
we accept it as our final implementation. The runtime of this version has also been
evaluated, and the graph is displayed in Fig. 8. For these tests, some rows were taken from
the 253 rows and 265493 columns of the Ovarian Cancer dataset; all columns were
taken for the selected rows, and no column-length criterion was applied. The graph shows that
on increasing the number of columns the runtime does not increase exponentially;
instead, an almost linear relationship is seen when comparing with the previous results
obtained with fewer columns. The total number of columns used was much larger
than in the previous data sets, but the increase in time remains almost linearly
dependent on the number of columns. Fig. 9 shows the time comparison graph of
all the versions. Version 1's performance is very low and is easily beaten by
the others. The time complexity is reduced by versions 4 and 5; however, these versions
have some bugs that were later removed in version 6.
Fig. 2. CA Version 1
Fig. 3. CA Version 2
Fig. 4. CA Version 3
Fig. 5. CA Version 4
Fig. 6. CA Version 5
Fig. 7. CA Version 6
The importance of high dimensional data mining algorithms will be very large in
the coming days, so very efficient implementations of these algorithms are needed.
There is always scope for large improvements in the algorithms currently being
researched and in their current implementations by researchers and student groups.
Testing should also be very rigorous to find bugs on different kinds of datasets.
This algorithm can also contribute to generating frequent closed patterns
in high dimensional mining where the number of rows is large. One option is
to divide the dataset, scale the MSP in the same proportion, find the
frequent patterns under these new criteria, and then search among these patterns
for ones that are true for the entire data, or guess the patterns based on some other
criteria (a small sketch of this idea is given below).
We believe this division method has the potential to solve the problem
of the generation of a very large number of highly frequent patterns in microarray
datasets, usually on the order of 10^6 for approximately 10000 attributes, and the
memory-requirement problems of the Carpenter algorithm.
Many of these features, although frequent, are not interesting; thus we need not
find all the frequent patterns, but can instead remove them early, during the pattern
generation phase. Some iterations to increase the accuracy of the generated patterns can
be performed, as in cross-validation, at the cost of time, to yield better results for
basket data; we will report on that after the extensive
tests we are currently doing. The current implementation would be beneficial for
fast processing in biomedical science and biotechnology, producing results in
a minimum amount of time. Business intelligence also needs quick generation
of frequent item sets in huge data.
References
1. Wright, A., McCoy, A., Henkin, S., Flaherty, M., Sittig, D.: Validation of an Association Rule Mining-Based Method to Infer Associations Between Medications and Problems. Appl. Clin. Inform. 4, 100–109 (2013)
2. Pan, F., Cong, G., Tung, A.K.H.: Carpenter: Finding closed patterns in long bi-
ological datasets. In: Proceedings of ACM-SIGKDD International Conference on
Knowledge Discovery and Data Mining, pp. 637–642 (2003)
3. Pei, J., Han, J., Mao, R.: CLOSET: An efficient algorithm for mining frequent
closed item sets. In: Proceedings of ACM-SIGMOD International Workshop Data
Mining and Knowledge Discovery, pp. 11–20 (2000)
4. Pan, F., Cong, G., Xu, X., Tung, A.K.H.: COBBLER: Combining Column and Row Enumeration for Closed Pattern Discovery. In: International Conference on Scientific and Statistical Database Management, pp. 21–30 (2004)
5. Zaki, M., Hsiao, C.: Charm: An efficient algorithm for closed association rule min-
ing. In: Proceedings of SDM, pp. 457–473 (2002)
6. Wang, J., Han, J., Pei, J.: Closet+: Searching for the best strategies for mining
frequent closed item sets. In: Proceedings of 2003 ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (2003)
7. Chen, E.S., Hripcsak, G., Xu, H., Markatou, M., Friedman, C.: Automated Acquisition of Disease-Drug Knowledge from Biomedical and Clinical Documents: An Initial Study. J. Am. Med. Inform. Assoc., 87–98 (2008)
8. Sim, S., Gopalkrishnan, V., Zimek, A., Cong, G.: A survey on enhanced subspace
clustering. Data Mining Knowl. Disc. 26, 332–397 (2013)
9. Cheeseman, P.: Auto class: A Bayesian classification system. In: 5th International
Conference on Machine Learning. Morgan Kaufmann (1988)
10. Associates, D.S.: The new direct marketing. Business One Irwin, Illinois (1990)
11. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proceed-
ings of 1994 International Conference on Very Large Data Bases (VLDB 1994), pp.
487–499 (1994)
12. Bastide, Y., Taouil, R., Pasquier, N., Stumme, G., Lakhal, L.: Mining frequent
closed itemsets with counting inference. SIGKDD Explorations 2(2), 71–80 (2000)
13. Hongyan, L., Han, J.W.: Mining frequent Patterns from Very High Dimensional
Data: A Top-down Row Enumeration Approach. In: Proceedings of the Sixth SIAM
International Conference on Data Mining, pp. 20–22 (2006)
14. Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Discovering frequent closed
itemsets for association rules. In: Proceedings of 7th International Conference on
Database Theory, pp. 398–416 (1999)
15. Kriegel, H.P., Kröger, P., Zimek, A.: Clustering high-dimensional data: A survey
on sub-space clustering, pattern-based clustering, and correlation clustering. ACM
Transactions on Knowledge Discovery from Data 3(1), 1–58 (2009)
Time-Fading Based High Utility Pattern Mining
from Uncertain Data Streams
Abstract. Recently, high utility pattern mining from data streams has become a great challenge for data mining researchers due to rapid changes in technology. Data streams are continuous flows of data arriving at a rapid rate and in huge volumes. Three window models are mainly used over data streams in different applications: the landmark window, the sliding window and the time-fading window. In many applications, knowledge discovered from the data available in the current window is required in order to respond quickly. The landmark window keeps the information from a specific time point up to the present time, whereas the time-fading model also captures information from the landmark time to the current time but assigns different weights to the different batches or transactions. The time-fading model is particularly suitable for mining uncertain data, which is generated by many sources such as sensor data streams. In this paper, we propose an approach using the time-fading window model to mine high utility patterns from uncertain data streams.
1 Introduction
Utility mining is the process of discovering patterns whose utility is more than a user-specified minimum utility threshold. Utility mining may also be considered an extension of frequent pattern mining [4]; however, it is more complex due to the absence of the anti-monotone property [5] and the many utility calculations involved. Several approaches have been proposed for mining high utility patterns from static as well as dynamic databases to overcome these barriers. A few algorithms have also been proposed and implemented using the sliding window and landmark window models [6], [4], [7]. Apart from these models, the time-fading window model also has significant importance in some real-life applications. The landmark window keeps the information from the landmark time to the current time point, whereas in the sliding window only recent information is maintained, in a fixed number of batches or transactions. The time-fading window also keeps the information from the landmark time to the current time point, but it gives more importance to recently arrived data, that is, an old batch gets less importance or weight than a recent batch [1], [2]. The main objective of assigning different weights to batches is to avoid early pruning of low utility itemsets, which may become high utility itemsets in the future.
In data streams with precise data, the user is certain whether an item is present in a transaction or not; in contrast, in data streams with uncertain data [8], [9], the user has no guarantee about the presence or absence of an item in a transaction. For example, consider retail market transaction data: most products have high frequency only during a certain period of the year, that is, depending on the season, but some other products may be sold at any time, which cannot be predicted. Therefore, knowledge discovery from data streams containing precise data is less challenging than discovery from uncertain data. In this paper we propose an algorithm to mine high utility patterns from uncertain data streams using the time-fading window model. Our approach reuses two data structures from our previous works: one for mining over the landmark window (LHUI-Tree) and another for the sliding window (SHUI-Tree).
The remaining part of this paper is organized as follows. Preliminaries are given in Section 2, and related work on high utility pattern mining over streams is discussed in Section 3. In Section 4 our proposed algorithm is discussed briefly. Experimental results are presented in Section 5 to show the performance of the proposed approach. Finally, conclusions and future enhancements are given in Section 6.
2 Preliminaries
(a) Transaction table (item quantities per transaction), grouped into batches B1 (T1-T3), B2 (T4-T6) and B3 (T7-T9); W1 and W2 mark the windows over these batches.

Item/Tid   a   b   c   d   e
T1         2   0   0   4   1
T2         2   2   2   5   0
T3         0   3   1   3   0
T4         3   0   4   4   0
T5         4   0   0   6   1
T6         0   3   4   0   0
T7         2   0   1   4   0
T8         0   2   0   3   0
T9         0   2   0   5   1

(b) Profit table

Item   Profit ($)
a      5
b      3
c      4
d      1
e      10
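To make the example concrete, the following minimal Python sketch computes the utility of an itemset in a transaction using the standard utility-mining definition u(X, T) = sum over i in X of q(i, T) x p(i), applied to the reconstructed quantity and profit tables above (illustrative names, not the authors' implementation).

# Utility of an itemset X in a transaction T: sum of quantity * external utility (profit).
profit = {'a': 5, 'b': 3, 'c': 4, 'd': 1, 'e': 10}
transactions = {
    'T1': {'a': 2, 'd': 4, 'e': 1},
    'T2': {'a': 2, 'b': 2, 'c': 2, 'd': 5},
}

def utility(itemset, tid):
    t = transactions[tid]
    return sum(t.get(i, 0) * profit[i] for i in itemset)

print(utility({'a', 'd'}, 'T1'))  # 2*5 + 4*1 = 14
print(utility({'a', 'e'}, 'T1'))  # 2*5 + 1*10 = 20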
3 Related Work
Utility mining was introduced for discovering patterns and association rules based on user interest [10]. The basic definitions and theoretical model of utility mining were given in [3]. The first property with a search-space reduction strategy similar to the Apriori property was observed in [11]. Based on this anti-monotone property, several algorithms have been implemented to mine high utility patterns from traditional databases. However, mining these patterns from evolving data streams has become more significant, because the utility of a pattern may increase or decrease as data arrive continuously in the stream [12]. The first approach, called THUI-Mine [13], was proposed for mining high utility patterns from data streams based on the sliding window model. This approach processes the transactions batch by batch; first it calculates the transaction-weighted utility of each item, and only items whose transaction-weighted utility [11] exceeds the threshold are considered for further processing. This algorithm does not meet the essential requirements of data streams because it requires more than one database scan and cannot be applied to the landmark and time-fading window models.
To overcome the drawbacks of the first approach, two further algorithms, MHUI-BIT and MHUI-TID, were proposed [14], representing the data in the form of bitvectors (BIT) and tid-lists (TID). Although MHUI-TID and MHUI-BIT achieve better memory consumption and execution time than THUI-Mine, they still need more than one database scan because a level-wise approach is applied for itemsets of length greater than 2. Afterwards, a few approaches were proposed using different window models, but except for the GUIDE algorithm [4], none of them proposed a method based on the time-fading window model. As discussed earlier, the number of sources generating streams of data, and their generation rates, keep increasing. On the other hand, mining frequent patterns from uncertain data streams is another important issue [1], [2]. In this paper we propose an approach to mine high utility patterns from uncertain data streams using the time-fading window model.
tfu(X, t_n) = \left\{ \sum_{i=1}^{n-1} tfu(X, T_i) \times \alpha \right\} + u(X, t_n)    (1)

tfu(X, t_n) = \sum_{i=1}^{n} \left( tfu(X, T_i) \times \alpha^{(t_n - t_i)} \right)    (2)
Next, consider the sliding window tree structure (SHUI-Tree) adapted to maintain all transactions under the time-fading window model. When a transaction or a batch of transactions arrives in the data stream, the algorithm generates all possible itemsets with their corresponding utilities and updates them in the tree structure. Unlike the landmark window data structure, it keeps the time of occurrence of each itemset along with its utility; the time-fading utility of an itemset is calculated with the above equation only when a user query is encountered.
(Figure: SHUI-Tree structures (a) and (b); nodes are labelled itemset:utility, with the transaction time of each utility value shown in parentheses, e.g. d:14 with 14(1).)
In these annotations, e.g. 10(1) and 10(2), the transaction number is taken as the time of occurrence. The utility value is not multiplied by the decay factor α after processing each transaction. When the window slides, old patterns are deleted and new patterns are updated in the tree. When the user queries the system, the time-fading utility of each pattern in the SHUI-Tree is calculated using the above equation and the results are generated accordingly.
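A minimal Python sketch of the time-fading utility computation, assuming the per-batch utilities of a pattern and the batch timestamps are already available and discounting each batch's contribution by α^(t_n − t_i) as in Eq. (2) (illustrative names, not the authors' data structures):

# Time-fading utility: each batch utility is discounted by alpha^(t_now - t_i),
# so older batches contribute less than recent ones.
def time_fading_utility(batch_utilities, alpha, t_now):
    # batch_utilities: list of (t_i, u_i) pairs for a pattern X
    return sum(u_i * (alpha ** (t_now - t_i)) for t_i, u_i in batch_utilities)

history = [(1, 34.0), (2, 29.0), (3, 20.0)]   # (batch time, utility of X in that batch)
print(time_fading_utility(history, alpha=0.8, t_now=3))
# 34*0.8^2 + 29*0.8 + 20 = 64.96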
4 Experimental Results
(Plots: four graphs comparing LHUI-Tree and SHUI-Tree over 50 to 200 batches, with 5000 transactions in each batch.)
5 Conclusions
References
1. Gaber, M., Zaslavsky, A., Krishnaswamy, S.: Mining data streams: a review. ACM
Sigmod Record 34(2), 18–26 (2005)
2. Aggarwal, C.: Data streams: models and algorithms, vol. 31. Springer (2007)
3. Yao, H., Hamilton, H., Butz, C.: A foundational approach to mining itemset utili-
ties from databases. In: The 4th SIAM International Conference on Data Mining,
pp. 482–486 (2004)
4. Shie, B.E., Yu, P., Tseng, V.: Efficient algorithms for mining maximal high utility
itemsets from data streams with different models. Expert Systems with Applica-
tions 39(17), 12947–12960 (2012)
5. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc.
20th Int. Conf. Very Large Data Bases, VLDB, vol. 1215, pp. 487–499 (1994)
6. Ahmed, C., Tanbeer, S., Jeong, B.S.: Efficient mining of high utility patterns over
data streams with a sliding window method, pp. 99–113 (2010)
7. Ahmed, C., Tanbeer, S., Jeong, B.S., Choi, H.J.: Interactive mining of high utility
patterns over data streams. Expert Systems with Applications 39(15), 11979–11991
(2012)
8. Leung, C.S., Jiang, F.: Frequent itemset mining of uncertain data streams using the
damped window model. In: Proceedings of the 2011 ACM Symposium on Applied
Computing, pp. 950–955. ACM (2011)
9. Aggarwal, C., Yu, P.: A survey of uncertain data algorithms and applications. IEEE
Transactions on Knowledge and Data Engineering 21(5), 609–623 (2009)
10. Chan, R., Yang, Q., Shen, Y.D.: Mining high utility itemsets. In: Third IEEE
International Conference on Data Mining, ICDM 2003, pp. 19–26. IEEE (2003)
11. Liu, Y., Liao, W.K., Choudhary, A.: A two-phase algorithm for fast discovery of
high utility itemsets, pp. 689–695 (2005)
12. Giannella, C., Han, J., Pei, J., Yan, X., Yu, P.: Mining frequent patterns in data
streams at multiple time granularities. Next Generation Data Mining 212, 191–212
(2003)
13. Tseng, V., Chu, C.J., Liang, T.: Efficient mining of temporal high utility item-
sets from data streams. In: Second International Workshop on Utility-Based Data
Mining, p. 18. Citeseer (2006)
14. Li, H.F., Huang, H.Y., Chen, Y.C., Liu, Y.J., Lee, S.Y.: Fast and memory efficient mining of high utility itemsets in data streams. In: Eighth IEEE International Conference on Data Mining (ICDM), pp. 881–886. IEEE (2008)
15. Leung, C.K.-S., Jiang, F.: Frequent pattern mining from time-fading streams of
uncertain data. In: Cuzzocrea, A., Dayal, U. (eds.) DaWaK 2011. LNCS, vol. 6862,
pp. 252–264. Springer, Heidelberg (2011)
Classification for Multi-Relational Data Mining
Using Bayesian Belief Network
1 Introduction
Most real-world data is well structured and stored in relational databases, which makes them a very important source of information. Many data mining algorithms for classification, such as Naïve Bayes and Support Vector Machines, work on a single table. Traditional algorithms cannot be applied until the relational database is transformed into a single table. Furthermore, when relational tables are converted into a single table, semantic information is lost, and this loss of information leads to inaccurate classification. The conversion operation is also tedious in terms of time and space, since it creates a huge universal dataset when the tables are flattened into a single relation. This paper discusses a Bayesian Belief Network based approach for multi-relational data mining that builds a probabilistic model directly from multiple tables and also exploits the relations between the tables.
The rest of the paper is organized as follows: the next section gives a brief overview of the existing work, Section III briefly discusses the background theory, Section IV discusses the proposed method with examples, and Section V gives the conclusions and future work of this research study.
2 Related Work
3 Background Theory
Fig. 2. Loan table and Account table relationship in PKDD cup99 dataset
4 Proposed Approach
Fig. 3. Architecture of Classification for Multi-Relational Data Mining using Bayesian Belief
Network
Step 1: Pre-Processing.
As mentioned in the background theory, a semantic relationship graph is generated from the tables. Tuple ID propagation is used to perform a virtual join between the tables and to propagate the class labels from the target table to the non-target tables.
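As a rough illustration of this step, the following Python sketch propagates tuple IDs and class labels from a target table to a non-target table through a shared key; the table and column values are hypothetical, whereas the real pre-processing operates on the PKDD Cup 99 relations.

# Tuple ID propagation: each non-target tuple receives the IDs (and class labels)
# of the target tuples it joins with, so no physical join is materialized.
target = [  # (tuple_id, account_id, class)
    (1, 'A1', 'good'),
    (2, 'A2', 'bad'),
    (3, 'A1', 'good'),
]
non_target = [  # (account_id, district)
    ('A1', 'D7'),
    ('A2', 'D3'),
]

def propagate(target_rows, non_target_rows):
    by_key = {}
    for tid, key, label in target_rows:
        by_key.setdefault(key, []).append((tid, label))
    # attach the propagated IDs/labels to every non-target tuple
    return [(row, by_key.get(row[0], [])) for row in non_target_rows]

for row, ids in propagate(target, non_target):
    print(row, '->', ids)
# ('A1', 'D7') -> [(1, 'good'), (3, 'good')]
# ('A2', 'D3') -> [(2, 'bad')]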
Step 2: Attribute Selection.
Attribute selection is used to eliminate the attributes that are not related to the classification. Further, the attribute values are discretized to reduce the total number of possible values.
Although the CPT estimation is a time-consuming process, it is only a one-time process.
Ta is independent (refer to Table 2), so P(Ta) is given as follows.
A1    A2    A3    C1   C2
a10   a20   a30   Y    N
a10   a20   a31   N    Y
a10   a21   a30   N    N

B1    B2    A1    A2    A3
b10   b20   a10   a20   a30
b10   b21   a10   a20   a31

A1    A2    A3
a11   a12   a13
a21   a22   a23
V7, …, V18 are different values, and C1 and C2 are the class labels in Table 2 and Table 3.
References
1. Džeroski, S., Lavrač, N.: An introduction to inductive logic programming. In: Relational Data Mining, pp. 48–73 (2001)
2. Lavrac, N., Dzeroski, S.: Inductive Logic Programming: Techniques and Applications. El-
lis Horwood (1994)
3. Blockeel, H., Dehaspe, L., Demoen, B., Janssens, G., Ramon, J., Vandecasteele, H.: Im-
proving the Efficiency of Inductive Logic Programming through the Use of Query Packs.
J. Artificial Intelligence Research 16, 135–166 (2002)
4. Yin, X., Han, J., Yang, J., Yu, P.S.: CrossMine: Efficient classification across multiple da-
tabase relations. In: Boulicaut, J.-F., De Raedt, L., Mannila, H. (eds.) Constraint-Based
Mining. LNCS (LNAI), vol. 3848, pp. 172–195. Springer, Heidelberg (2006)
5. Muggleton, S.H.: Inverse Entailment and Progol. New Generation Computing 13(3-4),
245–286 (1995)
6. Muggleton, S., Feng, C.: Efficient Induction of Logic Programs. In: Proceedings of Confe-
rence on Algorithmic Learning Theory (1990)
7. Pompe, U., Kononenko, I.: Naive Bayesian classifier within ILP-R. In: Proceedings of the
5th International Workshop on Inductive Logic Programming, pp. 417–436 (1995)
8. Heckerman, D.: Bayesian networks for data mining. Data Mining and Knowledge Discov-
ery 1(1), 79–119 (1997)
9. Ceci, M., Appice, A., Malerba, D.: Mr-SBC: A Multi-relational Naïve Bayes Classifier. In:
Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds.) PKDD 2003. LNCS
(LNAI), vol. 2838, pp. 95–106. Springer, Heidelberg (2003)
10. Flach, P., Lachiche, N.: 1BC: A first-order Bayesian classifier. In: Džeroski, S., Flach,
P.A. (eds.) ILP 1999. LNCS (LNAI), vol. 1634, pp. 92–103. Springer, Heidelberg (1999)
11. Neville, J., Jensen, D., Gallagher, B., Fairgrieve, R.: Simple Estimators for Relational
Bayesian Classifiers. In: International Conference on Data Mining (2003)
12. Manjunath, G., Murty, M.N., Sitaram, D.: Combining heterogeneous classifiers for rela-
tional databases. Pattern Recognition 46(1), 317–324 (2013)
13. Quinlan, J.R., Cameron-Jones, R.M.: FOIL: A Midterm Report. In: Proceedings of 1993
European Conference on Machine Learning (1993)
14. Yin, X., Han, J., Yang, J.: Efficient Multi-relational Classification by Tuple ID Propaga-
tion. In: Proceedings of the KDD-2003 Workshop on Multi-Relational Data Mining (2003)
Noise Elimination from Web Page Based on Regular
Expressions for Web Content Mining
Amit Dutta1, Sudipta Paria2, Tanmoy Golui2, and Dipak Kumar Kole2
1 Department of IT, St. Thomas' College of Engineering & Technology, Kolkata
[email protected]
2 Department of CSE, St. Thomas' College of Engineering & Technology, Kolkata
{pariasudipta,tanmoy.stcet,dipak.kole}@gmail.com
1 Introduction
Nowadays, the rapid expansion of the Internet has made the World Wide Web a popular place for broadcasting and collecting information. Web mining has thus become an important task for discovering useful knowledge or information from the Web [1]. Web mining can be classified into web structure mining, web content mining and web usage mining [2]. It is very common that useful information on the web is accompanied by a large amount of noise such as banner advertisements, navigation bars, copyright notices, etc. Although such items are functionally useful for human viewers and necessary for the web site owners, they often hamper automated information gathering and web data mining, e.g., web page clustering, classification, information retrieval and information extraction.
In the field of web content mining some significant work has been done in the past years. In [3], a GA based model is proposed for mining frequent patterns in web content. In [4] and [5], web usage mining methods are proposed for personalization and business intelligence for e-commerce websites. In [6], a method is proposed for coherent keyphrase extraction using web content mining. Noise removal from web pages is an important task which helps to extract the actual content for web content mining in a later phase. In [7], a method is proposed to detect informative blocks in news web pages. In [8], web page cleaning is defined as a frequent template detection problem in which partitioning is done based on the number of hyperlinks that an HTML element has. Web page cleaning is also related to feature selection in traditional machine learning [9], where features are individual words or attributes. A method is proposed for learning mechanisms to recognize banner ads and redundant and irrelevant links in web pages [10]. The HITS algorithm [11] is enhanced by using the entropy of anchor text to evaluate the importance of links; the focus is on improving the HITS algorithm to find more informative structures in web sites.
In this work, noisy contents are first removed partially using a tag based filtering method based on regular expressions [12], and then an entropy based measure is incorporated to determine the remaining noisy contents in the web page and eliminate them. In this paper, Sections 2, 3, 4 and 5 present the preliminaries, the proposed method along with the necessary algorithms, the experimental results and the conclusion, respectively.
2 Preliminaries
Noise: Information blocks other than the actual or main content blocks are referred to as noise. Noise can be classified into two types: Local Noise or Intra-page Noise (noisy information blocks within a single web page, e.g., banner advertisements, navigational guides, decoration pictures, etc.) and Global Noise or Inter-page Noise (noise on the web which is usually no smaller than individual pages, e.g., mirror sites, legal/illegal duplicated web pages, old versioned web pages, etc.).
<BODY bgcolor=RED>
  <p> ..... </p>
  <div>
    <TABLE width=400 height=400>
      ....
    </TABLE>
    <img src="myimage.jpg" width=100 height=100>
    <TABLE width=200 height=200>
      ....
    </TABLE>
  </div>
  <p> ..... </p>
</BODY>
Fig. 1. A DOM Tree corresponding to the HTML Code
Site Style Tree (SST) [13]: The Site Style Tree (SST) is a data structure created by mapping the nodes of the individual DOM trees of each web page of the website. The SST consists of two types of nodes: (i) style nodes and (ii) element nodes.
Style Node - A style node S in the style tree represents a layout or presentation style. It has two components, denoted by (Es, n), where Es is a sequence of element nodes and n is the number of pages that have this particular style at this node level.
Element Node - An element node E has three components, denoted by (TAG, Attr, Ss), where TAG is the tag name, Attr is the set of display attributes of TAG and Ss is the set of style nodes below E.
An example of an SST is given in Fig. 2 as a combination of DOM trees d1 and d2. A count is used to indicate how many pages have a particular style at a particular level of the SST. Both d1 and d2 start with BODY and the next level below BODY is P-DIV-P; thus both BODY and P-DIV-P have a count of 2. Below the DIV tag, d1 and d2 diverge, i.e., two different presentation styles are present. The two style nodes (represented by dashed rectangles) are DIV-TABLE-TABLE and DIV-A-IMG, each having a page count equal to 1.
So, the SST is a compressed representation of the two DOM trees. It enables us to see which parts of the DOM trees are common and which parts are different.
To determine whether an element node in the SST is noisy, an entropy based measure is used which evaluates the combined importance, i.e., both the presentation and the content importance. The higher the combined importance of an element node, the more likely it is to be the main content of the web pages.
Node Importance
Definition: For an element node E in the SST, let m be the number of pages containing E and l be the number of child style nodes of E (i.e., l = |E.Ss|). The node importance of E, denoted by NodeImp(E), is defined by

NodeImp(E) = \begin{cases} -\sum_{i=1}^{l} p_i \log_m p_i, & m > 1 \\ 1, & m = 1 \end{cases}    (1)

where p_i is the probability that a Web page uses the i-th style node in E.Ss.
Considering only the node importance, an element cannot be said to be a noisy item, because the importance of its descendants is not considered. Hence, composite importance is considered to measure the importance of an element node and its descendants.
Composite Importance:
Definition (internal node): For an internal element node E in the SST, let l = |E.Ss|. The composite importance of E, denoted by CompImp(E), is defined by

CompImp(E) = (1 - \gamma^{l}) \cdot NodeImp(E) + \gamma^{l} \cdot \sum_{i=1}^{l} p_i \cdot CompImp(S_i)    (2)

where p_i is the probability that E has the i-th child style node S_i in E.Ss. In the above equation, CompImp(S_i) is the composite importance of a style node S_i (∈ E.Ss), which is defined by

CompImp(S_i) = \frac{\sum_{j=1}^{k} CompImp(E_j)}{k}    (3)

where E_j is an element node in S_i.Es, and k = |S_i.Es| is the number of element nodes in S_i.
In (2), \gamma is the attenuating factor, which is set to 0.9; the weight (1 - \gamma^{l}) of NodeImp(E) increases when l is large and decreases when l is small.
Definition (leaf node): For a leaf element node E in the SST, let l be the number of features (i.e., words, image files, link references, etc.) appearing in E and let m be the number of pages containing E. The composite importance of E is defined by

CompImp(E) = \begin{cases} \dfrac{1}{l}\sum_{i=1}^{l} H(a_i), & m > 1 \\ 1, & m = 1 \end{cases}    (4)

where H(a_i) is the entropy-based importance of the i-th feature a_i of E over the m pages.
Definition (noisy): For an element node E in the SST, if all of its descendants and E itself have composite importance less than a specified threshold t, then E is said to be noisy.
Definition (maximal noisy element node): If a noisy element node E in the SST is not a descendant of any other noisy element node, then E is called a maximal noisy element node.
Definition (meaningful): If an element node E in the SST does not contain any noisy descendant, then E is said to be meaningful.
Definition (maximal meaningful element node): If a meaningful element node E is not a descendant of any other meaningful element node, then E is said to be a maximal meaningful element node.
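A minimal Python sketch of the entropy-based node importance of Eq. (1) as reconstructed above, assuming an element node is represented simply by the page counts of its child style nodes (an illustrative representation, not the SST implementation of [13]):

import math

# NodeImp(E): entropy (log base m) of the distribution of pages over the child
# style nodes of E; a node presented identically by every page scores 0 (likely
# noise), while diverging presentation styles score closer to 1.
def node_importance(style_page_counts):
    m = sum(style_page_counts)          # number of pages containing E
    if m <= 1:
        return 1.0
    probs = [c / m for c in style_page_counts]
    return -sum(p * math.log(p, m) for p in probs if p > 0)

print(node_importance([2]))        # 0.0 -> both pages share one style
print(node_importance([1, 1]))     # 1.0 -> two pages, two different styles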
3 Proposed Method
Initially a filtering method based on regular expressions is applied to eliminate the contents enclosed by negative tags. However, filtering does not ensure that all the noisy information present is removed, so both the layouts and the contents of the web pages in a given web site need to be analyzed. The Site Style Tree concept is used for this purpose; the SST is created by combining the DOM trees of the web pages of the website. After creating the SST, an entropy based measure is incorporated for evaluating the node importance in the SST for noise detection. The steps involved in our proposed technique are represented by the block diagram in Fig. 3.
3.1 Filtering
Depending on the content of the HTML tags in a web page, the tags can be divided into two types: a) positive tags and b) negative tags [14]. Positive tags contain useful information; all other tags are referred to as negative tags. In this work some tags have been defined as negative tags to remove noisy data from a web page. The following tags are considered negative tags: the anchor tag (<a>), style tag (<style>), link tag (<link>), script tag (<script>), comment tag (<!-- ... -->) and noscript tag (<noscript>); a minimal filtering sketch is given below.
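As a rough illustration of this tag-based filtering (the actual system is implemented in Java, so this Python sketch with the re module is only indicative, and the helper name is hypothetical):

import re

# Strip content enclosed by the negative block tags (script, style, noscript,
# comments) and remove anchor and link tag markup; DOTALL lets patterns span lines.
NEGATIVE_BLOCKS = [
    r'<script\b.*?</script\s*>',
    r'<style\b.*?</style\s*>',
    r'<noscript\b.*?</noscript\s*>',
    r'<!--.*?-->',
]
NEGATIVE_TAGS = [r'</?a\b[^>]*>', r'<link\b[^>]*>']

def filter_negative_tags(html):
    for pattern in NEGATIVE_BLOCKS:
        html = re.sub(pattern, '', html, flags=re.IGNORECASE | re.DOTALL)
    for pattern in NEGATIVE_TAGS:
        html = re.sub(pattern, '', html, flags=re.IGNORECASE)
    return html

page = '<body><script>ads();</script><p>Main text</p><a href="x">banner</a></body>'
print(filter_negative_tags(page))   # <body><p>Main text</p>banner</body>

In practice a DOM-aware parser is more robust than regular expressions for nested markup, which is one reason the later SST phase still operates on parsed DOM trees.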
obtained then the hyperlink reference contains noisy elements like banner
advertisements of other website (local noise) or mirror sites, duplicate web page etc.
(global noise).
12.2 MapSST(E, )/*E and E are the element node of SST & PST
respectively */
Step 13:end for
4 Experimental Results
The above algorithm has been implemented in Java on 32-bit Windows XP Professional with Service Pack 2. The web browser used for this purpose is Google Chrome version 32; the processor is a Core 2 Duo at 3.06 GHz with 2.00 GB RAM. For the experiment, ten popular commercial websites have been taken. As these websites are dynamic in nature, their contents vary from time to time; the snapshots of these websites were taken on 05.01.2014. In the initial phase a comparative analysis is done (Table 1) of the percentage of noisy elements (the number of lines enclosed by negative tags) present in those sites.
In the second phase, the noise that is not removed by filtering is eliminated using the SST. In our experiment the threshold value (t) is initialized to 0.2. Table 2 shows the total percentage of noise that can be removed by the combined method of filtering and SST; the last column of Table 2 shows the amount of content present in those websites, which reflects the reality of the experiment.
5 Conclusion
In this paper a technique to detect and remove local noisy elements from web pages has been discussed. The proposed technique incorporates two phases: filtering based on regular expressions and an entropy based measure on the SST. From the experimental results it is evident that the filtering method eliminates a considerable amount of noisy elements (9th column of Table 1) before the SST is introduced. Therefore the number of element nodes in the SST is smaller than in the technique where only the SST method is used for noise removal [13]. As the number of nodes in the SST is reduced, the space requirement for storing nodes becomes less, and the traversal time for tree operations like insertion, deletion, searching, etc. is reduced; thus the overall space and time complexity is reduced. Furthermore, the performance of web content mining will also improve, because the actual contents of the web pages can be extracted easily after the noisy elements are removed. The experimental results show that our proposed method is highly effective. The proposed method ensures only partial removal of the global noise present in web pages; hence, it can be extended further to detect and remove all the global noise from web pages.
References
1. Han, J., Chang, K.C.-C.: Data Mining for Web Intelligence. IEEE Computer 35(11), 64–70
(2002)
2. Srivastava, J., Desikan, P., Kumar, V.: Web Mining - Concepts, Applications, and
Research Directions. In: Chu, W., Lin, T.Y. (eds.) Foundations and Advances in Data
Mining. STUDFUZZ, vol. 180, pp. 275–307. Springer, Heidelberg (2005)
3. Sabnis, V., Thakur, R.S.: Department of Computer Applications, MANIT, Bhopal, India,
GA Based Model for Web Content Mining. IJCSI International Journal of Computer
Science Issues 10(2), 3 (2013)
4. Eirinaki, M., Vazirgiannis, M.: Web mining for web personalization. ACM Transactions
on Internet Technology 3(1), 1–27 (2003)
5. Abraham, A.: Business Intelligence from Web Usage Mining. Journal of Information &
Knowledge Management 2(4), 375–390 (2003)
6. Turney, P.: Coherent Keyphrase Extraction via Web Mining. In: Proceedings of the
Eighteenth International Joint Conference on Artificial Intelligence, pp. 434–439 (2003)
7. Lin, S.-H., Ho, J.-M.: Discovering Informative Content Blocks from Web Documents. In:
Proceedings of Eighth ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, pp. 588–593 (2002)
8. Bar-Yossef, Z., Rajagopalan, S.: Template Detection via Data Mining and its Applications.
In: Proceedings of the 11th International Conference on World Wide Web (2002)
9. Yang, Y., Pedersen, J.O.: A comparative study on feature selection in text categorization.
In: International Conference on Machine Learning (1997)
10. Kushmerick, N.: Learning to remove Internet advertisements. In: Proceedings of Third
Annual Conference on Autonomous Agents, pp. 175–181 (1999)
11. Kao, J.Y., Lin, S.H., Ho, J.M., Chen, M.S.: Entropy-based link analysis for mining web
informative structures. In: Proceedings of Eleventh International Conference on
Information and Knowledge Management, pp. 574–581 (2002)
12. Friedl, J.E.F.: Mastering Regular Expressions. O'Reilly Media Inc. (2006)
13. Lan, Y., Bing, L., Xiaoli, L.: Eliminating Noisy Information in Web Pages for Data
Mining. In: Proceedings of Ninth ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, KDD 2003, pp. 296–305 (2003)
14. Kang, B.H., Kim, Y.S.: Noise Elimination from the Web Documents by using URL paths
and Information Redundancy (2006)
15. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms. The MIT Press (2009)
Modified Literature Based Approach to Identify
Learning Styles in Adaptive E-Learning
1 Introduction
be captured using this approach. The support of a learning style model makes this approach efficient and accurate.
Several learning style models have been proposed, by Myers-Briggs, Kolb and Felder-Silverman, out of which the Felder-Silverman Learning Style Model (FSLSM) focuses specifically on aspects of the learning styles of engineering students; it has been used successfully both in the questionnaire approach and as a behavioral model in the literature based approach. Learning styles represent a combination of cognitive, affective and psychological characteristics that indicate the learner's way of processing, grasping, understanding and perceiving the course contents.
The paper discusses the prototype version of an e-learning portal which has been designed and developed to test the proposed e-learning framework for identifying the learning styles of learners. It is based on the different types of contents and the nature of page visits, in order to modify the literature based approach.
2 Related Works
Graf et al. [2] discussed a literature based approach in which the authors analyzed the behavior of 127 learners in an object oriented modeling course in the Learning Management System (LMS) Moodle using the Felder-Silverman Learning Style Model. Behavior patterns associated with thresholds are determined according to the learners' activities. In this paper the authors carried out their experimentation on Moodle, which is a static web portal that supports only a static user interface for all learners.
A literature based approach for detecting learning styles in an e-learning portal is described by P.Q. Dung et al. [3]. The literature based approach is modified with a multi-agent architecture where the learning styles of the learners are first identified by conducting tests and stored in a database. Later, different types of contents are provided to the learners and monitored in order to identify whether the learning styles change with behavior. The mentioned approach is time saving but static in nature, whereas students' learning styles change depending on the structure of the course and the learners' interests. Also, the learning material is not divided into content types based on FSLSM.
Salehi et al. [4] addressed the recommendation of course material for an e-learning system. The authors describe content based recommendation and similarity based recommendation. However, the learners' profiles are not classified based on learning styles, which is the main factor for recommendation with respect to learners' individual preferences.
The Web-based Education System with Learning Style Adaptation (WELSA) described by Popescu et al. [5] is an intelligent e-learning system that adapts the course contents to the learning preferences of each learner. The authors considered different learning objects as resource types (e.g., exercise, simulation, diagram and experiment) and did not focus on the different types of learning contents through which learners acquire knowledge by understanding and processing the contents with supportive resources.
Liu et al. [6] described a combined mining approach over web server logs and web contents. The approach is useful for facilitating better adaptive presentation and navigation as part of an adaptive e-learning system. The mentioned combination is useful for finding user navigation patterns and page visiting sequences.
The evolution and new trends of learning systems are reviewed in the context of learning paradigms and methodologies by Simić et al. in [7]. The authors give an overview of personalization as a learning process, which is closely related to recommendation and leads to the adaptation of contents on an e-learning portal based on learners' preferences.
3 Methodology
The main objective of the proposed work is to identify learning styles using web usage mining in a literature based approach. The work is carried out by capturing the access patterns of the learners in W3C extended log format and in a database. The W3C log files give the usage of the different pages accessed, as per the learners' logins and page visit sequences. The database log gives the usage of the different files accessed, as per the course contents, together with the time spent on each page.
The model adopted is the one suggested by Felder and Silverman (1988) for engineering education, which classifies students according to their position on several scales that evaluate how they perceive and process information [8], [9]. The initial levels of the parameters considered according to the FSLSM are shown in Table 1. The parameters for a specific learner can be captured in the web logs of the e-learning portal. The captured logs can be further analyzed to modify the literature based approach, and based on this analysis the learners can be classified into the FSLSM categories.
The clustered learner profiles with learning patterns are the input for the user classification manager. Here a classification algorithm can be used to identify different kinds of learners based on the learning styles of the Felder-Silverman model. After identifying the categories of learners, the interface component manager changes the graphical representation of the user interface based on small applications of the e-learning portal. The contents of the e-learning portal should be adaptive enough to change the interface components according to the learning style. These adaptive interface components can be generated using adaptive contents based on the user classification, with the help of administrative activities and the e-learning content repository. This content is an important input factor for the Interface Component Manager (ICM).
The paper describes the approach of acquiring learning styles through the usage log data of the e-learning portal and the detailed analysis of the log data.
4 Experimentation Details
According to the framework mentioned in Fig. 1, the implementation of the e-learning portal has been carried out using Microsoft Visual Studio 2008 and Microsoft SQL Server 2005. The portal is deployed on an IIS server to provide access to all learners on the Intranet. The log file option of the IIS server is set to the W3C extended log file format, so that the usage details of the learners accessing the portal are captured.
The prototype version of the portal has been made accessible to first year engineering students for the subject Problem Solving Using Computers, with the following topics: Arrays, Strings and Pointers. Each topic is made available for study in different file formats such as text (doc/pdf), video (mpeg/mp4) and demo (ppt/pptx). An exercise on 1D arrays, 2D arrays and Strings is also generated for the learners. Around 75 learners have registered on the portal, out of which 30 learners have accessed the topics mentioned above. The log files of the learners accessing the portal are maintained in the W3C log file, and the time spent and file access details are captured in the database against the specific session of the learner.
5 Experimentation Design
Algorithm 2. Total time spent on a specific file by a user in all the sessions
INPUT: a finite set of learners L = L1, L2, ..., LN and sessions S = S1, S2, ..., SQ
OUTPUT: TFile_j = time spent on a specific file in one session, and TotalDuration_i = total time spent on a specific file in all sessions by learner Li
initialize EndTime ← 0, StartTime ← 0, TotalDuration_i ← 0, TotalSessionTime ← 0
for each learner Li where i ← 1 to N do
  for each session Sj where j ← 1 to Q do
    if "File" is accessed then
      StartTime ← t    {t is the system time when the learner clicked}
    end if
    if Li clicked the back button || Li clicked another link || Li was idle for the threshold time then
      EndTime ← t    {t is the system time when the learner left the file}
    end if
    TFile_j ← EndTime − StartTime
    TotalDuration_i ← TotalDuration_i + TFile_j
  end for
end for
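A small Python sketch of the accumulation performed by Algorithm 2, assuming the click events have already been reduced to (learner, session, file, start, end) records (a hypothetical record layout, not the portal's actual log schema):

from collections import defaultdict

# Sum the time spent on each file by each learner over all sessions.
# Each record: (learner_id, session_id, file_name, start_time, end_time) in seconds.
records = [
    ('L1', 'S1', 'arrays.pdf', 100, 460),
    ('L1', 'S2', 'arrays.pdf', 30, 210),
    ('L1', 'S2', 'strings.mp4', 0, 600),
    ('L2', 'S1', 'arrays.pdf', 50, 170),
]

def total_duration(records):
    totals = defaultdict(float)                  # (learner, file) -> seconds
    for learner, _session, fname, start, end in records:
        totals[(learner, fname)] += end - start  # T_File for that session
    return totals

for key, secs in sorted(total_duration(records).items()):
    print(key, secs)
# ('L1', 'arrays.pdf') 540.0
# ('L1', 'strings.mp4') 600.0
# ('L2', 'arrays.pdf') 120.0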
The analysis has been done on the IIS log records and the database, and the results are discussed through different graphs in the next section.
1. Analysis from the database: total time spent on a specific file by learners in all their sessions - Fig. 2 shows the total time spent on specific files by learners over all their sessions. Learners who accessed the portal spent time on files in different sessions. The graph shows the most frequently accessed files in the specified duration.
2. Snapshot of the report at the instructor side: number of times a specific type of file was accessed by learners - Fig. 3 shows the report which an instructor can generate to get a learner-wise count of accesses to different types of files. In the current implementation, the portal supports only PDF, PPT and video files. Depending on the frequency of access to a specific type of file, the learner's interest in specific material can be identified.
3. Analysis from the database: number of times learners accessed different topics - Fig. 4 shows the number of times learners accessed the topics with the material. This analysis gives the interest in specific topics and the requirement of providing good material on different topics. The analysis can be further extended to how the topics are accessed as per the learners' requirements, e.g., how many learners accessed a previous concept or an advanced concept after accessing the main topic.
4. Analysis from the W3C log files: number of times learners accessed different pages on the portal - Fig. 5 shows that learners have not only accessed TopicSearch to access different files but have also accessed other pages such as the exercise and announcement pages. This analysis is important for understanding the behavior of learners on the portal.
7 Conclusion
Learning in online courses cannot be complete without incorporating the learning styles of the learners in the e-learning portal, which leads to an adaptive e-learning approach. In this context, one challenge is to develop a portal which is able to identify the learning styles of learners and change the user interface components as per the learners' requirements. In the proposed work, a prototype version of an e-learning portal has been developed to capture the usage data of learners through log files and a database. The initial level of analysis, with respect to the time spent on different types of files and the number of times files are accessed, has been done.
Our method of capturing the learning styles comprises not only the IIS log files but also database entries where important factors of learning styles are captured. The usage data is useful for identifying the learning styles of the learners and classifying them according to the four FSLSM dimensions, namely Processing, Perception, Input and Understanding, and the eight categories of these dimensions, namely Active/Reflective, Sensing/Intuitive, Visual/Verbal and Sequential/Global.
References
1. Popescu, E., Trigano, P., Badica, C.: Relations between learning style and learner behaviour
in as educational hypermedia system: An exploratory study. In: Proceedings of 8th IEEE
International Conference on Advanced Learning Technologies, ICALT 2008 (2008)
2. Graf, S., Kinshuk, Liu, T.-C.: Identifying learning styles in learning management systems by using indications from students' behaviour. In: Proceedings of the 8th IEEE International Conference on Advanced Learning Technologies, ICALT 2008 (2008)
3. Dung, P.Q. and Florea, A.M.: A literature-based method to automatically detect learning
styles in learning management systems. In: Proceedings of the 2nd International Conference
on Web Intelligence, Mining and Semantics (2012)
4. Salehi, M., Kmalabadi, I.N.: A hybrid attribute-based recommender system for e-learning
material recommendation. In: International Conference on Future Computer Supported Ed-
ucation (2012)
5. Popescu, E., Badina, C., Moraret, L.: Accomodating learning styles in an adaptive educa-
tional system. Informatica 34 (2010)
6. Liu, H., Keselj, V.: Combined mining of web server logs and web contents for classifying
user navigation patterns and predicting user’s future requests. In: Data and Knowledge Engi-
neering, Elsevier (2007)
7. Simić, V., Vojinović, O., Milentijević, I.: E-learning: Let's look around. In: Scientific Publications of the State University of Novi Pazar, Series A: Applied Mathematics, Informatics and Mechanics (2011)
8. Felder, R.M., Silverman, L.K.: Learning and teaching styles in engineering education. Engineering Education 78(7), 674–681 (1988)
9. Felder, R.M., Spurlin, J.: Applications, reliability and validity of the index of learning styles. International Journal of Engineering Education 21(1), 103–112 (2005)
10. Son, W.M., Kwek, S.W., Liu, F.: Implicit detection of learning styles. In: Proceedings of the
9th International CDIO Conference (2013)
11. Abraham, G., Balasubramanian, V., Saravanaguru, R.K.: Adaptive e-learning environment
using learning style recognition. International Journal of Evaluation and Research in Educa-
tion, IJERE (2013)
A Concept Map Approach to Supporting Diagnostic
and Remedial Learning Activities
1 Introduction
In recent years there has been tremendous advancement in the field of computer networks and communication technology, which has led to a lot of progress in the field of e-learning. One particular area of e-learning that has attracted many researchers is the ITS. An ITS is a complete system that aims to provide immediate and customized instruction and feedback to learners without the intervention of a human tutor [12]. A significant contribution in this area is the work of Johnson [10], who proposed an authoring environment for building ITSs for technical training for the IBM-AT class of computers. Vasandani [11] built an ITS to organize system knowledge and operational information to enhance operator performance. There has also been a lot of work on ITSs in the Mobile Learning (M-Learning) environment. In 2005, Virvou et al. [7] implemented a mobile authoring tool which they called Mobile Author; once a tutoring system is created, it can be used by the learners to access learning objects as well as tests. Around the same time Kazi [8] proposed VocaTest, an ITS for vocabulary learning using the M-Learning approach. However, as identified by Lee et al. [6], evaluations conducted via online tests in an ITS do not provide a complete picture of a student's learning, as they show
only test scores; they do not help the learner identify the exact areas where he is deficient. Thus, in this work we propose an Intelligent Diagnostic and Remedial Learning System (IDRLS) which helps the learner identify the concepts he is deficient in and the related concepts he should revise. Notable examples in this area are the work of Hwang [3], where a fuzzy output is used to provide learning guidance, and [6], where the Apriori algorithm is used to generate concept maps from which learning guidance is generated.
The theoretical framework for this study refers to the theory of meaningful learning proposed by David Ausubel [2] in 1963. If a person attempts to learn by creating a connection to something he already knows, then he experiences meaningful learning. On the other hand, if a person attempts to learn by memorizing some information, he attempts rote learning. Thus, in meaningful learning, he is able to relate the new knowledge to the relevant concepts he already knows. Ausubel therefore advocates the concept of reception learning, where new concepts are obtained by adding questions and clarifications to old concepts. For this, he advocates two methods: signaling, which indicates important information, and advanced organizers, which indicate the sequence between these. These psychological foundations led to the development of concept maps at Cornell University by Joseph D. Novak in 1972 [1]. In brief, let C1 and C2 be two concepts; if learning concept C1 is a prerequisite to learning concept C2, then the relationship C1 → C2 exists [4]. Summing up, we construct a mapping between Ausubel's learning elements [7], [9] and the modules that may be used for the implementation of IDRLS. This mapping is displayed in Table 1.
Table 1. Relationship between Ausubel’s Learning elements and our proposed implementation
The organization of the paper is as follows. The next section discusses the architecture of the proposed IDRLS in detail. We then implement this architecture in an M-Learning environment using the Android Emulator [5]. Finally, we conclude by discussing a set of experiments which validate that diagnostic and remedial learning has indeed been useful to the students.
2 System Overview
Fig. 1 shows the architecture of the learning system. The learning system resides on a web server and uses a MySQL database, which supports activities like storing learning objects, examination quizzes, test questions, etc. The learning system uses the Oracle database management system for storing the student model, as shown in Fig. 2. The learning system may be accessed from a computer or from a mobile/handheld device, which could be an Android based device.
As indicated in Table 1, the architecture consists of three components: (i) a hash based algorithm module to generate concept maps, (ii) a module to store and access the learning objects and all data relating to learning, and (iii) a module to conduct tests and store marks.
(Fig. 1: the student learning interface (PC, Mobile/PDA) connects through the Internet to the inference engine and a knowledge base containing the study materials, learning portfolio, test results, student general data and student reading behavior, stored in an Oracle database.)
For the first module, the inputs required are the Test Item Relation Table (TIRT) and the Answer Sheet Summary Table (ASST), as indicated by [4]. The TIRT stores the degree of association between test item Qi and concept Cj, and the ASST stores the students' answers for a test. The Direct Hashing and Pruning (DHP) algorithm may be applied to these data sets to generate a set of association rules between the concepts along with their weights; after removing the redundant association rules, the final concept map is generated. The ASST and TIRT are stored in the knowledge base as shown in Fig. 1. The inference engine generates the concept map and stores it in the knowledge base; together these are called the learning portfolio. The main functionality of the second module is to store the study materials corresponding to each concept. It also stores the student reading behavior, which gives an indication of the documents accessed corresponding to each concept and the period of study, as well as the general student data. The latter contains basic information relating to the students, such as the enrollment number, their previous academic record, the marks secured in the pretest, etc. Classification algorithms may be applied to these data to predict the subset of students who may fail in the post test [13]; IDRLS is intended specifically for them. The aim of the third module is to conduct tests and store marks. Thus a question bank is necessary, which stores the questions, answer choices, degree of difficulty, priority of selection and solution of each question. The test results store the enrollment number along with the concept name and the marks secured in the examination of that concept. The modules along with the data stores are shown in the schematic diagram in Fig. 2.
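The following hedged sketch illustrates the kind of input the first module works on: from a TIRT (question-to-concept weights) and an ASST (here encoded as per-question incorrectness per student, an assumed convention), it derives per-concept error rates, which is the raw material on which a DHP-style rule miner would operate. This is not the DHP algorithm itself, and all names are illustrative.

# TIRT: degree of association between test item Qi and concept Cj (0..1).
tirt = {
    'Q1': {'C1': 0.6, 'C2': 0.4},
    'Q2': {'C2': 1.0},
    'Q3': {'C3': 0.7, 'C4': 0.3},
}
# ASST (assumed encoding): 1 if the student answered the question incorrectly, 0 otherwise.
asst = {'S1': {'Q1': 1, 'Q2': 0, 'Q3': 1},
        'S2': {'Q1': 0, 'Q2': 1, 'Q3': 1}}

def concept_error_rates(tirt, asst):
    rates = {}
    for student, answers in asst.items():
        weighted_err, weight = {}, {}
        for q, wrong in answers.items():
            for concept, w in tirt[q].items():
                weighted_err[concept] = weighted_err.get(concept, 0.0) + w * wrong
                weight[concept] = weight.get(concept, 0.0) + w
        rates[student] = {c: weighted_err[c] / weight[c] for c in weight}
    return rates

print(concept_error_rates(tirt, asst)['S1'])
# {'C1': 1.0, 'C2': 0.2857..., 'C3': 1.0, 'C4': 1.0}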
The concept names may be simplified as shown in Table 2; due to paucity of space we show only some of the records. The relationships between the concepts are shown in Table 3.
(Concept map over the course concepts: Branches, Loops, Arrays, Functions, Exception Handling, Inheritance, File Handling.)
We next store the learner reading behavior. For simplicity, we assume that each concept can be studied from two documents; the learner is deemed to have studied a concept if he has accessed both documents. The date of access and the duration of study of each document are also stored (Table 4).
From Table 4 and Table 5 the student learning model is constructed (Table 6). It stores the average marks and the number of documents accessed corresponding to each concept. A student is deemed to have learnt a concept very well if he has secured an average mark of more than 75% in that concept and accessed all the documents. Similarly, a student is deemed to have learnt a concept moderately well if he has secured an average mark of more than 60% in that concept and accessed all the documents. Finally, a student is deemed to have learnt a concept poorly if he has secured an average mark of less than 60% in that concept; these students need remedial learning for these concepts.
As an example, suppose that a learner has performed poorly in concept C4. From Table 2 and Table 3 it can be immediately deduced that the prerequisites of this concept are concepts C2 and C3, so the learner is sent an SMS advising him to revise these concepts. A typical table storing these diagnostic SMS messages is shown in Table 7; a small sketch of this rule is given below.
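A minimal Python sketch of this diagnostic rule, assuming the per-concept average marks, document counts and the prerequisite relation are available (hypothetical data; the thresholds follow the 75%/60% rule in the text):

# Classify how well each concept was learnt and suggest prerequisite revision
# for poorly learnt concepts, mirroring Tables 6 and 7.
prerequisites = {'C4': ['C2', 'C3'], 'C9': ['C4']}   # e.g. C2, C3 -> C4
TOTAL_DOCS = 2

def learning_level(avg_mark, docs_accessed):
    if avg_mark > 75 and docs_accessed == TOTAL_DOCS:
        return 'very well'
    if avg_mark > 60 and docs_accessed == TOTAL_DOCS:
        return 'moderately well'
    return 'poor'

def diagnostic_sms(student_model):
    messages = []
    for concept, (avg, docs) in student_model.items():
        if learning_level(avg, docs) == 'poor':
            revise = prerequisites.get(concept, [])
            messages.append(f'Revise {", ".join(revise)} before re-attempting {concept}'
                            if revise else f'Re-study {concept}')
    return messages

model = {'C2': (80, 2), 'C3': (65, 2), 'C4': (45, 1)}   # concept -> (avg %, docs read)
print(diagnostic_sms(model))   # ['Revise C2, C3 before re-attempting C4']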
The proposed work has been implemented in a mobile environment using the Android Emulator. At the start the learner logs into the system using his email id and password.
Fig. 4. A typical test question for the concept ‘Classes and Objects’
The student then learns the concepts in the predefined sequence. After learning the concepts document wise, the students appear in a test. The tests are of multiple choice type; a test question corresponding to module C9 is shown in Fig. 4. These tests are evaluated and the corresponding diagnostic message is sent by SMS to the students.
4 Experiments
An experiment and a survey were conducted to evaluate the effects IDRLS had on the learners. The course 'Introduction to Java Programming' was offered to 60 undergraduate students majoring in computer science under the University of Calcutta. The students first studied the course in a conventional manner, and a pretest was conducted at the end of this learning process. It was found that 18 students failed to pass this exam; we call them weak students. These students then used IDRLS for remedial learning. At the end of this study, a post test was conducted on both clusters of students (strong and weak) to evaluate the effect IDRLS had on the weak students. A survey was also conducted to determine the impact of the remediation mechanism offered by IDRLS on these students. The entire scheme is shown in Fig. 5.
For evaluating the impact IDRLS had on the weak students, a paired sample t-test was conducted on both clusters after the pretest and the post test. The findings are shown in Table 8 below.
The pass mark is taken as 50%. It is seen that after the pretest the difference in
marks between the two clusters is statistically significant, whereas after the post test this
difference is not statistically significant. This indicates that IDRLS has been
successful for remedial learning.
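A minimal sketch of such a comparison using SciPy is given below; the mark lists are hypothetical, and an unpaired two-sample test is used here as a stand-in for the paired test reported in the paper.

from scipy import stats

# Hypothetical marks (percentages) for the two clusters.
strong_pre,  weak_pre  = [72, 65, 80, 58, 77, 69], [31, 42, 38, 45, 29, 40]
strong_post, weak_post = [75, 68, 82, 61, 79, 71], [66, 70, 58, 73, 55, 69]

# Compare the clusters after the pretest and after the post test.
t_pre,  p_pre  = stats.ttest_ind(strong_pre,  weak_pre)
t_post, p_post = stats.ttest_ind(strong_post, weak_post)

print("pretest:  t = %.2f, p = %.4f" % (t_pre, p_pre))    # gap expected to be significant
print("posttest: t = %.2f, p = %.4f" % (t_post, p_post))  # gap expected to be non-significant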
Our next objective was to conduct a survey to find out the usefulness of IDRLS for
the failed students. Students were given a set of questions and asked to give their
feedback using a 5-point Likert scale where 1 denotes strongly disagree, 2 denotes
disagree, 3 denotes neutral, 4 denotes agree and 5 denotes strongly agree. The survey
questions and their results are shown in Table 9. These results show that IDRLS
has had an impact on the failed students.
5 Conclusion
In this work we propose an intelligent system which can identify the concepts the
students are weak in and suggest a remedial lesson plan for them. The developed
system consists of three modules which are derived from Ausubel’s theory of
meaningful learning. A simplified version of this has been implemented in an M-Learning
environment using the Android Emulator. The students who used this system provided
good feedback on it. Thus the major research objective of this work has been twofold:
firstly, to develop a diagnostic and remedial system based on the psychological
foundations of learning and implement it on a certain platform and, more significantly,
to verify and validate that this form of remediation has indeed been useful for the
students. We plan to extend the M-Learning architecture so that this mechanism may
be accessed from PCs as well.
References
1. Novak, J.D., Alberto, C.J.: Theoretical Origins of Concept Maps, How to Construct Them
and Their Use in Education. Reflecting Education 3(1), 29–42 (2007)
2. Pendidican, F.: Learning Theories. Matematika dan Ilmu Alam, Universitas Pendidikan
Indonesia
3. Hwang, G.: A conceptual map model for developing Intelligent Tutoring Systems.
Computers and Education, 217–235 (2003)
4. Hwang, G.: A computer assisted approach to diagnosing student learning problems in
science courses. Journal of Information Science and Engineering, 229–248 (2007)
5. Pocatilu, P.: Developing mobile learning applications for Android using web services.
Informatica Economica 14(3) (2010)
6. Lee, C., Lee, G., Leu, Y.: Application of automatically constructed concept maps of
learning to conceptual diagnosis of e-learning. Expert Systems with Application 36(2),
1675–1684 (2009)
7. Virvou, M., Alepis, E.: Mobile Education features in authoring tools for personalized
tutoring. Computers and Education, 53–68 (2005)
8. Kazi, S.: Voca Test: An intelligent Tutoring System for vocabulary learning using M-
Learning Approach (2006)
9. Thompson, T.: The learning theories of David P Ausubel. The importance of meaningful
and reception learning
10. Johnson, W.B., Neste, L.O., Duncan, P.C.: An authoring environment for intelligent
tutoring systems. In: IEEE International Conference on Systems, Man and Cybernetics, pp.
761–765 (1989)
11. Vasandani, V., Govindaraj, T., Mitchell, C.M.: An intelligent tutor for diagnostic problem
solving in complex dynamic systems. In: IEEE International Conference on Systems, Man
and Cybernetics, pp. 772–777 (1989)
12. Psotka, J., Mutter, S.A.: Intelligent Tutoring Systems: Lessons Learned. Lawrence
Erlbaum Associates (1998)
13. Sai, C.: Data Mining Techniques for Identifying students at risk of failing a computer
proficiency test required for graduation. Australasian Journal of Educational
Technology 27(3), 481–498 (2012)
Data Prediction Based on User Preference
1 Introduction
Collaborative filtering (CF) [5] uses the opinions of other users who are similar to an
active user as a filter. Collaborative filtering complements content-based
filtering; its main aim is to learn models of user interests, preferences and activities
from community data, that is, a database of available user preferences. User input
or interaction beyond the profile created from previous interactions and annotations is
not mandatory. Until now, the dominant model for performing collaborative filtering
in recommendation systems (RS) [6] has been based on memory-based techniques or
nearest-neighbor regression. All first-generation RS use the same fundamental
approach: first identify users that are similar to some active user for whom recom-
mendations have to be computed, and then compute predictions and recommendations
based on the judgments and predilections of these like-minded or similar users.
Content-based filtering (CBF) [7], [12] and retrieval build on the essential postulation
that users are able to create queries that express their information needs or interests
in terms of the essential characteristics of the items needed. It is difficult to iden-
tify appropriate descriptors such as themes, keywords, genres, etc., which can be used
to define interests. In some cases, for example e-commerce, users may be at best only
partially aware of their interests. In both cases, it would be desirable to recommend
items and predict user likings without requiring the users to explicitly formulate a
query.
The technique proposed is a generalization of a statistical technique known as
probabilistic Latent Semantic Analysis (PLSA) [10] that was originally investigated
for information retrieval [11]. PLSA has some similarity with clustering methods in
that latent variables for user communities are introduced, yet the communities may
overlap and users are not partitioned. It is closely related to matrix decomposition
and dimension reduction techniques such as Singular Value Decomposition
(SVD), which have been used in the context of recommendation systems [12],
[13]. The main difference with regard to dependency networks and Bayesian networks
is the fact that the latter learn structures directly on the observables, while PLSA is
based on a latent-class model that introduces the notion of user communities or
groups of items. The advantage over SVD- and PCA-based dimension reduction me-
thods is that this approach can build on statistical techniques for inference, a proba-
bilistic semantics and model selection. However, this approach shares with all the
above techniques the assumption that predictions are calculated from a “user-centric”
perspective.
The paper also describes the implementation of probabilistic latent semantic index-
ing (PLSI) [2] and its application to a collection of movie data. The aim of a recom-
mender system is to provide the user with recommendations that reflect the
user’s personal interests and to encourage the user to trust and explore the given
recommendations. Recommender systems that create personal recommendations at-
tain their goal by maintaining profiles for the users that capture their individual preferences.
The user profiles are used as filters; items that match the user’s preferences will slide
through and be presented as recommendations. Such systems produce individual
recommendations as output and effectively guide users in a personalized way to
interesting or suitable objects in a large space of alternatives.
After doing some preprocessing (rating out of five, user profile, etc.) over the items in the
data set, we create a user/item matrix [1], [4] which acts as a representation of the de-
mand for items and is considered to be the most appropriate form of input for most
recommender systems. The anatomy of a conventional recommender system is
depicted in Fig. 1.
The recommender system returns the list of items desired by the user against the
user interest based on a Similarity Function [8], which evaluates whether a particular
item is relevant or similar to the items liked by the user. If an item is found to be
relevant or highly similar with respect to the user’s items, it is added to the
final list of recommended items to be returned as results.
The kernel of PLSA is a statistical model designated as the aspect model. The aspect
model was originally proposed in the domain of language modeling, where it was
referred to as an aggregate Markov model. Taking an observation to be the rating of a
specific movie by a particular user, the aspect model associates a hidden or
latent class variable z ∈ Z = {z1, . . . , zk} (denoting the concepts) with each observa-
tion (u, m) [2], [3]. We also introduce the following probabilities, used in the generative
process below: P(u), P(z | u) and P(m | z).
Fig. 2. Graphical model representation of the aspect model in the asymmetric parameterization
Using the above-mentioned probabilities one can define a generative model for the
occurrence of an observed user and movie pair, as pictured in Fig. 2 and Fig. 3:
• Select a user u with probability P(u).
• Pick a latent class or concept z with probability P(z | u).
• Generate a movie m with probability P(m | z).
Following this generative model we obtain an observed pair (u, m), while the latent
class variable z is discarded. Translating this process into a joint probability model,
we obtain the following expression:
Fig. 3. Graphical model representation of the aspect model in the symmetric parameterization
where P(m | u) = Σz P(z | u) P(m | z), and hence

P(u, m) = Σz P(z) P(u | z) P(m | z)                                    (3)
The model parameters are estimated with the Expectation-Maximization (EM) algorithm:
• The E-Step or expectation step computes the posterior probability estimates of the la-
tent variables given the current best estimates of the involved parameters.
• The M-Step or maximization step updates the parameters on the basis of the expected
complete-data log-likelihood, which depends on the posterior probabilities
computed in the E-Step.
In PLSA, the posterior probability of the latent variable computed in the E-Step is
as follows:
P(z | u, m) = P(z) P(u | z) P(m | z) / Σz′ P(z′) P(u | z′) P(m | z′)                 (4)

The M-Step then re-estimates the parameters from the expected counts n(u, m) P(z | u, m),
where n(u, m) is the observed count (rating) for the user-movie pair; the update for
P(u | z) is analogous:

P(m | z) = Σu n(u, m) P(z | u, m) / Σm′ Σu n(u, m′) P(z | u, m′)                     (6)

P(z) = Σu Σm n(u, m) P(z | u, m) / Σu Σm n(u, m),   with u ∈ U, z ∈ Z, m ∈ M         (7)
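A minimal NumPy sketch of the EM iteration above is shown below, assuming a user-movie count matrix n of shape (number of users, number of movies); it is an illustrative rendering of Eqs. (3), (4), (6) and (7), not the authors' Java implementation.

import numpy as np

def plsa(n, k, iters=50, seed=0):
    # n: user-movie count matrix of shape (U, M); k: number of latent classes.
    rng = np.random.default_rng(seed)
    U, M = n.shape
    p_z = rng.random(k); p_z /= p_z.sum()                               # P(z)
    p_u_z = rng.random((k, U)); p_u_z /= p_u_z.sum(1, keepdims=True)    # P(u|z)
    p_m_z = rng.random((k, M)); p_m_z /= p_m_z.sum(1, keepdims=True)    # P(m|z)

    for _ in range(iters):
        # E-step: P(z|u,m) proportional to P(z) P(u|z) P(m|z)        (Eq. 4)
        post = p_z[:, None, None] * p_u_z[:, :, None] * p_m_z[:, None, :]
        post /= post.sum(axis=0, keepdims=True) + 1e-12

        # M-step: re-estimate the parameters from n(u,m) P(z|u,m)    (Eqs. 6, 7)
        weighted = n[None, :, :] * post
        p_u_z = weighted.sum(axis=2); p_u_z /= p_u_z.sum(1, keepdims=True) + 1e-12
        p_m_z = weighted.sum(axis=1); p_m_z /= p_m_z.sum(1, keepdims=True) + 1e-12
        p_z = weighted.sum(axis=(1, 2)); p_z /= p_z.sum()
    return p_z, p_u_z, p_m_z

# The smoothed joint model of Eq. (3): P(u,m) = sum_z P(z) P(u|z) P(m|z).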
4 Implementation Details
The paper aims to present an explicit comparative study of different successful ap-
proaches deployed in diverse recommender systems. To make this viable
there is a need for a suitable software framework in which these approaches can be
tested on various platforms, and it must be deployed in a way that supports experimenta-
tion. The approach discussed in the paper is implemented in Java on a standard PC
with a 2.5 GHz CPU and 2 GB RAM. The data set used in the implementation is taken from
MovieLens. Before proceeding further with the results, the next section gives a
very concise outline of the data set used for testing and experimentation.
4.1 Dataset
The data set employed for the experimentation is the MovieLens collection of movie
ratings. This data set consists of:
• Demographic information for the users (name, sex (M/F), age, occupation, zip code).
• 100,000 ratings (1–5) from 943 users on 1,682 movies.
• A user is considered an active user if he/she has rated at least 20 movies.
The data were collected through the MovieLens web site during the seven-month period
from September 19th, 1997 through April 22nd, 1998. The data have been
cleaned according to this requirement: users who did not complete their profile infor-
mation or had fewer than 20 ratings were not considered.
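As an illustrative sketch (the file name 'u.data' and its tab-separated layout of user id, item id, rating and timestamp follow the standard MovieLens 100K conventions and are assumed here), the user/item matrix and the active-user filter can be built as follows:

import numpy as np

ratings = np.loadtxt("u.data", dtype=int)      # columns: user id, item id, rating, timestamp
n_users, n_items = 943, 1682
R = np.zeros((n_users, n_items))
for user, item, rating, _ts in ratings:
    R[user - 1, item - 1] = rating             # ids in the file are 1-based

active = (R > 0).sum(axis=1) >= 20             # active user: at least 20 ratings
print(int(active.sum()), "active users out of", n_users)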
The decomposition of user ratings may lead to the discovery of interesting patterns
and regularities that describe user interests as well as disinterests. To that end, we
have to find a mapping of the quantitative PLSA model into a more qualitative descrip-
tion suitable for visualization. We propose to summarize and visualize each user
community, corresponding to one of the k possible states of the latent variable Z, in
the following way. We sort items within each community or interest group according
to their popularity within the community as measured by the probability P(y|z). The
most popular items are used to characterize a community and we anticipate these
items to be descriptive of the types of items that are relevant to the user community. It
can also be observed that the accuracy of the model is quite poor without the
user-specific rating scales, which are a crucial component of the proposed model. It is also quite
significant that this effect is obtained with an example involving a relatively small
number of communities. Table 1 displays the interest groups (IG)
extracted by a multinomial PLSA model with k = 15. We conclude from this that the
assumption of user-specific rating scales encodes useful prior knowledge.
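A short sketch of this characterization step, assuming a matrix p_m_z of P(y|z) values with one row per community and one column per item (for instance as estimated by the PLSA sketch earlier):

import numpy as np

def top_items(p_m_z, item_titles, z, top_n=10):
    # Sort the items of community z by their popularity P(y|z) and return the titles.
    order = np.argsort(p_m_z[z])[::-1][:top_n]
    return [item_titles[i] for i in order]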
Overall, we believe that the patterns and regularities extracted with PLSA models can
be helpful in understanding shared interests of users and correlations among ratings
for different items. The ability to automatically discover communities as part of the
collaborative filtering process is a trait which PLSA shares with only a few other
methods, such as clustering approaches, but which is absent in all memory-based
techniques.
6 Conclusion
References
1. Burke, R.: Hybrid recommender systems: Survey and experiments. User Modeling and
User-Adapted Interaction 12(4), 331–370 (2002)
2. Hofmann, T.: Probabilistic Latent Semantic Indexing. In: Annual ACM Conference on Re-
search & Development in Information Retrieval, pp. 50–57 (1999)
3. Landauer, K., Foltz, P.W., Laham, D.: An Introduction To Latent Semantic Analysis. Dis-
course Processes 25, 259–284 (1998)
4. Bilmes, J.A.: A Gentle Tutorial of the EM Algorithm and its Application to Parameter Es-
timation for Gaussian Mixture and Hidden Markov Models. International Computer
Science Institute (1998)
5. Adomavicius, G., Sankaranarayanan, R., Sen, S., Tuzhilin, A.: Incorporating Contextual
Information in Recommender Systems Using a Multidimensional Approach. ACM Trans-
actions in Information Systems 23(1) (2005)
6. Ansari, A., Essegaier, S., Kohli, R.: Internet Recommendations Systems. Journal of Mar-
keting Research, 363–375 (2000)
7. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley (1999)
8. Deshpande, M., Karypis, G.: Item-Based Top-N Recommendation Algorithms. ACM
Transactions on Information Systems 22(1), 143–177 (2004)
9. Billsus, D., Brunk, C.A., Evans, C., Gladish, B., Pazzani, M.: Adaptive Interfaces for
Ubiquitous Web Access. Communications of the ACM 45(5), 34–38 (2002)
10. Buhmann, M.D.: Approximation and Interpolation with Radial Functions. In: Dyn, N.,
Leviatan, D., Levin, D., Pinkus, A. (eds.) Multivariate Approximation and Applications,
Cambridge University Press (2001)
11. Burke, R.: Knowledge-Based Recommender Systems: Encyclopedia of Library and
Information Systems. In: Kent, A. (ed.), vol. 69 (supple. 32). Marcel Dekker (2000)
12. Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: A
survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge
and Data Engineering 17(6), 734–749 (2005)
13. Caglayan, A., Snorrason, M., Jacoby, J., Mazzu, J., Jones, R., Kumar, K.: Learn Sesame -
A Learning Agent Engine. Applied Artificial Intelligence 11, 393–412 (1997)
Automatic Resolution of Semantic Heterogeneity
in GIS: An Ontology Based Approach
1 Introduction
Geospatial information is the key for any decision support system in the ge-
ographic information system (GIS) domain [1]. The discovery of suitable data
sources by keyword based search in the catalogue becomes inaccurate due to the
existing heterogeneity. Heterogeneity in the database can be categorized as syn-
tactical heterogeneity, structural heterogeneity and semantic heterogeneity. The
structural and syntactical interoperability in the database can be ensured by
standardizing data model (metadata) or meta-information regarding the data
format and structure. Open Geospatial Consortium (OGC) has standardized
geographic markup language (GML) based application schema as the basis for
sharing and integrating spatial data at the service level. Although the specified stan-
dards ensure syntactic interoperability in the data integration process, they are
unable to address the semantic heterogeneity issue. In this regard, an ontology
based approach can be introduced to resolve semantic heterogeneity in GIS [2][3].
It is a key factor for enabling interoperability in the semantic web by allowing
the GIS applications to agree on the spatial concepts semantically when com-
municating with each other. Ontology can be regarded as “a specification of a
conceptualization. A conceptualization is an abstract, simplified view of the world
that we wish to represent for some purpose.”
Much research has been attempted in this field; however, there exists an
enormous scope for further work. The major hindrance in sharing data across
organizations in GIS applications is their own proprietary standards for
data formats. Some initial attempts were made to obtain GIS interoperability
which involve the manual conversion of geographic data from one proprietary
format into another. Otherwise, some centralized standard file format can be used
for this purpose. However, these formats can also lead to some information loss
[4]. Buccella et al. [5] have studied different geographic information exchange
formats for the integration of data sources with the help of ontologies for rep-
resenting the semantics. The same is done through a service oriented architecture in
[6]. Yue et al. [7] have addressed the issue of geospatial service chaining through
semantic web services for improving automation. Tools like OntoMorph,
PROMPT [8], etc. have also tried to automate the process of semantic query res-
olution. However, the component that automatically identifies the semantically
similar concepts in the database is still missing there. In this regard, the main
objectives of this work can be stated as follows:
– Building the spatial feature ontology with all possible features in the avail-
able data repositories.
– Three level semantic matching (database, relational, and attribute based)
between the user query concept and the concepts in the ontology.
This work focuses on the requirement of dynamic mapping between semanti-
cally similar concepts of the user query and the spatial databases. It facilitates
the semantic search using thesaurus based word relators like WordNet [9],
Nexus [10], etc. An algorithm has been proposed which iteratively searches for
similar concepts from any database in the pool of repositories by consult-
ing the ontology, thereby facilitating the semantic search. Three levels of checking
are introduced here: database, relation and attribute level. Semantic
matching is done at all three levels in succession until a proper match is
found. This approach is more efficient than traditional matching with spatial
features represented as relational schemas in the databases.
This paper is organized in four sections. Section 1 gives an overview of the
semantic heterogeneity related to GIS. For automated ontology based semantic
matching between spatial features in the data repositories and the query concept,
a suitable framework along with a methodology is presented in Section
2. The corresponding implementation procedure and results are shown in Section
3. Finally, the conclusion is drawn in Section 4.
Each repository publishes its own featureDescription through the Web Feature
Service (WFS), which ensures structural and syntactic interoperability (as GML is the stan-
dard), but semantic heterogeneity issues still exist. All these features are popu-
lated in the catalogue for suitable indexing and retrieval of data, which can
be extracted through the catalog service for the web (CSW). The semantic query reso-
lution engine consists of two components, a semantic concept matching engine
and a thesaurus based word relator, which provide spatial vocabularies and their
semantic associations. The semantic concept matching engine is where the proposed
algorithm resides. When a user requests to retrieve suitable metadata for
any geospatial feature, the request is processed via the semantic query resolution en-
gine. Here, the proposed algorithm consults the thesaurus based word relator
to retrieve all similar concepts from the databases. In this study, WordNet is
used, which can be interpreted as a lexical ontology. It then identifies the similar
concept in one or more application ontologies, each of which corresponds to a spe-
cific WFS of a particular data source. Once the semantically similar concept is
found, the actual data corresponding to the user query can be fetched. The
details of the semantic matching method are described in Algorithm 1. Some use-
ful terminologies related to this algorithm are synonym, hyponym, hypernym and
meronym.
First, direct matching of the query concept with the database based on
keyword searching is done. If any match is found, that concept is directly returned
as the output. Otherwise, Algorithm 1 is called with three arguments (x, y,
z), where x is the input concept, y is the corresponding level (e.g., database,
relation, attribute) of the database where the semantic matching is to
be carried out and z is the whole WordNet dictionary. In the semantic matching
process, the concept is matched with the synset and gloss of each y in WordNet.
The Jaccard coefficient is used to calculate the similarity between x and y. If no
match is found, a threshold value l is chosen. The hypernym or hyponym of y
Semantic_match(x, y, z)
{
  x = root(x);
  y = root(y);
  foreach y.synset ∈ z do
    if (x = y.synset) ∨ (Jaccard(x, y.semantic) > 0) then
      Match found;
      Return y;
    end
    else
      y1 = y.hypernym;
      y2 = y.hyponym;
      foreach i = 1 → l do
        if (x = y1.synset) ∨ (Jaccard(x, y1.synset) > 0) ∨
           (x = y2.synset) ∨ (Jaccard(x, y2.synset) > 0) then
          Match found;
          Return y;
          Break;
        end
        else
          y1 = y1.hypernym;
          y2 = y2.hyponym;
        end
      end
      if i = l + 1 then
        Continue;
      end
      else
        Break;
      end
    end
  end
  if all y.synset is processed & no match found then
    Return NULL;
  end
}
Algorithm 1. Semantic matching using word relator
up to level l is matched with x. If no match is found at this level either, the algorithm
tries to match x at the subsequent level; otherwise y's corresponding concept is
returned as the output. If the query concept does not match any
attribute of the concepts in the repository pool either, it is assumed that no
database matches the query concept and the algorithm returns the output as
NULL.
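A rough Python rendering of this matching step, using NLTK's WordNet interface and a simple token-set Jaccard coefficient, is sketched below; the helper names and the gloss-based Jaccard test are illustrative assumptions, not the authors' exact procedure.

from nltk.corpus import wordnet as wn

def jaccard(a, b):
    # Token-set Jaccard coefficient between two strings.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def semantic_match(x, y, level=2):
    # Return y if the query concept x matches the database concept y within
    # 'level' hops of hypernyms/hyponyms, else None (cf. Algorithm 1).
    for syn in wn.synsets(y):
        names = [n.replace('_', ' ').lower() for n in syn.lemma_names()]
        if x.lower() in names or jaccard(x, syn.definition()) > 0:
            return y
        ups, downs = syn.hypernyms(), syn.hyponyms()
        for _ in range(level):
            related = ups + downs
            if any(x.lower() in (n.replace('_', ' ').lower() for n in r.lemma_names())
                   for r in related):
                return y
            ups = [h for r in ups for h in r.hypernyms()]
            downs = [h for r in downs for h in r.hyponyms()]
    return None

print(semantic_match("highway", "road"))   # 'road' (highway is a hyponym of road)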
This section presents some implementation details for the evaluation of the
proposed algorithm. Protégé 3.3.1 [11] is used as the ontology editor. OntoLing
is a Protégé plug-in that allows for linguistic enrichment of the ontology concepts.
The ontology used for this case study is built from the spatial features related
to the Bankura district of West Bengal, India. It consists of spatial features
like road, land coverage, water-body, facility, etc. These features are considered
as the concepts in the ontology hierarchy and are divided into sub-concepts. The
hierarchy along with its instances is shown in Fig. 2.
The synonyms, hypernyms and homonyms of the query string are populated
in the rdf:label of each concept in the ontology. This matching process is the same
as the relation-level matching. rdf:comment represents the meaning of the
database concepts. The Jaccard similarity measure corresponds to the matching
with the attribute values. The ontology is updated with all synonyms, hypernyms,
homonyms and meronyms using the OntoLing plug-in. For this study, the threshold
value l is taken as 2; i.e., up to 2nd-level hypernyms and hyponyms of the query
string in the WordNet tree are considered for matching; however, this threshold
can be varied as per the application requirement. Table 1 shows some example
cases of semantic matching for this case study. Fig. 3 depicts the semantic
matching process in Protégé for the query string “Highway”. The proposed method
returns the spatial feature “Road” as the matched concept in the database, as
“Highway” is a hyponym of “Road”.
4 Conclusion
This paper proposes a semantic matching algorithm between user queries and
spatial databases using an ontology. The whole matching process is divided into
three levels: database, relation and attribute. It is a multi-strategy
method based on direct matching and semantic matching by linguistic similarity be-
tween the user query concepts and the database features. The matching is au-
tomated and does not need any manual intervention. Building an application
ontology from each database in the repository, followed by the construction of
a hybrid ontology, can be considered the future prospect of this work.
References
1. Paul, M., Ghosh, S.: A service-oriented approach for integrating heterogeneous spa-
tial data sources realization of a virtual geo-data repository. International Journal
of Cooperative Information Systems 17(01), 111–153 (2008)
2. Bhattacharjee, S., Prasad, R.R., Dwivedi, A., Dasgupta, A., Ghosh, S.K.: Ontology
based framework for semantic resolution of geospatial query. In: 2012 12th Inter-
national Conference on Intelligent Systems Design and Applications (ISDA), pp.
437–442. IEEE (2012)
3. Bhattacharjee, S., Mitra, P., Ghosh, S.: Spatial interpolation to predict missing
attributes in gis using semantic kriging. IEEE Transactions on Geoscience and
Remote Sensing 52(8), 4771–4780 (2014)
4. Fonseca, F.T., Egenhofer, M.J.: Ontology-driven geographic information systems.
In: Proceedings of the 7th ACM International Symposium on Advances in Geo-
graphic Information Systems, GIS 1999, pp. 14–19. ACM (1999)
5. Buccella, A., Cechich, A., Fillottrani, P.: Ontology-driven geographic information
integration: A survey of current approaches. Computers & Geosciences 35(4), 710–
723 (2009)
6. Alameh, N.: Service chaining of interoperable geographic information web services.
Internet Computing 7(1), 22–29 (2002)
Automatic Resolution of Semantic Heterogeneity in GIS 591
7. Yue, P., Di, L., Yang, W., Yu, G., Zhao, P., Gong, J.: Semantic web services-based
process planning for earth science applications. International Journal of Geograph-
ical Information Science 23(9), 1139–1163 (2009)
8. Klein, M.: Combining and relating ontologies: an analysis of problems and solu-
tions. In: IJCAI-2001 Workshop on Ontologies and Information Sharing, pp. 53–62
(2001)
9. Miller, G.A.: Wordnet: a lexical database for english. Communications of the
ACM 38(11), 39–41 (1995)
10. Jannink, J.F.: A word nexus for systematic interoperation of semantically hetero-
geneous data sources. PhD thesis, Stanford University (2001)
11. Gennari, J.H., Musen, M.A., Fergerson, R.W., Grosso, W.E., Crubézy, M., Eriks-
son, H., Noy, N.F., Tu, S.W.: The evolution of protégé: an environment for
knowledge-based systems development. International Journal of Human-Computer
Studies 58(1), 89–123 (2003)
Web-Page Indexing Based on the Prioritized
Ontology Terms
Abstract. In today's world, globalization has become a basic and most popular
human trend. To globalize information, people publish their documents
on the internet. As a result, the information volume of the internet has become
huge. To handle that huge volume of information, Web searchers use search en-
gines. The Web-page indexing mechanism of a search engine plays a big role in
retrieving Web search results quickly from the huge volume of Web re-
sources. Web researchers have introduced various types of Web-page indexing
mechanisms to retrieve Web-pages from a Web-page repository. In this paper, we
illustrate a new approach to the design and development of Web-page index-
ing. The proposed Web-page indexing mechanism has been applied to domain
specific Web-pages, and we identify the Web-page domain based on an
Ontology. In our approach, we first prioritize the Ontology terms that exist in
the Web-page content and then apply our own indexing mechanism to index that
Web-page. The main advantage of storing an index is to optimize the speed and
performance of finding relevant documents from the domain specific search
engine storage area for a user given search query.
1 Introduction
In recent years, the growth of the World Wide Web (WWW) has been rising at an
alarming rate and it contains a huge amount of multi-domain data [1]. As a result, there
is an explosion of information, and Web searchers use search engines to handle that
information. Various parameters are used by search engines to produce bet-
ter search engine performance; Web-page indexing is one of them. Nowadays, Web
researchers have already introduced some efficient Web-page indexing mechanisms
like back-of-the-book-style Web-page indexes, formally called “Web site A-Z index-
es”, “Human-produced Web-page index”, “Meta search Web-page indexing”, “Cache
based Web-page indexing”, etc. [2].
Definition 1.1: Dominating Ontology Term – The Ontology term which holds the maximum
Ontology term relevance value in the considered Web-page.
Definition 1.2: Sub-dominating Ontology Terms – The Ontology terms which hold the suc-
cessive maximum Ontology term relevance values, other than the dominating Ontology
term, in the considered Web-page.
Rule 1.1: Primary Attachment (P1, P2, …) – All the dominating Ontology terms for
all Web-pages are indexed with the primary attachment of their respective Ontology
term.
Rule 1.2: Secondary Attachment (S1, S2, …) – All the sub-dominating Ontology
terms for all Web-pages are indexed with the secondary attachment of their respective
Ontology term.
2 Related Works
The main advantage of storing an index is to optimize the speed and performance
of finding relevant documents from the search engine storage area for a user given
search criterion. In this section, we discuss the existing Web-page indexing
mechanisms and their drawbacks.
Definition 2.1: Ontology – It is a set of domain-related key information, which is kept
in an organized way based on its importance.
Definition 2.2: Relevance Value – It is a numeric value for each Web-page, which is
generated on the basis of the term weight values, term synonyms and the number of occur-
rences of the Ontology terms existing in that Web-page.
Definition 2.3: Seed URL – It is a set of base URLs from which the crawler starts to
crawl down Web-pages from the Internet.
Definition 2.4: Weight Table – This table has two columns: the first column denotes
the Ontology term and the second column denotes the weight value of that Ontology term. An On-
tology term weight value lies between ‘0’ and ‘1’.
Definition 2.5: Syntable – This table has two columns: the first column denotes the Ontology
term and the second column denotes the synonyms of that Ontology term. For a particular
Ontology term, if more than one synonym exists, they are kept using a comma (,) sepa-
rator.
Definition 2.6: Relevance Limit – It is a predefined static relevance cut-off value to
recognize whether a Web-page is domain specific or not.
Definition 2.7: Term Relevance Value – It is a numeric value for each Ontology
term, which is generated on the basis of the term weight value, term synonyms and the
number of occurrences of that Ontology term in the considered Web-page.
Back-of-the-book-style Web-page indexes are formally called “Web site A-Z indexes”.
Web site A-Z indexes have several advantages, but the language of search queries is full of
homographs and synonyms and not all the references found will be relevant. For ex-
ample, a computer-produced index of the 9/11 report showed many references to
George Bush, but did not distinguish between “George H. W. Bush” and “George W.
Bush” [5].
A human-produced index has someone check each and every part of the text to find
everything relevant to the search term, while a search engine leaves the responsibility
for finding the information with the enquirer. This increases the miss-to-hit ratio. This
approach is not suitable for the huge volume of Web data [6].
Metadata Web indexing involves assigning keywords or phrases to Web-pages or
websites within a meta-tag field, so that the Web-page or website can be retrieved by
a search engine that is customized to search the keywords field. This may involve
using keywords restricted to a controlled vocabulary list [7].
Cache based Web-page indexing produces search results quickly because the re-
sult information is stored in cache memory. On the other hand, when an irregular
search string is encountered, the search engine cannot produce a faster search result because
the information is not available in the cache memory. Irregular search strings always occur
because of the huge volume of internet information and users [8-9].
3 Proposed Approach
In our approach, we propose a new mechanism for indexing domain specific
Web-pages. Before going forward with the new indexing mechanism, we need to
make sure all the inputs are at hand. Those inputs are the domain specific
Web-page repository, the set of Ontology terms, the Weight table and the Syntable [10]. In one of
our earlier works, we created the domain specific Web-page repository [11]. We
have used that repository as an input to our proposed approach.
In this section, we discuss how to extract the dominating and sub-dominating Ontol-
ogy terms. We illustrate this using the example in Fig. 1.
Consider a ‘Mobile’ domain Web-page. First extract the Web-page content, then
apply Definitions 1.1 and 1.2. We found that the Ontology term ‘Mobile’ holds a term
relevance value of 45, which is the maximum, and according to Definition 1.1 the Ontology
term ‘Mobile’ becomes the dominating Ontology term. The Ontology terms ‘price’, ‘color’,
‘battery’ and ‘company’ hold term relevance values of 31, 27, 18 and 15 respectively,
which are greater than those of all other Ontology terms excluding the ‘Mobile’ Ontology term.
Now, according to Definition 1.2, the Ontology terms ‘price’, ‘color’, ‘battery’ and
‘company’ become sub-dominating Ontology term 1, sub-dominating Ontology term
2, sub-dominating Ontology term 3 and sub-dominating Ontology term 4 respectively.
If the number of sub-dominating Ontology terms increases, then the number of secondary attachments
needed to store them also increases proportionally (refer to Rule 1.2), which increases the indexing
memory size. For that reason, we have used four sub-dominating Ontology terms as the
threshold. In some rare cases we found multiple Ontology terms holding the same term
relevance value; in that case we prioritize the dominating and sub-dominating Ontology
terms according to their lower term weight value, i.e., we consider the higher
number of occurrences of that Ontology term in the considered Web-page content.
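A small sketch of this selection step is given below; the relevance values are taken from the 'Mobile' example above, while the weight values used as tie-breakers are hypothetical.

# Term relevance values from the 'Mobile' example; the weight values used as
# tie-breakers are hypothetical.
relevance = {"Mobile": 45, "price": 31, "color": 27, "battery": 18,
             "company": 15, "camera": 9}
weight = {"Mobile": 0.9, "price": 0.6, "color": 0.5, "battery": 0.5,
          "company": 0.4, "camera": 0.3}

# Sort by relevance; on ties prefer the lower weight value (i.e. the term whose
# number of occurrences must have been higher).
ranked = sorted(relevance, key=lambda t: (-relevance[t], weight[t]))
dominating, sub_dominating = ranked[0], ranked[1:5]
print(dominating, sub_dominating)   # Mobile ['price', 'color', 'battery', 'company']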
Web-page retrieval from the Web search engine resources plays an important role in a Web
search engine. We retrieve a resultant Web-page list from our data store based
on the user given dominating and sub-dominating Ontology terms, relevance range,
etc. According to our prototype, we give the user the flexibility of not typing a
search string but directly selecting the search tokens from drop-down lists. As a result, it
reduces the search string parsing time and the miss-to-hit ratio due to the user's inadequate do-
main knowledge. Our prototype uses the formula below to produce a resultant Web-page
list based on the user given relevance range:
(50% of ‘x’ from the primary attachment list of the dominating Ontology term +
20% of ‘x’ from the secondary attachment list of the first sub-dominating Ontology term +
15% of ‘x’ from the secondary attachment list of the second sub-dominating Ontology term +
10% of ‘x’ from the secondary attachment list of the third sub-dominating Ontology term +
5% of ‘x’ from the secondary attachment list of the fourth sub-dominating Ontology term),
where ‘x’ is a numeric value given by the user for the number of search results to be shown in
the result page.
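A minimal sketch of this composition rule, assuming the primary and secondary attachments are pre-sorted lists of Web-page identifiers:

def compose_results(x, primary, secondaries):
    # Shares for the dominating term and the four sub-dominating terms.
    shares = [0.50, 0.20, 0.15, 0.10, 0.05]
    lists = [primary] + list(secondaries)
    result = []
    for share, pages in zip(shares, lists):
        result.extend(pages[:round(share * x)])
    return result[:x]

# e.g. for x = 20 requested results: 10 pages come from the primary attachment
# and 4, 3, 2 and 1 from the four secondary attachments respectively.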
4 Experimental Analyses
In this section, we present an experimental study as well as discuss how to
set up our system. The performance of our system depends on various parameters, and
those parameters need to be set up before running our system. The considered parame-
ters are the domain relevance limit, weight value assignment, Ontology terms, domain
specific Web-page repository, etc. These parameters are assigned by tuning our sys-
tem through experiments. Section 4.1 derives our prototype's time complexity to pro-
duce the resultant Web-page list and Section 4.2 shows the experimental results of our
system.
We have considered ‘k’ Ontology terms. We keep them in sorted
order according to their weight value. While finding the primary attachment link of the
user given dominating Ontology term, our prototype requires at most O(log2 k) time using a
binary search mechanism (refer to Fig. 2). On the other hand, while finding the secondary
attachment links of the other four user given sub-dominating Ontology terms, our prototype
requires 4·O(log2 k) time. In the second level, our prototype reaches the Web-pages from the
primary and secondary attachments in constant time because no iteration is required.
Finally, our prototype's time complexity becomes [5·O(log2 k) + 5c] ≈ O(log2 k) to
retrieve the resultant Web-page list, where ‘c’ is the constant time
required to reach the Web-pages from the primary and secondary attachments.
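The lookup can be sketched as below; for the binary search to apply, the sketch assumes the term list is sorted by the search key (the term name) with a parallel list of attachment links.

import bisect

def find_attachment(sorted_terms, attachments, term):
    # sorted_terms: Ontology terms sorted by name; attachments: parallel list
    # of primary-attachment links (each pointing at its Web-page list).
    i = bisect.bisect_left(sorted_terms, term)
    if i < len(sorted_terms) and sorted_terms[i] == term:
        return attachments[i]    # constant-time hop from the attachment to the pages
    return None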
It is very difficult to compare our search results with those of existing search engines. In most
cases, existing search engines do not follow domain specific concepts. It is very
important that while comparing two systems both are on the same footing, i.e., they contain the
same resources, environment, system platform and search queries. In a few exist-
ing cases a search engine gives an advanced search option to the Web searchers,
but these do not match our domains. Nevertheless, we have produced some data to measure our
proposed prototype's performance. To produce the experimental results, we have com-
pared the performance of the two systems (before and after applying the Web-page indexing
mechanism). In Table 1, we give a performance report of our system. To
measure accuracy, we applied our set of search queries multiple times, as
shown in Table 2.
Number of        Avg. No. of        Avg. No. of            Total Number of Web-pages
Search Results   Relevant Results   Non-Relevant Results   in the Repository
10               8.7                1.3                    5000
20               17.2               2.8                    5000
30               26.4               3.6                    5000
40               34.6               5.4                    5000
50               43.6               6.4                    5000
5 Conclusions
In this paper, we have proposed a prototype of a domain specific Web search engine.
This prototype uses one dominating and four sub-dominating Ontology terms to
produce Web search results. All the Web-pages are indexed according to their domi-
nating and sub-dominating Ontology terms. According to our experimental results,
the Web-page indexing mechanism produces faster results for the user selected dominat-
ing and sub-dominating Ontology terms. According to our prototype, we give the user the
flexibility of not typing a search string but directly selecting the search tokens
from drop-down lists. As a result, it reduces the search string parsing time and the
miss-to-hit ratio due to the user's inadequate domain knowledge.
This prototype is highly scalable. Suppose we need to increase the number of do-
mains supported by our prototype; then we need to include the new domain Ontology and the other
details like the weight table, syntable, etc. of that Ontology. In a single domain there does
not exist a huge number of ontology terms. Hence, the number of indexes should be
smaller than for a general search engine. As a result, we can reach the Web-pages quickly
as well as reduce the index storage cost.
References
1. Willinger, W., Govindan, R., Jamin, S., Paxson, V., Shenker, S.: Scaling phenomena in the
Internet. Proceedings of the National Academy of Sciences, 2573–2580 (1999)
2. Diodato, V.: User preferences for features in back of book indexes. Journal of the Ameri-
can Society for Information Science 45(7), 529–536 (1994)
3. Spyns, P., Meersman, R., Jarrar, M.: Data modelling versus ontology engineering.
SIGMOD Record Special Issue 31(4), 12–17 (2002)
4. Spyns, P., Tang, Y., Meersman, R.: An ontology engineering methodology for DOGMA.
Journal of Applied Ontology 5 (2008)
5. Diodato, V., Gandt, G.: Back of book indexes and the characteristics of author and
non-author indexing: Report of an exploratory study. Journal of the American Society for
Information Science 42(5), 341–350 (1991)
6. Anderson, J.D.: Guidelines for Indexes and Related Information Retrieval Devices. NISO
Technical Report 2, NISO-TR02-1997 (1997)
7. Manoj, M., Elizabeth, J.: Information retrieval on Internet using meta-search engines: A
review. CSIR, 739–746 (2008)
8. Brodnik, A., Carlsson, S., Degermark, M., Pink, S.: Small forwarding tables for fast
routing lookups. In: Proceedings of ACM SIGCOMM 1997 (1997)
9. Chao, H.J.: Next Generation Routers. IEEE Proceeding 90(9), 1518–1558 (2002)
10. Gangemi, A., Navigli, R., Velardi, P.: The OntoWordNet Project: Extension and Axioma-
tization of Conceptual Relations in WordNet. In: Meersman, R., Schmidt, D.C. (eds.)
CoopIS 2003, DOA 2003, and ODBASE 2003. LNCS, vol. 2888, pp. 820–838. Springer,
Heidelberg (2003)
11. Mukhopadhyay, D., Biswas, A., Sinha, S.: A New Approach to Design Domain Specific
Ontology Based Web Crawler. In: 10th International Conference on Information Technol-
ogy, pp. 289–291 (2007)
A Hybrid Approach Using Ontology Similarity and
Fuzzy Logic for Semantic Question Answering
1 Introduction
The Educational Semantic web aims to discover knowledge in educational learning
areas such as personal learning, education administration and knowledge construction
[1]. The Semantic web (web 3.0) provides data integrity capabilities through not only machine
readability but also machine analysis. Education is improving through Semantic web
approaches: a large number of online students share data semantically, and student
portals help students stay connected everywhere to class updates. Electronic text-
books [2] provide open content from sources like OpenStax, ck12.org, crowd sourcing,
NCERT, etc. Massive open online courses (MOOCs), for example Coursera, Udacity, Khan,
edX and TED-Ed, as well as small virtual classes, are easily found on the Internet.
The web is naturally Fuzzy in nature, so processing text documents and building an Ontology
require a Fuzzy approach. To improve the education Semantic web, the first step is a Semantic
question answering system where the questions contain uncertain words. To implement such
a system, a Fuzzy Ontology approach can be utilized by using Fuzzy logic (Fuzzy
type-1, Fuzzy type-2) [3] levels for text retrieval. A Fuzzy scale is proposed for two
levels: first for the membership of a document (Fuzzy type-1) and second for the membership of a
word (words having uncertainty (synonyms) are treated as Fuzzy type-2). A Fuzzy Co-clustering
algorithm is used to simultaneously cluster documents and words and hence handle
2 Background
2.2 Ontology
2. Inter uncertainty: This is uncertainty that a group of people have about the word.
The proposed methodology uses Fuzzy concepts like linguistic variables and Fuzzy
type-2 for information retrieval. The Fuzzy type-2 model can deal with the uncertainty of
words. Fuzzy type-2 reduces to Fuzzy type-1 in the case where no uncertainty exists
in the scenario. Ontologies play an important role in information extraction.
An Ontology represents knowledge in a conceptual graph using a Semantic ap-
proach rather than a Syntactic approach, where each node represents either a document or a word.
Various ontologies are matched against a user query and the Ontology for the query is finally
retrieved using knowledge-based (shortest path), corpus-based (co-occurrence), information-content
and instance-probability measures. Ontology matching is then used as a solution
to the Semantic heterogeneity problem. Applying reasoning from an Ontology to text
data plays an important role in a question answering system.
Ontology Similarity:
An edge-count method can be used for calculating the similarity [9] between a keyword
question and a hierarchical ontology tree to obtain Semantic relations. For two similar
words the return value is 1, whereas for two completely dissimilar words it returns 0, as
represented by the equation:
St(t1, t2) = (e^(xd) − 1) / (e^(xd) + e^(yS) − 2)
where d = depth of the tree, S = shortest path length, x and y are smoothing factors, and
St(t1, t2) is the similarity value, ranging from 0 to 1.
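A hedged sketch of such an edge-count similarity is given below, taking S and d from NLTK's WordNet and plugging them into the formula above; interpreting d as the depth of the terms' lowest common subsumer, taking the first synset of each word, and the values of the smoothing factors x and y are all assumptions.

import math
from nltk.corpus import wordnet as wn

def st(t1, t2, x=0.2, y=0.6):
    # Similarity St(t1, t2): d = depth of the lowest common subsumer in the
    # WordNet tree (assumption), S = shortest path length between the senses.
    s1, s2 = wn.synsets(t1), wn.synsets(t2)
    if not s1 or not s2:
        return 0.0
    a, b = s1[0], s2[0]                       # assumption: take the first sense
    S = a.shortest_path_distance(b)
    if S is None:
        return 0.0
    lcs = a.lowest_common_hypernyms(b)
    d = lcs[0].max_depth() if lcs else 0
    if S == 0 and d == 0:
        return 1.0
    return (math.exp(x * d) - 1) / (math.exp(x * d) + math.exp(y * S) - 2)

print(st("highway", "road"))                  # close to 1 for closely related terms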
The Protégé OWL plug-in [10] represents a major change in describing the information of vari-
ous ontologies by adding new facilities. OWL ontologies can be categorized as OWL Lite,
OWL Full and OWL DL [10]. OWL DL can be considered an extension of OWL
Lite; similarly, OWL Full is an extension of OWL DL. The Semantic web uses tools such as RDB2Onto,
DB2OWL and D2RQ to map between ontologies and databases. Ontologies
do not only represent lexical knowledge, but also complex world knowledge about events.
Ontologies can be created with Protégé tools and software; after that, one can use the Protégé Java
API or translate the Ontology into a rule base using FuXi [11].
3 Methodology Description
The user enters a source string as a question. The first objective of the machine is to
syntactically analyze the text from the source. Only after that can lexical analysis
be done for each term in the question; the terms are then tokenized by removing the stop words
present in the user's question. The next step is linguistic preprocessing; POS (part of
speech) tags are assigned in such a way that syntactic analysis can be done easily, as shown
in the flow diagram in Fig. 1. In POS tagging, a tree is created to differentiate between the ques-
tion terms and their labels; each term is labeled as a noun, a verb or an adjective. The structural
sequence is identified by POS. Then the question can be interpreted for its Semantic mean-
ing. The WordNet tool shows all available synonyms of the words which are
nouns and verbs. This tool represents knowledge which is also useful for creating a
lexical Ontology for the domain knowledge. A word can be processed Semantically by
the WordNet tools. Groups of words describing the same intension are called synsets.
The edge-count method is used to match the question for similarity with the existing Ontol-
ogy. Fuzzy Co-clustering is used to present the collection of answers, and the Fuzzy scale (Fuzzy
type-1 for documents and Fuzzy type-2 for words) is used to score the collection ob-
tained by Fuzzy Co-clustering. The final result is a matrix where the x-axis represents
“Ontology Similarity” and the y-axis represents “keywords”.
Our proposed Algorithm is as follows:
– Input text in search engine (Question).
– Parse the question for structural analysis.
– Remove stop words for keyword extraction.
– Use WordNet tool to get synonyms of a word in the keyword. Generate all possible
combinations of synonyms.
– Retrieval is based on the Semantic Ontology Similarity (edge-count method) match
for the question, where the question is matched with the answer on the basis of the existing
Ontologies.
– Result is obtained from the matrix where the x-axis represents “Ontology similar-
ity” and y-axis represents “keywords”.
– Use Fuzzy Co-cluster to retrieve answers by using Semantic Ontology Similarity.
– Retrieve the final answer from the matrix by prioritizing the answers obtained by Fuzzy
Co-clustering using the Fuzzy scale.
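A minimal sketch of the first steps of the algorithm above (parsing, stop-word removal, POS tagging and WordNet synonym expansion) using NLTK is shown below; the exact tool chain used by the authors is not specified, so this is illustrative only.

import nltk
from nltk.corpus import stopwords, wordnet as wn

def preprocess(question):
    # Requires the NLTK data packages: punkt, averaged_perceptron_tagger,
    # stopwords and wordnet.
    sw = set(stopwords.words('english'))
    tokens = nltk.word_tokenize(question)
    keywords = [t for t in tokens if t.isalpha() and t.lower() not in sw]
    tagged = nltk.pos_tag(keywords)          # noun / verb / adjective labels
    synonyms = {w: sorted({l.replace('_', ' ')
                           for s in wn.synsets(w) for l in s.lemma_names()})
                for w, _tag in tagged}
    return tagged, synonyms

tagged, syns = preprocess("Why is this tea so sweet?")
print(tagged)
print(syns.get("sweet", [])[:5])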
Fuzzy Co-clustering manages data and features in two or more clusters at the same
point of time. Here it can be observed that the overlapping structure of web documents is repre-
sented in the clusters with a degree of belongingness for each web document. The reasons
for choosing Fuzzy Co-clustering in our case are:
a. Fuzzy Co-clustering is a technique to manage cluster data (documents) and features
(words) [12] in two or more clusters at the same point of time. Here bi-clustering
(Co-clustering) has the ability to capture the overlap between web documents and the words
mentioned in the documents. The degree of belongingness for each document and word
is given by the Co-clustering.
b. Fuzzy Co-clustering has the following advantages over traditional cluster-
ing:
1. Dimensionality reduction, as the features are stored in overlapping form for the vari-
ous clusters.
2. Fuzzy Co-clustering provides efficient results in situations which are vague and
uncertain.
3. Interpretability of document clusters becomes easy.
4. Improvement in accuracy due to the local model of clustering.
5. Fuzzy membership functions improve the representation of overlapping clusters in an-
swers by using Semantic Ontology Similarity.
c. Fuzzy type-2 deals with 3-D (three-dimensional) data, while the FCC STF [12] algorithm
has the ability to deal with the problems of the curse of dimensionality and outliers.
d. The Fuzzy Co-clustering concept is used in algorithms like FCCM, Fuzzy CoDoK and
FCC STF, as described in Table 2. FCC STF is found to be the best in comparison to
FCCM and Fuzzy CoDoK with its new single-term fuzzifier approach. FCC STF is a
solution to the curse of dimensionality and outliers.
Calculating Score:
Score = (membership of the document (A) + membership of the word (Ã)) / number of doc-
uments (N) = (A + Ã) / N. The upper membership function for the word is µ = 0.69 and the lower mem-
bership function for the word is µ = 0.61.
Fuzzy type-2 is used for the computation of the word as it has the ability to deal with
linguistic uncertainty, whereas Fuzzy type-1 has a crisp membership, e.g. for a doc-
ument (µ = 0.7). Fuzzy type-2 has a Fuzzy membership for synonymous words (µ =
0.61 − 0.69); it can be called a Fuzzy-Fuzzy set. Here the computation of the word is
applied to find the appropriate synonym for each question. An exact synonym helps in ob-
taining the meaning of the question. So, to retrieve an appropriate answer, Semantic analysis
of each query term along with its synonyms is a must.
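A small sketch of this scoring rule with the values quoted in the text; reducing the type-2 interval to its midpoint and taking N = 2 documents are assumptions made only for the example.

def score(doc_membership, word_membership_interval, n_docs):
    # Type-1 (crisp) document membership plus a reduced type-2 word membership,
    # averaged over the number of documents N (Score = (A + A~)/N).
    lower, upper = word_membership_interval
    word_membership = (lower + upper) / 2      # interval reduced to its midpoint
    return (doc_membership + word_membership) / n_docs

print(score(0.7, (0.61, 0.69), n_docs=2))      # (0.7 + 0.65) / 2 = 0.675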
“Sweet” is a vague term which we use every day in common language. The term sweet
depends on perception-based assessment. The same word
“sweet” has different meanings. When a user types the term “sweet” in the search engine
as a question, the term is treated as a vague term. Uncertainty arises in associating
the word “sweet” particularly with sugar, because the term
“sweet” can also be used to describe behaviours like kind, melodious or musical, and not only
sugar. In Fig. 3, various memberships of the word “sweet” are described. Let us
consider the following statements where the term “sweet” needs to be checked for a
similar context with respect to its meaning, for which Fuzzy linguistic rules can be
applied. Then, according to the context of the word, the membership of the word can be applied.
For example:
– “Sarah is such a sweet little girl. She's always looking after her brother.” – kindly
(µ = 0.63).
– “This tea is too sweet for me to drink, how much sugar is in it?” – sugary (µ = 0.66).
References
1. Ohler, J.: The semantic web in education. Educause Quarterly 31(4), 7–9 (2008)
2. Agrawal, R.: Computational education: The next frontier for digital libraries (2013)
3. Mendel, J.M.: Type-2 fuzzy sets and systems: an overview. IEEE Computational Intelligence
Magazine 2(1), 20–29 (2007)
4. Gallova, S.: Fuzzy ontology and information access on the web. IAENG International Journal
of Computer Science 34(2) (2007)
5. Kwok, C., Etzioni, O., Weld, D.S.: Scaling question answering to the web. ACM Transactions
on Information Systems 19(3), 242–262 (2001)
6. Guo, Q., Zhang, M.: Question answering system based on ontology and semantic web. In:
Wang, G., Li, T., Grzymala-Busse, J.W., Miao, D., Skowron, A., Yao, Y. (eds.) RSKT 2008.
LNCS (LNAI), vol. 5009, pp. 652–659. Springer, Heidelberg (2008)
7. Kaladevi, A.C., Kangaiammal, A., Padmavathy, S., Theetchenya, S.: Ontology extraction
for e-learning: A fuzzy based approach. In: 2013 International Conference on Computer
Communication and Informatics (ICCCI), pp. 1–6 (2013)
8. Mendel, J.: Fuzzy sets for words: why type-2 fuzzy sets should be used and how they can be
used. Presented as a two-hour tutorial at IEEE FUZZ, Budapest, Hungary (2004)
9. Benamara, F., Saint-Dizier, P.: Advanced relaxation for cooperative question answering. In:
New Directions in Question Answering. MIT Press, Massachusetts (2004)
10. Bobilloa, F., Stracciab, U.: Aggregation operators for fuzzy ontologies. Applied Soft Com-
puting 13(9), 3816–3830 (2013)
11. Lord, P.: The semantic web takes wing: Programming ontologies with tawny-owl. arXiv
preprint arXiv:1303.0213 (2013)
12. Rani, M., Kumar, S., Yadav, V.K.: Optimize space search using fcc stf algorithm in fuzzy co-
clustering through search engine. International Journal of Advanced Research in Computer
Engineering & Technology 1, 123–127 (2012)
Ontology Based Object-Attribute-Value
Information Extraction from Web Pages
in Search Engine Result Retrieval
Abstract. In this era, search engines act as a vital tool for users
to retrieve the necessary information in web searches. The retrieval of web
page results is based on the page ranking algorithms working in the search
engines. It also uses statistics based search techniques or content
based information extraction from each web page. But from the analysis
of the web retrieval results of Google-like search engines, it is still difficult
for the user to understand the inner details of each retrieved web page's
contents unless the user opens it separately to view the web
content. This key point motivated us to propose and display an ontology
based O-A-V (Object-Attribute-Value) information extraction for each
retrieved web page, which will impart knowledge to the user to take the
correct decision. The proposed system parses the user's natural language
sentence, given as a search key, into O-A-V triplets and converts it into a
semantically analyzed O-A-V using the inferred ontology. This conversion
procedure involves various proposed algorithms and each algorithm aims
to help in building the taxonomy. The ontology graph is also
displayed to the user so that the dependencies of each axiom given in
the search key are known. The information retrieval based on the proposed method
is evaluated using precision and recall rates.
1 Introduction
The Web is huge, but it is not intelligent enough to understand the queries made by
the user and relate them to real or abstract entities in the world. It is a collection
of text documents and other resources, linked by hyperlinks and URLs (Uniform
Resource Locators).
The Semantic web is the next level of the web, which treats it as a knowledge graph rather
than a collection of web resources interconnected with hyperlinks and URLs.
It also aims at adding semantic content to web pages and providing machine-
processable semantics. With these, web agents will be able to perform complex
operations on behalf of the user.
“Semantic Web is about common formats for integration and combination of
data drawn from diverse sources and how the data relates to real world objects.
It provides a common framework that allows data to be shared and reused across
applications, enterprise and community boundaries” [1].
Linked data describes a method of publishing structured data so that it can
be interlinked and become more useful. Rather than using Web Technologies to
serve web pages for human readers, it uses these technologies to share informa-
tion in a way that can be read automatically by computers enabling data from
different sources to be connected and queried [2].
Reasoning is the capacity for consciously making sense of things, applying
logic, for establishing and verifying facts, and changing or justifying practices,
institutions, and beliefs based on new or existing information [3]. With Semantic
web intelligence, the web agents will be able to reason the content on web and
draw inferences based on the relations between various web resources.
1.2 Ontology
1.4 WordNet
WordNet [6] is a large lexical database of the English language. It groups
closely related words into unordered sets called synsets, which are interlinked
by means of conceptual-semantic and lexical relations. Although it is considered
an upper ontology by some, WordNet is not strictly an ontology. However, it
has been used as a linguistic tool for learning domain ontologies.
2 Related Work
Since most of the information available on the web is in natural language and is
not machine understandable, there is no way to understand the data and draw
semantic inferences from it. Ontologies can be used to model the information in a
way that can be easily interpreted by machines.
On passing the text through the proposed model, it is broken down into clauses,
which are then tokenized and passed through the WordNet analyzer. The WordNet
analyzer provides characteristic properties for each lemma, such as POS (part of
speech), synonyms, hypernyms and hyponyms. Later on, an object is created
for each of these individuals and added to the ontology. On passing the clause
through the triplet extractor, it continuously searches for nested and direct relations
using the existing ontology. The extracted O-A-V triplets are then passed
through a semantic analyzer, which determines the true form of the various objects
in the O-A-V triplet based on the context in which they are used. These
triplets and updated individuals are added to the ontology along with the generation
of a taxonomy. At the end of all these processes a well-defined semantic
network is developed, which can then be used to enhance search engine web
results, providing the user with a completely enhanced search experience. Refer to
Section 3 for more details on ‘Enhancing User Search Experience’.
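To make the WordNet analysis step concrete, the following is a minimal sketch (not the authors' implementation) of how a clause can be tokenized, POS-tagged, and enriched with synonyms, hypernyms and hyponyms using NLTK's WordNet interface; the function name analyze_clause and the choice of the most common sense are illustrative assumptions only.

# requires: nltk.download('punkt'), nltk.download('averaged_perceptron_tagger'), nltk.download('wordnet')
import nltk
from nltk.corpus import wordnet as wn

def analyze_clause(clause):
    """Return POS, synonyms, hypernyms and hyponyms for each token of a clause."""
    tagged = nltk.pos_tag(nltk.word_tokenize(clause))   # (token, POS) pairs
    analysis = {}
    for token, pos in tagged:
        synsets = wn.synsets(token)
        if not synsets:
            analysis[token] = {"pos": pos}
            continue
        s = synsets[0]                                   # crude choice: most common sense
        analysis[token] = {
            "pos": pos,
            "synonyms": s.lemma_names(),
            "hypernyms": [h.name() for h in s.hypernyms()],
            "hyponyms": [h.name() for h in s.hyponyms()],
        }
    return analysis

print(analyze_clause("Sam's dog Tommy barks loudly"))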
3.2 Algorithms
For extracting nested relations like X's Y's Z, the triplet extractor continuously
checks for relations, creating an empty individual which can later be updated
based on its future occurrence. The individuals are then classified based on the
Data: NP
Result: compound entities, O-A-V triplets
while not end of NP do
    if next token ∉ N then
        create the current token as an individual in the ontology;
    else
        create an O-A-V triplet between the current token and the next token, with V as a
        combination of both;
        update the current token with the value of V;
        set the class of V to the class of A;
    end
end
Algorithm 2. Extracting compound entities from NP; O-A-V represents an
Object-Attribute-Value triplet
context in which they are used; for example, "Tommy" will represent a "dog" based on
the relation "Sam's dog Tommy", not on the convention that we have always
used a name like Tommy to refer to a dog.
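A minimal Python sketch of Algorithm 2 is shown below. It is an illustration only, under the assumption that the ontology can be approximated by plain dictionaries and that noun-ness is decided from POS tags; the exact data structures of the paper are not specified.

def extract_compound_entities(np_tokens, noun_tags, individuals, triplets, classes):
    """np_tokens: list of (token, POS) pairs of the noun phrase NP (modified in place)."""
    i = 0
    while i < len(np_tokens):                       # while not end of NP
        token, _ = np_tokens[i]
        nxt = np_tokens[i + 1] if i + 1 < len(np_tokens) else None
        if nxt is None or nxt[1] not in noun_tags:
            # next token is not a noun: register the current token as an individual
            individuals.setdefault(token, {"class": classes.get(token)})
        else:
            # next token is a noun: build an O-A-V triplet with V the compound of both
            value = token + " " + nxt[0]
            triplets.append((token, nxt[0], value))
            individuals[value] = {"class": classes.get(nxt[0])}  # class of V := class of A
            np_tokens[i + 1] = (value, nxt[1])       # current token takes the value of V
        i += 1
    return individuals, triplets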
For analyzing direct relations like X is Y, the semantic analyzer determines the
groups to which both individuals belong, compares them, and accordingly
updates the O-A-V triplet based on the previous occurrences of both objects and
their attribute values; refer to Fig. 6 and Fig. 7.
(a) NER for unknown entities, (b) NER for known entities
The individuals are classified into groups along with their relations and a hierarchy,
and a taxonomy is developed.
We are bound to various websites based on their reputation and neglect valuable
information that might be available on other web pages.
We need a way for the user to get to know the type of content
available in a web page without having to go through the complete list of web
pages provided by the search engine, and also to widen his knowledge about the
web links he should visit.
Fig. 9. Proposed Architecture for Ontology Based Search Engine providing the user
with a meaningful insight to the content within each Web Result by providing Seman-
tically extracted Triplets rather than a random sentence from the document containing
the queried keywords
Each web link will then provide the user with valuable insight and also help save time.
Later on, we would try providing semantic links within the web documents so
that they could be used for integration as well as validation of the relations
between various web resources.
References
1. Berners-Lee, T., Hendler, J., Lassila, O.: The semantic web. Scientific Ameri-
can 284(5), 28–37 (2001)
2. Bizer, C., Heath, T., Berners-Lee, T.: Linked data-the story so far. International
Journal on Semantic Web and Information Systems (IJSWIS) 5(3), 1–22 (2009)
3. Kompridis, N.: So we need something else for reason to mean. International Journal
of Philosophical Studies 8(3), 271–295 (2000)
4. Gruber, T.R.: A translation approach to portable ontology specifications. Knowledge
Acquisition 5(2), 199–220 (1993)
5. Sowa, J.F.: Knowledge representation: logical, philosophical, and computational
foundations, vol. 13. Brooks/Cole, Pacific Grove (2000)
6. Miller, G.A., Beckwith, R., Fellbaum, C., Gross, D., Miller, K.: Introduction to
wordnet: An on-line lexical database. International Journal of Lexicography 3(4),
235–244 (1990)
7. McBride, B.: The resource description framework (RDF) and its vocabulary descrip-
tion language RDFS. In: Handbook on Ontologies, pp. 51–65 (2004)
8. Singhal, A.: Introducing the knowledge graph: things, not strings. Official Google
Blog (2012)
9. Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.: Dbpedia:
A nucleus for a web of open data. In: 6th International Semantic Web Conference,
2nd Asian Semantic Web Conference, pp. 722–735 (2007)
Effects of Robotic Blinking Behavior for Making
Eye Contact with Humans
Abstract. Establishing eye contact with the target human plays a central role
in both human-human and human-robot communication. A person seems to
make eye contact with another when they are looking at each other's eyes.
However, such looking behavior alone is not enough; displaying gaze-awareness
behavior is also necessary for making successful eye contact. In this paper,
we propose a robotic framework that can establish eye contact with a human
in terms of two phases: gaze crossing and gaze awareness. We evaluated a primitive
way of displaying the robot's gaze behaviors to its partner and confirmed that
the robot with such gaze behaviors could give a stronger feeling of being looked at
and a better feeling of making eye contact than one without such behaviors.
A preliminary experiment with a robotic head shows the effectiveness of
using blinking behavior as a gaze-awareness modality for responding to the partner.
1 Introduction
looked at by the other [4]. Yoshikawa et al. [5] also mentioned that simply staring is not
always sufficient for a robot to make someone feel that they are being looked at.
Thus, after crossing its gaze with that of the intended recipient, the robot should interpret the
human's looking response and display gaze awareness, which is an important behavior
for humans to feel that the robot understands their attentional response. In order to
display awareness explicitly, the robot should use some verbal or non-verbal actions.
Therefore, in order to set up an eye contact event with the human, robots should satisfy
two important conditions: (i) gaze crossing, and (ii) gaze awareness. In this work,
we develop a robotic head that can detect and track the human head and body. By
using the information about the head and body of the human, the robotic head adjusts its
position to meet the face-to-face orientation with him/her. This achieves the first
condition. After meeting face-to-face, the robot displays eye-blinking actions,
which satisfies the second condition.
2 Related Work
Several robotic systems have been developed to establish eye contact with humans;
they can broadly be classified in two ways: gaze crossing, and gaze crossing with gaze
awareness. In gaze-crossing-based systems, robots are supposed to make eye contact with
humans by turning their eyes (cameras) toward the human faces [6], [7]. All of these
studies focus only on the gaze-crossing function of the robots as their eye contact
capability, and gaze-awareness functions are absent. Several robotic systems
incorporate gaze-awareness functions through facial expression (i.e., smiling) [8]. To produce
the smiling expression, they used a flat screen monitor as the robot's head and
displayed 3D computer graphics (CG) images. A flat screen is unnatural as a face. Moreover,
the models used to produce the robot's gaze behavior are typically not reactive
to the human partner's actions. Yoshikawa et al. [5] used a communication robot
to produce responsive gaze behaviors. This robot generates a following
response and an averting response against the partner's gaze. They showed that
the responsive robotic gazing system increases people's feeling of being looked
at. However, it is unknown how the robot produces gaze-awareness behavior while it
deals with people. Moreover, the robotic heads used in previous studies
were mechanically very complex and as such expensive to design, construct and
maintain. A recent work used the robot Simon to produce the awareness function
[9]. Simon blinks its ear when it hears an utterance. Although they considered a
single-person interaction scenario, they did not use ear blinks for gaze awareness
but rather to create interaction awareness.
3 System Overview
For the HRI experiments we developed a robotic head (Fig. 1). This figure shows a
prototype of the robotic head and the corresponding outputs of its several software
modules. The head consists of a spherical 3D mask, an LED projector (3M pocket
projector, MPro150), a laser range sensor (URG-04LX by Hokuyo Electric Machi-
nery), three USB cameras (Logicool Inc., Qcam), and a pan-tilt unit (Directed Percep-
tion Inc., PTU-D46). The LED projector projects CG-generated eyes on the mask.
In the current implementation, one USB camera is attached to the robot’s head (as
shown in Fig. 1). The other two cameras and the laser sensor are affixed to the tri-
pods, placed at an appropriate position for observing participants’ heads as well as
their bodies.
Fig. 1. A prototype robotic head that consists of five modules: HTM (head tracking module),
SRM (situation recognition module), BTM (body tracking module), EBM (eye blinks module),
and RCM (robot control module)
The system utilizes a general-purpose computer (Windows XP). The proposed system
has five main software modules: HTM, BTM, SRM, EBM, and RCM. To detect,
track, and compute the direction of the participant's head in real time (30 frames/sec),
we use FaceAPI by Seeing Machines Inc. It can measure 3D head direction within
3° of error. The human body model is represented by the center coordinates
of an ellipse [x, y] and its principal axis direction (θ). These parameters are
estimated in each frame by a particle filter framework. Using a laser sensor, the BTM
can locate a human body within errors of 6 cm for position and 6° for orientation.
To assess the current situation (i.e., where the human is currently looking), the system observes
the head direction estimated by the HDTM. From the results of the HDTM, the SRM
recognizes the existing viewing situation (with 99.4% accuracy) and the direction of
the relevant object in terms of the yaw (α) and pitch (β) movements of the head by using
a set of predefined rules. For each rule, we set the values of α and β by observing
several experimental trials. For example, if the current head direction (of the human
with respect to the robot) is within −10° ≤ α ≤ +10° and −10° ≤ β ≤ +10° and remains in the
same direction for 30 frames, the system recognizes the situation as the central field of
view. The results of the SRM are sent to the PTUCM to initiate the eye contact
process, and the robot turns toward the human based on the results provided by the
BTM. The robot considers that the participant has responded to its actions if s/he
looks at the robot within the expected time frame. If this step is successful, the FDM
detects the participant's face. After detecting his/her face, the FDM sends the results to
the EBM for exhibiting eye blinks to let the human know that the robot is aware of
his/her gaze. The EBM generates the eye blinks to create the feeling of making eye
contact. All the robot's head actions are performed by the PTU, with the actual control
signal coming from several modules. A detailed description of the robotic head
is given in [10].
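The rule-based situation recognition of the SRM can be sketched as a simple threshold check over a window of recent head-direction estimates. The snippet below is an illustration only; the function name and the way frames are buffered are assumptions, while the ±10° bounds and 30-frame duration follow the example rule quoted above.

def recognize_situation(head_directions, yaw_limit=10.0, pitch_limit=10.0, frames=30):
    """head_directions: list of (alpha, beta) head angles in degrees, newest last."""
    if len(head_directions) < frames:
        return "unknown"
    recent = head_directions[-frames:]
    if all(abs(a) <= yaw_limit and abs(b) <= pitch_limit for a, b in recent):
        return "central field of view"   # the human is looking toward the robot
    return "other"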
4 Experiments
The way one person looks at another, and how often they blink, seems to have a big
impact on the impression they make on others. For example, it has been
pointed out that people's impressions of others are affected by the duration for which they are
looked at and by the rate of blinking [11]. Robots need to be able not only to detect human
gaze but also to display their gaze awareness accurately so that it can be correctly interpreted
by humans. We propose an eye-blinking action for the robot to create such a gaze-awareness
function. When using blinking, we need to consider the appropriate number of
blinks and blink duration to convey gaze awareness effectively. We performed two
experiments to verify the effect of the robot's eye blinks and blink duration in establishing
eye contact with the human.
Fig. 2. Experimental scenario: (a) Schematic setting, (b) a scene of the experiment
Conditions: To verify the effect of eye blinks, we prepared the following three conditions:
(i) Gaze averting robot (GAR): a gaze avoidance phenomenon occurs when a
human (H1) avoids looking at another human (H2). If the participant is looking at the
robot, it avoids his/her gaze by rotating its head in another direction by −120° at a pan
speed of 180°/second. (ii) Gaze crossing without blinks robot (GC+WBR): the robot
turns its head toward the participant from its initial position. The robot recognizes
his/her face while s/he is looking at it, waits about five seconds, and then turns its
head toward another direction. (iii) Gaze crossing with blinks robot (GC+BR): the
robot turns its head toward the participant. The robot recognizes his/her face while
s/he is looking at it. After detecting the face of the participant, the robot starts blinking
its eyes for about five seconds, and then turns its head toward another direction.
Measurements: We measured the following two items in the experiment: (A) Impression
of the robot: we asked participants to fill out a questionnaire for each condition
(after three interactions). The measurement was a simple rating on a Likert scale of 1
to 7, where 1 stands for the lowest and 7 for the highest. The questionnaire had the
following two items: (i) The feeling of being looked at: did you feel like the robot was
looking at you? (ii) The feeling of making eye contact: did you think that the behaviors of
the robot created the feeling of making eye contact with it? (B) Gazing time: we
measured the total time spent by the participants gazing at the robot in each
method. This time is measured from the beginning of the robot's gaze-crossing action
to the end of the participant's looking at it before turning his/her head in another direction,
by observing the experimental videos.
Predictions: As previous studies suggested, not only gaze crossing but also gaze
awareness is an important factor in making eye contact. Simply staring is not always
sufficient for a robot to make someone feel being looked at. Although the GC+WBR
tries to make eye contact using the gaze-crossing function, it lacks the gaze-awareness
function, which is implemented in the GC+BR. On the other hand, both of these functions
are absent in the GAR. Moreover, participants' eyes may be coupled with the robot's
eyes during blinking. Based on these considerations, we predicted the following.
Prediction 1: robots with gaze crossing and eye blinks (GC+BR) will give
their partners a stronger feeling of being looked at and of making eye contact than
robots that do not include these functions (i.e., GC+WBR and GAR). Prediction
2: participants will spend more time gazing at the robot with gaze crossing and
blinking (GC+BR) than at the robots without gaze crossing and blinking
actions (GAR and GC+WBR).
Results: The experiment had a within-subject design, and the order of all experimental
trials was counterbalanced. Every participant experienced all three conditions.
We conducted a repeated-measures analysis of variance (ANOVA) for all measures.
Fig. 3(a) shows the results of the questionnaire assessment.
Fig. 3. Evaluation results: (a) participants' responses to each question (error bars indicate the
standard deviation), (b) total time spent on gazing
Concerning the feeling of being looked at, the ANOVA shows that the differences
between conditions were statistically significant [F(2,70)=369.8, p<0.01, η²=0.88]. We
conducted multiple comparisons with the Bonferroni method, which showed significant
differences between robots GAR and GC+BR (p<0.001), and between robots GC+BR
and GC+WBR (p=0.02). Concerning the feeling of making eye contact, a significant
main effect was found [F(2,70)=288.1, p<0.0001, η²=0.84]. Multiple comparisons
with the Bonferroni method also revealed significant differences between pairs
(GC+BR vs. GAR: p<0.001, and GC+BR vs. GC+WBR: p=0.0005). From these statistical
analyses, we confirmed that robots with eye blinks succeeded in giving participants
a stronger feeling of being looked at and of making eye contact than those without
eye blinks. These results verify Prediction 1.
In order to evaluate the gazing time, we observed a total of 108 (36 × 3) interaction
videos for all robots. Fig. 3(b) summarizes the mean values of the time that the participants
spent gazing at the robots. The ANOVA showed that there are significant
differences in the time participants spent gazing at the robot in each method
[F(2,70)=374.2, p<0.0001, η²=0.88]. Multiple comparisons with the Bonferroni method
revealed significant differences between pairs (GC+BR vs. GAR: p<0.001, and
GC+BR vs. GC+WBR: p<0.0001). The results also indicate that the participants looked
significantly longer in the proposed method (4.52 s) than in the other methods (1.75 s for
GAR and 3.75 s for GC+WBR). Thus, Prediction 2 is supported.
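The statistical procedure described above (a repeated-measures ANOVA followed by Bonferroni-corrected pairwise comparisons) can be sketched in Python as below. This is not the authors' analysis script; the DataFrame layout, column names, and the toy ratings are assumptions made purely for illustration.

import itertools
import pandas as pd
from scipy import stats
from statsmodels.stats.anova import AnovaRM

# long-format data: one rating per participant per condition (toy values)
df = pd.DataFrame({
    "participant": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "condition":   ["GAR", "GC+WBR", "GC+BR"] * 3,
    "rating":      [2, 4, 6, 1, 5, 7, 2, 4, 6],
})

# repeated-measures ANOVA over the three conditions
print(AnovaRM(df, depvar="rating", subject="participant", within=["condition"]).fit())

# Bonferroni-corrected paired t-tests between condition pairs
pairs = list(itertools.combinations(df["condition"].unique(), 2))
for c1, c2 in pairs:
    a = df[df["condition"] == c1].sort_values("participant")["rating"]
    b = df[df["condition"] == c2].sort_values("participant")["rating"]
    t, p = stats.ttest_rel(a, b)
    print(c1, "vs", c2, "corrected p =", min(1.0, p * len(pairs)))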
Ten new participants participated in this experiment. Their average age was 26.3
years (SD=3.68). All participants were graduate students at Saitama University. The
purpose of this experiment is to evaluate the appropriate blink duration for creating
the feeling of gaze awareness.
Experimental Design: The experimental setting was the same as described in Section
4.1. We set the robot to perform eye blinking after detecting the face of the human
according to the conditions. The experiment had a within-subject design, and the
these results. Concerning 1 B/s, multiple comparisons with the Bonferroni method show
that there is a significant main effect between conditions (3 s vs. 1 s: p<0.0001,
3 s vs. 2 s: p<0.001, 3 s vs. 4 s: p=0.003, and 3 s vs. 5 s: p<0.001). Multiple comparisons
with the Bonferroni method were also conducted for the 3 s blink duration and revealed
significant differences between 3 B/s and 2 B/s (p<0.001), and between 3 B/s and 1 B/s
(p<0.0001), respectively. For 1 B/s, participants gave higher ratings in the 3 s condition (M=5.7)
than in all other conditions and blinking rates [Fig. 4(a)]. Thus, we may use a 3 s blink
duration at the rate of 1 B/s as the preferred blinking action to generate the gaze-awareness
behaviors of the robot.
Fig. 4. Mean values of participant’s impression on different blinking rate and duration
5 Conclusion
Blinking actions strengthen the feeling of being looked at, and they can be used to convey
an impression more effectively and to enrich the understanding of human social behavior.
Experimental results have also confirmed that eye-blinking actions proved helpful in conveying
to the target participant that the robot was aware of his/her gaze. Participants' eyes couple
with the robot's eyes during blinking, which causes them to spend more time gazing at the robot.
This behavior may help the human to identify the attention shift of the robot.
However, without blinking the robot may fail to create the feeling that eye contact
has been established, due to the lack of a gaze-awareness function. Our preliminary studies of
eye contact have confirmed that a robot with eye-blinking capability (about 3 seconds
duration at a 1 blink/s blinking rate) is effective in creating the feeling that a human
makes eye contact with it. To generalize the findings, we will incorporate this robotic
head in a humanoid robot and evaluate the effectiveness of the proposed robotic framework
in real scenarios. These are left for future work.
References
1. Argyle, M.: Bodily Communication. Routledge (1988)
2. Kleinke, C.: Gaze and Eye Contact: A Research Review. Psychological Bulletin 100(1),
78–100 (1986)
3. Farroni, T., Mansfield, E.M., Lai, C., Johnson, M.H.: Infant Perceiving and Acting on the
Eyes: Tests of an Evolutionary Hypothesis. Journal of Exp. Child Psy. 85(3), 199–212
(2003)
4. Cranach, M.: The Role of Orienting Behavior in Human Interaction, Behavior and Envi-
ronment. The Use of Space by Animals and Men, pp. 217–237. Plenum Press (1971)
5. Yoshikawa, Y., Shinozawa, K., Ishiguro, H., Hagita, N., Miyamoto, T.: The Effects of
Responsive Eye Movement and Blinking Behavior in a Communication Robot. In: Pro-
ceedings of International Conference on Intelligent Robots and Systems, pp. 4564–4569
(2006)
6. Kanda, T., Ishiguro, H., Imai, M., Ono, T.: Development and Evaluation of Interactive
Humanoid Robots. In: Proceedings on Human Interactive Robot for Psychological
Enrichment, vol. 92(11), pp. 1839–1850 (2004)
7. Mutlu, B., Kanda, T., Forlizzi, J., Hodgins, J., Ishiguro, H.: Conversational Gaze Mechan-
isms for Humanlike Robots. ACM Transactions on Interactive Intelligent Systems 1(2)
(2012)
8. Miyauchi, D., Kobayashi, Y., Kuno, Y.: Bidirectional eye contact for human-robot com-
munication. IEICE Trans. on Info. & Syst. 88-D(11), 2509–2516 (2005)
9. Huang, M., Thomaz, L.: Effects of Responding to, Initiating and Ensuring Joint Attention
in Human-Robot Interaction. In: Proceedings of IEEE International Symposium on Robot
and Human Interactive Communication, pp. 65–71 (2011)
10. Hoque, M.M., Deb, K., Das, D., Kobayashi, Y., Kuno, Y.: Design an Intelligent Robotic
Head to Interacting with Humans. In: Proceedings of International Conference on Comput-
er and Information Technology, pp. 539–545 (2012)
11. Exline, R.V.: Multichannel Transmission of Nonverbal Behavior and the Perception of
Powerful Men: The Presidential Debates of 1976. Power, Dominance, and Nonverbal Be-
havior. Springer-Verlag Inc. (1985)
Improvement and Estimation of Intensity of Facial
Expression Recognition for Human-Computer
Interaction
Abstract. With the ubiquity of new information technology and media, more
effective and friendly methods for human computer interaction (HCI) are being
developed. The first step for any intelligent HCI system is face detection and
one of the most friendly HCI systems is facial expression recognition. Although
Facial Expression Recognition for HCI introduces the frontiers of vision-based
interfaces for intelligent human computer interaction, very little has been ex-
plored for capturing one or more expressions from mixed expressions which are
a mixture of two closely related expressions. This paper presents the idea of
improving the recognition accuracy of one or more of the six prototypic expres-
sions namely happiness, surprise, fear, disgust, sadness and anger from the mix-
ture of two facial expressions. For this purpose a motion gradient based optical
flow for muscle movement is computed between frames of a given video se-
quence. The computed optical flow is further used to generate feature vector as
the signature of six basic prototypic expressions. Decision Tree generated rule
base is used for clustering the feature vectors obtained in the video sequence
and the result of clustering is used for recognition of expressions. Manhattan
distance metric is used which captures the relative intensity of expressions for a
given face present in a frame. Based on the score of intensity of each expres-
sion, degree of presence of each of the six basic facial expressions has been
determined. With the introduction of Component Based Analysis, which essentially
computes the feature vectors on the proposed regions of interest on a
face, considerable improvement has been observed in the recognition of one
or more expressions. The results have been validated against human judgement.
1 Introduction
In the area of Human Computer Interaction (HCI), the face is considered to be the most
important object for modeling human behavior precisely. A facial expression is indeed
an important signature of human behavior, caused by changes in the arrangement of the
facial muscles in response to a person's internal emotional state. For psychologists and
behavioral scientists, facial expression analysis and its quantification have been an
active research topic since the seminal work of Darwin [1]. Ekman et al. [2] reported
their results on the categorization of human facial expressions as happiness, surprise, fear,
disgust, sadness and anger. They further proposed that facial expressions are the results of
certain facial actions, and they introduced the facial action coding system (FACS) [3],
designed for human observers to detect subtle changes in facial appearance. However,
not much research has been directed at accurately recognizing one or more of the six
basic expressions from two mixed expressions. In real-life problems the confusion
between two expressions poses an inherent problem in correctly identifying mixed
expressions. Improvement in this aspect, based on Component Analysis, also brings
a subsequent improvement in the estimation of facial expression intensity. Recent psychological
studies [4] suggest that, to understand human emotion, classifying an expression
into only six basic categories is insufficient; the detection of the intensity of expression,
and the percentage of each expression in the case of mixed-mode expressions, is important in
practical applications of human computer interaction.
In the proposed framework we developed a method to extract the flow of moving facial
muscles or organs based on gradient-based optical flow [5] from video sequences
and to analyze it with some specific parameters to recognize facial expressions. For this
we have ensured adequate regions of interest, based on the aspect ratio of the face and a head
tilt within ±15º, to enhance the quality of the features most suitable for the estimation of
expression intensity. After the generation of the optical flow feature vector, we used a Decision
Tree based classifier [6] to recognize one or more expressions for a given frame
in a video sequence. This, coupled with the computation of the Manhattan distance function,
gave us the measurement of the intensity of expressions. The Manhattan distance
between two vectors is the sum of the absolute differences of their corresponding components.
In our work, for each frame of a video sequence, the components based on projections
of the optical flow feature vectors for any expression are considered.
which imposes a smoothness constraint on the problem. The equation obtained from the
smoothness constraint is given by

∇²u + ∇²v = 0    (2)
These two constraint equations are solved iteratively to obtain a numerical estimate
of the optical flow velocity components. In the current work, the localized optical flow vector
is computed within the 13 windows mentioned as the regions of interest. The orientation
of a window with respect to the global horizontal axis is taken along the window's
symmetry axis. The projection P_ij of the optical flow vector U_ij(X) is taken on the long
symmetry axis of the i-th window for the j-th pixel: P_ij = U_ij(X) · n_i, where
U_ij(X) = (u_ij(x, y), v_ij(x, y)) is the optical flow vector at the j-th pixel X = (x, y) of the i-th window,
computed from two successive significant frames, and n_i is the unit vector along the axis of the i-th
window (i = 0, ..., 12). After computing the optical flow vectors and their projections, we
compute the mean and standard deviation of the projected flow vectors for consecutive
frames. With these we generate the feature vector to represent the basic expressions.
These feature vectors are utilized for the recognition of the six expressions by
using a rule base generated by a trained decision tree.
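The window-wise projection features can be sketched as follows. This is an illustration only: OpenCV's dense Farneback flow is used as a stand-in for the gradient-based (Horn-Schunck) optical flow of [5], and the window coordinates and axis angles passed to the function are hypothetical.

import numpy as np
import cv2

def window_features(prev_gray, curr_gray, windows):
    """windows: list of (x, y, w, h, axis_angle_deg) regions of interest."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    features = []
    for (x, y, w, h, angle) in windows:
        n = np.array([np.cos(np.radians(angle)), np.sin(np.radians(angle))])
        u = flow[y:y + h, x:x + w, 0].ravel()
        v = flow[y:y + h, x:x + w, 1].ravel()
        proj = u * n[0] + v * n[1]          # projection P_ij = U_ij(X) . n_i
        features.extend([proj.mean(), proj.std()])
    return np.asarray(features)             # mean and std of projections per window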
developed algorithm. We observed that for some groups of mixed expressions the results
were most confusing, while for others the results were less confusing. Ekman and
Friesen [3] have suggested particular regions of interest for specific expressions. From
the current literature it has been found that variations of the region of interest can reduce
the confusion between two expressions. These variations of the region of interest have
been selected depending on the optical flow vector. Based on our work we have
shown that the regions of interest can be modified depending on the optical flow vectors.
The proposed regions of interest are arrived at based on the following observations.
At first, gradient-based optical flow is computed globally on the face region
by comparing the neutral face and the face at the apex of the expression. Based on the
spread of the magnitude and direction of the optical flow vectors, computed locally within
certain regions located on the forehead, the two eyebrows, the two eyes, the nose, the cheeks, the mouth and
the chin, we obtain the proposed regions. It may be noted that the maximum change in frame
intensity occurs due to the movement of a specific muscle group for displaying each of the six
basic expressions [8]. The results, shown with the help of confusion matrices, demonstrate
improved classification for the two expressions creating the most confusion.
The survey to study human recognition of facial expressions was conducted on
200 persons, both male and female, with ages varying from 18 to 60 years. Each person
was shown six video sequences created by CDAC, Kolkata, each containing
one of the six basic expressions. Each person gave a score on a scale of 1 to 10
based on the intensity of the expression. The scores are shown in Fig. 1. This figure is a
table illustrating the scores given by each of the 200 persons to whom the six video sequences
were displayed. An intense gray-scale value denotes a high score for a particular facial
expression, while a light gray value denotes a low score. From the figure it is obvious that
happiness is best distinguished from the other facial expressions. Surprise gets most
confused with fear, sometimes with sadness or anger, and even less with disgust and
happiness. It is also known that fear is the hardest to discriminate from the other facial
expressions. Fear often gets confused with surprise and sometimes with sadness or
anger. Disgust also gets confused with fear and sometimes surprise. Anger also sometimes
gets confused with sadness and fear or surprise. Sadness gets a little confused
with disgust and fear, but gets highly confused with anger.
From the above observations it is found that there are groups of two mixed expressions
which are most confusing. They are sadness with anger, anger with disgust,
disgust with fear, fear with surprise and surprise with fear. Another set of expression
groups with less confusion among them comprises sadness with disgust, sadness
with fear, anger with sadness, anger with fear, anger with surprise, disgust with
surprise, fear with sadness, fear with anger, surprise with sadness, surprise with anger,
surprise with disgust and surprise with happiness. The established literature suggests
that there are certain components (regions of interest) of the facial image that are
responsible for projecting one or more expressions. In our experiment, we categorize
the class and intensity of mixed expressions by automatically measuring the optical
flow vector generated due to the mutual change of the intensity pattern of the current frame as
compared with the neutral one. While carrying out the experiments it has been found that
the set of groups of two mixed expressions which are most confusing is appropriately
justified.
The regions of interest responsible for creating confusion between different expressions
are first identified using the spread of the optical flow vector in those regions.
Fig. 1. The table shows result of humans specifying the facial expression visible in each of the
video sequences containing one of the six basic expressions of the CDAC Kolkata Facial Ex-
pression Database
Tables 1–4: confusion matrices for the most confusing pairs of expressions
However, considering the proposed regions of interest for C-means clustering, the
performance increased from 28 to 34 for the sadness-dominant expression and
from 27 to 32 for the anger-dominant expression. Similar confusion matrices for the
most confusing pairs of expressions (as shown in Table 2 to Table 4) reveal that
the performance of expression recognition increases with the suitable selection of regions of
interest as proposed in this paper.
Instead of two dimensions, in this paper we consider features having 26
dimensions, such as a = (x_1, x_2, x_3, ..., x_26) and b = (y_1, y_2, y_3, ..., y_26); the above
equation can then be generalized by defining the Manhattan distance between a and b as
MH(a, b) = |x_1 − y_1| + |x_2 − y_2| + ... + |x_26 − y_26| = Σ_i |x_i − y_i|.
In our work the Manhattan distance is computed at each node, which denotes a test
on an attribute value, to give an absolute difference. The sum of the differences
through the nodes for which a particular expression is estimated gives the total
Manhattan distance, which is normalized to give an estimate of the intensity of expression
for that particular frame with respect to a neutral frame.
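A small sketch of the 26-dimensional Manhattan distance and its use as a normalized intensity estimate is shown below; the normalizing constant max_distance is an assumed parameter, since the paper does not state how the normalization bound is obtained.

import numpy as np

def manhattan(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.abs(a - b).sum()              # sum of |x_i - y_i| over the 26 components

def expression_intensity(frame_features, neutral_features, max_distance):
    """Normalize the distance from the neutral frame into [0, 1]."""
    return min(1.0, manhattan(frame_features, neutral_features) / max_distance)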
7 Conclusion
In this paper, a Component Based Analysis of the regions of interest of a face is introduced.
This work contributes to the recognition of mixed expressions and of the confusion
involved in the video. The system finds use in the treatment of mental and
emotional disorders (e.g., patients suffering from autism with low IQ) through the use of
psychological techniques designed to encourage communication of conflicts and insight
into problems, with the goal being relief of symptoms, changes in behavior leading
to improved social and vocational functioning, and personality growth. For usage
as a psychotherapeutic tool it is essential to capture accurately one or more expressions
from mixed expressions and also to keep records of the intensity of expression.
References
1. Darwin, C.: The expression of Emotions in Man and Animals. Univ. Chicago Press (1872)
2. Ekman, P.: Facial expressions and emotion. Amer. Psychol. 48, 384–392 (1978); Consult-
ing Psychologists Press (1978)
3. Ekman, P., Friesen, W.V.: Facial Action Coding System (FACS): Manual. Palo Alto, Calif.
4. Ambadar, Z., Schooler, J., Cohn, J.F.: Deciphering the enigmatic face, the importance of
facial dynamics in interpreting subtle facial expression. Psychological Science (2005)
5. Horn, B.K.P., Schunck, B.G.: Determining optical flow. Artificial Intelligence 17 (1981)
6. Dunn, J.C.: A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact
Well-Separated Clusters. Journal of Cybernetics 3, 32–57 (1973)
7. Gupta, G.K.: Introduction to data mining with case studies. Prentice Hall of India Private
Limited (2006)
8. Reilly, J., Ghent, J., McDonald, J.F.: Investigating the dynamics of facial expression. In:
Bebis, G., Boyle, R., Parvin, B., Koracin, D., Remagnino, P., Nefian, A., Meenakshisunda-
ram, G., Pascucci, V., Zara, J., Molineros, J., Theisel, H., Malzbender, T. (eds.) ISVC 2006.
LNCS, vol. 4292, pp. 334–343. Springer, Heidelberg (2006)
9. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum
Press, New York (1981)
Cognitive Activity Recognition
Based on Electrooculogram Analysis
1 Introduction
cognitive contexts. Three major types of eye movements, namely saccades, fixations
and blinks, are important in conveying information regarding a person's current activity
[2]. The human eyes move constantly in saccades to build a 'mental map' of the
visual scene that is seen. Fixations occur between saccades, where the eyes focus on
a particular location. Blinks are short-duration pulses of relatively high magnitude
that are heavily influenced by environmental conditions and mental workload. In
[2], features based on these three types of eye movements have been used to recognize
cognitive context from EOG signals, achieving an average precision of 76.1%. In [8],
reading activity based on saccades and fixations has been recognized with a recognition
rate of 80.2%. In our previous works [9-10] we have classified EOG to recognize
directional eye movements from some standard signal features.
In the present work, eye movements are recorded using a two-channel Electroocu-
logram signal acquisition system developed in the laboratory by surface electrodes
from ten subjects while they performed eight different activities. The acquired EOG
signals are filtered and four standard signal features, namely, Adaptive Autoregres-
sive Parameters, Hjorth Parameters, Power Spectral Density and Wavelet Coefficients
are extracted. These feature spaces are used independently as well as in combinations
to classify the EOG signals using a Support Vector Machine with Radial Basis Func-
tion kernel, successfully recognizing eight different cognitive activities.
The rest of the paper is structured as follows. Section 2 explains the principles and the
methodology followed in the course of the work. Section 3 covers the experiments and
results. Finally, in Section 4, the conclusions are drawn and the future scope is stated.
electrode, which becomes more positive, with zero potential at the electrode below
the eye, and vice versa, resulting in negative and positive output voltages, respectively.
The value of the EOG amplitude will be positive or negative depending upon the direction
in which the eye is moved. The left and right eye movements can be explained similarly.
Blinks are short-duration pulses having comparatively high amplitude.
The recording of the EOG signal has been done through a two-channel system developed
in the laboratory using five Ag/AgCl disposable electrodes at a sampling
frequency of 256 Hz. Two electrodes are used for acquiring the horizontal EOG and two
for the vertical EOG, while one electrode acts as the reference, as shown in Fig. 2(a). The
frequency range of the acquired EOG signal is below 20 Hz and the amplitude lies
between 100-3500 mV. The circuit designed for signal acquisition is shown in
Fig. 2(b). The signal collected from the electrodes is fed to an instrumentation amplifier,
implemented with an IC AD620, having a high input impedance and a high CMRR.
This output is given to a second-order low-pass filter with a cut-off frequency of
20 Hz to eliminate undesirable noise. Different stages of filters, implemented using IC
OP07s, provide various amounts of gain: the amplifier provides a gain of 200 and the
filter a gain of 10, so an overall gain of 2000 is obtained. For any kind
of bio-signal acquisition, isolation is necessary for the subject's as well as the instrument's
safety. Power isolation is provided by a dual-output hybrid DC-DC converter
(MAU 108) and signal isolation is achieved by optically coupling the amplifier output
signal with the next stage through an HCNR 200. For conversion of the signal into digital
format, an analog-to-digital converter is necessary. The electrooculogram data has
been acquired in the LabView 2012 platform for processing in the computer using a
National Instruments 12-bit ADC.
To eliminate undesirable noise and obtain the EOG in the frequency range of 0.1 to
15 Hz, the range where maximum information is contained, we implement band-pass
filtering. A Chebyshev band-pass filter of order 6 has been used.
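A minimal sketch of this band-pass stage with SciPy is given below; the 0.5 dB pass-band ripple is an assumed design value not stated in the text, and passing N=3 to a band-pass design yields the overall order-6 filter mentioned above.

import numpy as np
from scipy.signal import cheby1, sosfiltfilt

FS = 256  # Hz, sampling frequency of the acquisition system

# order-6 Chebyshev type-I band-pass (N=3 doubles for a band-pass design)
sos = cheby1(N=3, rp=0.5, Wn=[0.1, 15], btype="bandpass", fs=FS, output="sos")

def bandpass_eog(raw):
    """Zero-phase band-pass filtering of a 1-D EOG channel."""
    return sosfiltfilt(sos, np.asarray(raw, dtype=float))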
Fig. 2. Data Acquisition System showing (a) Placement of Electrodes and (b) Acquisition
Circuit Snapshot
y_k = a_{1,k} y_{k-1} + \cdots + a_{p,k} y_{k-p} + x_k    (1)

X_{WT}(\tau, s) = \frac{1}{\sqrt{|s|}} \int x(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt    (3)
Eq. (3) defines the wavelet transform for a signal x(t) and a mother wavelet ψ(t),
shifted by an amount τ and scaled by s = 1/frequency. Eq. (3) can be sampled to produce
the wavelet series. This evaluation takes up large computation times for higher resolutions.
In the Discrete Wavelet Transform the signal is passed through high-pass and low-pass
filters in several stages. At each stage, each filter output is downsampled by two
to produce the approximation coefficients and the detail coefficients. The approximation
coefficients are then decomposed again to get the approximation and detail coefficients
of the subsequent stage. In the present work, the Haar mother wavelet and the
fourth-level approximation coefficients of the discrete wavelet transform have been used.
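A sketch of this wavelet feature extraction with PyWavelets is given below: a 4-level Haar decomposition whose level-4 approximation coefficients are kept as features. The function name is illustrative.

import numpy as np
import pywt

def dwt_features(eog_channel):
    """4-level Haar DWT; return the level-4 approximation coefficients."""
    coeffs = pywt.wavedec(np.asarray(eog_channel, float), "haar", level=4)
    return coeffs[0]   # approximation coefficients at the fourth level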
The Power Spectral Density [14] of a wide-sense stationary signal x(t) is computed from
the Fourier transform of its autocorrelation function, given by S(w) in (4), where E
denotes the expected value and T denotes the time interval.

S(w) = \frac{1}{T} \int_{0}^{T}\!\!\int_{0}^{T} E[\,x^{*}(t)\, x(t')\,]\, e^{iw(t-t')}\, dt\, dt'    (4)
For a time-varying signal such as the EOG, the PSD should be evaluated by segmenting the
complete time series. In the present work the PSD has been evaluated using the Welch
method [15], which splits the signal into overlapping segments, computes the periodograms
of the overlapping segments from their Fourier transforms, and averages the resulting
periodograms to produce the power spectral density estimate. The PSD was computed in
the frequency range of 1-15 Hz using the Welch method with 50% overlap between the
signal segments and a Hamming window. The resulting feature vector has a dimension
of 15 for each channel of EOG data, corresponding to the integer frequency points between 1
and 15 Hz.
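The Welch PSD features can be sketched as below: Hamming window, 50% overlap, and the estimates at the integer frequencies 1-15 Hz. The segment length nperseg=256 is an assumption chosen so that the frequency bins fall exactly at integer frequencies for a 256 Hz sampling rate.

import numpy as np
from scipy.signal import welch

def psd_features(eog_channel, fs=256):
    f, pxx = welch(np.asarray(eog_channel, float), fs=fs,
                   window="hamming", nperseg=256, noverlap=128)
    mask = (f >= 1) & (f <= 15)
    return pxx[mask]   # 15 values per EOG channel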
Activity, Mobility and Complexity, collectively called the Hjorth Parameters [16], describe
a signal in terms of its time-domain characteristics.
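For reference, a sketch of the three Hjorth parameters computed from a 1-D signal using their standard variance-based definitions is shown below.

import numpy as np

def hjorth(x):
    x = np.asarray(x, float)
    dx = np.diff(x)                                   # first derivative (difference)
    ddx = np.diff(dx)                                 # second derivative
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity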
2.3 Classification
Classification has been carried out using the well-known binary classifier, the Support
Vector Machine with Radial Basis Function kernel (SVM-RBF) [17]. For the SVM-RBF,
the width of the Gaussian kernel is taken as 1, and the one-against-all (OVA) classification
accuracies are computed with a particular activity as one class and all the other
activities comprising the other class.
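The one-against-all evaluation can be sketched as below with scikit-learn: for each activity, its samples form the positive class and all other activities the negative class. Setting gamma=1.0 is one possible encoding of the unit kernel width mentioned above (the exact parameterization is an assumption), and 5-fold cross-validation is used only as an example protocol.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def ova_accuracies(features, activity_labels):
    """features: (n_samples, n_features) array; activity_labels: array of activity ids."""
    scores = {}
    for activity in np.unique(activity_labels):
        y = (activity_labels == activity).astype(int)   # this activity vs. all others
        clf = SVC(kernel="rbf", gamma=1.0)
        scores[activity] = cross_val_score(clf, features, y, cv=5).mean()
    return scores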
Fig. 3. Filtered EOG signals from a particular Subject while Subject is (a) Writing and (b)
Watching Video
The classifier is trained to recognize these eight activities using features from 60
seconds of EOG data per activity per subject. In the second phase, the trained classifier
is tested with unknown test stimuli. Classification accuracies are computed from the
respective confusion matrices. The filtered EOG signal corresponding to 10 seconds
of data acquired from Subject 1 for different activities is shown in Fig. 3.
The results of classification in terms of classification accuracy and computation
time (including feature extraction and classification) have been tabulated in Tables 1
and 2, mentioning each feature vector dimension (in parentheses) for two channels of
EOG and indicating the maximum accuracy in bold for each class.
It is observed that, among the single feature spaces, the Hjorth Parameters provide the highest
average accuracy of 88.85% over all classes of activities. Thus the other features are
combined with the Hjorth Parameters, thereby showing a significant increase in classification
accuracy, as is evident from Table 2. The combination of all four features
produces the highest mean classification accuracy of 90.39% in 41.60 seconds on
average over all classes of activities.
References
1. Davies, N., Siewiorek, D.P., Sukthankar, R.: Special issue on activity based computing.
IEEE Pervasive Computing 7(2) (2008)
2. Bulling, A., Ward, J.A., Gellersen, H., Troster, G.: Eye Movement Analysis for Activity
Recognition. In: Proceedings of the 11th International Conference on Ubiquitous Computing,
pp. 41–50. ACM Press (2009)
3. Deng, L.Y., Hsu, C.L., Lin, T.C., Tuan, J.S., Chang, S.M.: EOG-based Human–Computer
Interface system development. Expert Systems with Application 37(4), 3337–3343 (2010)
4. Arden, G.B., Constable, P.A.: The electro-oculogram. Progress in Retinal and Eye Re-
search 25(2), 207–248 (2006)
5. Stavrou, P., Good, P.A., Broadhurst, E.J., Bundey, S., Fielder, A.R., Crews, S.J.: ERG and
EOG abnormalities in carriers of X-linked retinitis pigmentosa. Eye 10(5), 581–589 (1996)
6. Barea, R., Boquete, L., Mazo, M., López, E., Bergasa, L.M.: EOG guidance of a wheel-
chair using neural networks. In: IEEE International Conference on Pattern Recognition, pp.
668–671 (2000)
7. Banerjee, A., Chakraborty, S., Das, P., Datta, S., Konar, A., Tibarewala, D.N., Janartha-
nan, R.: Single channel electrooculogram (EOG) based interface for mobility aid. In: 4th
IEEE International Conference on Intelligent Human Computer Interaction (IHCI), pp. 1–6
(2012)
8. Bulling, A., Ward, J.A., Gellersen, H., Tröster, G.: Robust Recognition of Reading Activi-
ty in Transit Using Wearable Electrooculography. In: Indulska, J., Patterson, D.J., Rodden,
T., Ott, M. (eds.) PERVASIVE 2008. LNCS, vol. 5013, pp. 19–37. Springer, Heidelberg
(2008)
9. Banerjee, A., Konar, A., Janarthana, R., Tibarewala, D.N.: Electro-oculogram Based Clas-
sification of Eye Movement Direction. In: Meghanathan, N., Nagamalai, D., Chaki, N.
(eds.) Advances in Computing & Inf. Technology. AISC, vol. 178, pp. 151–159. Springer,
Heidelberg (2012)
10. Banerjee, A., Datta, S., Pal, M., Konar, A., Tibarewala, D.N., Janarthanan, R.: Classifying
Electrooculogram to Detect Directional Eye Movements. First International Conference on
Computational Intelligence: Modeling Techniques and Applications. Procedia Technolo-
gy 10, 67–75 (2013)
11. Roy Choudhury, S., Venkataramanan, S., Nemade, H.B., Sahambi, J.S.: Design and De-
velopment of a Novel EOG Biopotential Amplifier, International Journal of Bioelectro-
magnetism 7(1), 271–274 (2005)
12. Schlögl, A., Lugger, K., Pfurtscheller, G.: Using adaptive autoregressive parameters for a
brain-computer-interface experiment. In: Proceedings of the 19th Annual International
Conference of Engineering in Medicine and Biology Society 4, 1533–1535 (1997)
13. Pittner, S., Kamarthi, S.V.: Feature extraction from wavelet coefficients for pattern recog-
nition tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence 21(1), 83–88
(1999)
14. Saa, J.F.D., Gutierrez, M.S.: EEG Signal Classification Using Power Spectral Features and
linear Discriminant Analysis: A Brain Computer Interface Application. In: Eighth Latin
American and Caribbean Conference for Engineering and Technology (2010)
15. Welch, P.: The use of fast Fourier transform for the estimation of power spectra: a method
based on time averaging over short, modified periodograms. IEEE Transactions on Audio
and Electroacoustics 15(2), 70–73 (1967)
16. Hjorth, B.: Time domain descriptors and their relation to a particular model for generation
of EEG activity. CEAN-Computerized EEG Analysis, 3–8 (1975)
17. Gunn, S.R.: Support Vector Machines for Classification and Regression. Technical report,
University of Southampton (1998)
Detection of Fast and Slow Hand Movements
from Motor Imagery EEG Signals
1 Introduction
In this study, we have separated the fast and slow execution of the left and right
hand movement according to the instructions given to the subject. The complete
scheme for this study is given in Fig. 1. Following the data acquisition of EEG
signals according to the different mental states, the first step involves the filtering
of the raw signal. In the second step, the features are extracted using Power
Spectral Density, leading to the formation of the feature vectors. The feature
vectors are used as inputs to the classifiers to determine the mental state of the
user. Nine right-handed subjects (4 male and 5 female) participated in this
study over 3 separate sessions organized on 3 different days.
2.3 Pre-processing
Motor imagery signals originate from the primary motor cortex, the supplementary
motor area and the pre-motor cortex of the brain [2]. Thus, locations between
the frontal and parietal lobes contain the maximum information on motor-related
tasks. For this purpose, we have analyzed the signals acquired from the F3, F4,
C3, C4, P3 and P4 electrode locations in this study. Before feature extraction,
we spatially filter the signals from the six electrodes using the Common Average
Referencing (CAR) [2] method to reduce the effect of neighboring locations on
these signals. Then, we temporally band-pass filter the signals in the bandwidth
of 8-25 Hz using an IIR elliptical filter of order 6, with a pass-band ripple of 0.5 dB
and a stop-band attenuation of 50 dB. The merit of selecting the elliptical filter lies
in its good frequency-domain characteristics of sharp roll-off and independent
control over the pass-band and stop-band ripples [15].
Based on the timing sequence of the visual cue, sample points from the 2nd
second to the 5th second (3 seconds in total) are extracted from each trial for
data analysis.
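A sketch of this pre-processing stage is given below: common average referencing over the six channels followed by an 8-25 Hz elliptical band-pass. The 0.5 dB ripple / 50 dB attenuation values follow the usual convention, and N=3 for a band-pass design gives the overall order 6.

import numpy as np
from scipy.signal import ellip, sosfiltfilt

FS = 250  # Hz, EEG sampling frequency

sos = ellip(N=3, rp=0.5, rs=50, Wn=[8, 25], btype="bandpass", fs=FS, output="sos")

def preprocess(eeg):
    """eeg: array of shape (n_channels, n_samples) for F3, F4, C3, C4, P3, P4."""
    car = eeg - eeg.mean(axis=0, keepdims=True)      # common average reference
    return sosfiltfilt(sos, car, axis=1)             # temporal 8-25 Hz band-pass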
P(w_k) = \sum_{n=0}^{N-1} |DFT(x_n)|^2    (1)
In this paper we have selected a Hamming window of size 250 over the complete
frequency range and an overlap percentage of 50%. From the whole frequency
range of 0 to 125 Hz (since 250 Hz is the sampling frequency), only estimates from
the bands of 8-12 Hz and 16-24 Hz are selected to construct the feature vector.
The final dimension of the feature vector is 14. Fig. 3 illustrates an example of
the power spectral density estimates of fast and slow movement for both limbs
based on the EEG obtained from channel location C3.
Fig. 3. The power spectral density estimates from electrode C3 for the four different
mental tasks performed by a subject
2.5 Classification
The classification scheme implemented in this study is shown in Fig. 4. Two
levels of hierarchical classifiers are implemented, where the first level classifies
between the left and right hand movement (Classifier 1) and the second level
differentiates between fast and slow movement (Classifier 2 and 3). Consequently,
the final output is in the form of right-hand fast movement (RHF), left-hand
fast movement (LHF), right-hand slow movement (RHS) and left-hand slow
movement (LHS).
In this paper, we have used support vector machine (SVM), naïve Bayesian
(NB), linear discriminant analysis (LDA) and k-nearest neighbor (kNN) classifiers
for the different levels of the hierarchy [9].
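A minimal sketch of the two-level hierarchy is shown below: classifier 1 separates left from right hand, and classifiers 2 and 3 separate fast from slow movement for each hand. Any of the four classifier families mentioned above could be plugged in; SVM is used here purely as an example, and the label encodings are assumptions.

import numpy as np
from sklearn.svm import SVC

def train_hierarchy(X, hand, speed):
    """X: feature matrix; hand: array of 'L'/'R' per trial; speed: array of 'fast'/'slow'."""
    cl1 = SVC().fit(X, hand)                                   # left vs. right hand
    cl2 = SVC().fit(X[hand == "L"], speed[hand == "L"])        # fast vs. slow, left hand
    cl3 = SVC().fit(X[hand == "R"], speed[hand == "R"])        # fast vs. slow, right hand
    return cl1, cl2, cl3

def predict(x, cl1, cl2, cl3):
    x = np.atleast_2d(x)
    h = cl1.predict(x)[0]
    s = (cl2 if h == "L" else cl3).predict(x)[0]
    return {"L": {"fast": "LHF", "slow": "LHS"},
            "R": {"fast": "RHF", "slow": "RHS"}}[h][s]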
We have also employed the Friedman Test [18] to statistically validate our results.
The significance level is set at α = 0.05. The null hypothesis here states that
all the algorithms are equivalent, so their ranks should be equal. We consider
the mean classification accuracy (from Table 2) as the basis of ranking. Table 2
provides the ranking of each classifier algorithm.
Now, from Table 2, the χ²_F values for the three classifiers (CL1, CL2, CL3) = (16.44,
17.76, 20.61) ≥ 9.488. So the null hypothesis, claiming that all the algorithms
are equivalent, is rejected, and therefore the performances of the algorithms are
determined by their ranks only. It is clear from the table that the rank of NB is
1, so NB outperforms all the other algorithms according to the Friedman Test.
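The Friedman test itself can be reproduced with SciPy as sketched below. The accuracy matrix here is illustrative only (one row per subject, one column per classifier: SVM, NB, LDA, kNN); the actual accuracies are those of Table 2.

import numpy as np
from scipy.stats import friedmanchisquare

acc = np.array([[0.81, 0.88, 0.79, 0.76],     # toy accuracies, one row per subject
                [0.78, 0.86, 0.80, 0.74],
                [0.83, 0.90, 0.82, 0.79],
                [0.80, 0.87, 0.78, 0.77]])

stat, p = friedmanchisquare(*acc.T)            # one argument per classifier column
print("chi-square_F =", stat, "p =", p)        # reject H0 of equal ranks if p < 0.05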
References
4. Bermudez i Badia, S., Garcia Morgade, A., Samaha, H., Verschure, P.F.M.J.: Us-
ing a Hybrid Brain Computer Interface and Virtual Reality System to Monitor
and Promote Cortical Reorganization through Motor Activity and Motor Imagery
Training. IEEE Trans. Neural Sys. Rehab. Eng. 21(2), 174–181 (2013)
5. Bordoloi, S., Sharmah, U., Hazarika, S.M.: Motor imagery based BCI for a maze
game. In: 4th Int. Conf. Intelligent Human Computer Interaction (IHCI), Kharag-
pur, India, pp. 1–6 (2012)
6. Millan, J.R., Rupp, R., Muller-Putz, G.R., Murray-Smith, R., Giugliemma, C.,
Tangermann, M., Vidaurre, C., Cincotti, F., Kubler, A., Leeb, R., Neuper, C.,
Muller, K.R., Mattia, D.: Combining brain-computer interfaces and assistive
technologies: State-of-the-art and challenges. Front. Neurosci. 4, 1–15 (2010)
7. Bhattacharyya, S., Sengupta, A., Chakraborti, T., Konar, A., Tibarewala, D.N.:
Automatic feature selection of motor imagery EEG signals using differential evo-
lution and learning automata. Med. & Bio. Eng. & Comp. 52(2), 131–139 (2014)
8. Zhou, W., Zhong, L., Zhao, H.: Feature Attraction and Classification of Mental
EEG Using Approximate Entropy. In: 27th Ann. Int. Conf. Eng. Med. & Bio. Soc.,
pp. 5975–5978 (2005)
9. Theodoridis, S., Koutroumbas, K.: Pattern Recognition, 4th edn., pp. 13–22. Aca-
demic Press (2009)
10. Qiang, C., Hu, P., Huanqing, F.: Experiment study of the relation between motion
complexity and event-related desynchronization/synchronization. In: 1st Int. Conf.
Neural Interface & Cont. 2005, pp. 14–16 (2005)
11. Chai, R., Ling, S.H., Hunter, G.P., Nguyen, H.T.: Mental non-motor imagery tasks
classifications of brain computer interface for wheelchair commands using genetic
algorithm-based neural network. In: The 2012 Int. Joint Conf. Neural Networks,
pp. 1–7 (2012)
A Solution of Degree Constrained Spanning Tree
Using Hybrid GA with Directed Mutation
Abstract. There is always an urge to reach a goal with minimum effort, i.e., along
a minimally constrained path. The path may be the shortest route in practical
life, in either a physical or an electronic medium. The scenario can be represented as a
graph. Here, we have chosen the degree constrained spanning tree, which should be
generated in real time with minimum turnaround time. The problem is
NP-complete in nature, and the solution approach is, in general, approximate. We
have used a heuristic approach, namely a hybrid genetic algorithm (GA), with
a motivated choice of encoded data structures for the graph; directed mutation
is also incorporated, and the result is so encouraging that we intend
to use it in our future applications.
1 Introduction
Many real-world situations can be described by a set of points inter-connected
as in the handshaking theorem (i.e., a graphical structure) [1]. Graphs play a vital role in
computer science. The Degree Constrained Spanning Tree (DCST) is one of the classic
combinatorial graph problems. It is an NP-complete problem, so, till now, we do
not have an exact algorithm which solves it in polynomial time. Obviously, we must
seek guidelines for solving this problem in a way that involves compromise. The strategies
are usually and roughly divided into two classes: approximation algorithms and
heuristic approaches [2]. Heuristic approaches usually find reasonably good solutions
reasonably fast. GA, one of the well-known heuristic search techniques, tries to emulate
Darwin's evolutionary process. It deals efficiently with digitally encoded problems, and the
DCST problem is in perfect unison with it. Our proposed solutions may not be optimal but are
acceptable in some sense. After running several times, the GA will converge each time,
possibly at different optimal chromosomes, and the schemata which promise convergence
are actually indicative of the regions in the search space where good chromosomes
may be found. Therefore, the GA is coupled with a random Depth First
Search (DFS) mechanism to find the optimal chromosome in a region. This is a kind
of hybridization of our algorithm. The data structure employed here is the graphical
edge-set representation [3]. In addition, it is also shown how a problem-dependent
Problem definition: Given a simple, symmetric and connected graph G = (V, E), where V is
the set of vertices and E denotes the set of edges, with a positive integer N ≤ |V|, the
problem is to find a spanning tree (ST) T in G which contains no vertex whose
degree is larger than N, where N > 0 [4].
Narula and Ho first proposed this problem in the context of designing electrical circuits [5].
The Hamiltonian path problem is a special case of this problem in which the degree of every
vertex of the spanning tree has the upper bound 2; finding such a path is related to
the travelling salesman problem. The Hamiltonian path and travelling salesman problems
are both examples of NP-complete problems [4]. In the general case, a spanning tree may
have degree up to |V| - 1. Finding a DCST in a graph is usually a hard task.
3 Solution Methodology
Fig. 1. A graph G and two individuals created from it
The fitness function gives a value for each chromosome, which helps us to identify
the good individuals in the population. The definition of the fitness function varies
from problem to problem, depending upon the objective function. In this problem, the
fitness function is a fraction, the reciprocal of the maximum degree of each
chromosome (the degree of the tree). The ST with the lowest maximum degree represents
the fittest individual in the population.
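As a concrete illustration of this fitness definition, the following minimal sketch (assuming an edge-set representation given as a plain list of vertex pairs; the helper name `fitness` is illustrative, not taken from the paper) computes the reciprocal of the maximum vertex degree of a candidate spanning tree.

```python
from collections import Counter

def fitness(edge_set):
    """Fitness of a spanning tree given as an edge set [(u, v), ...]:
    the reciprocal of the maximum vertex degree, so lower-degree trees
    score higher."""
    degree = Counter()
    for u, v in edge_set:
        degree[u] += 1
        degree[v] += 1
    return 1.0 / max(degree.values())

# Example: a star on 4 vertices has max degree 3 -> fitness 1/3,
# while a path has max degree 2 -> fitness 1/2 (the fitter tree).
star = [(0, 1), (0, 2), (0, 3)]
path = [(0, 1), (1, 2), (2, 3)]
print(fitness(star), fitness(path))
```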
Fig. 2. Two individuals created through graphical edge-set crossover from the individuals
edge from a particular set and the deletion of an edge from the cycle thus created in the
tree, depending on the objective function, which leaves a new valid tree with better or
the same fitness as the previous one. Fig. 3 gives an idea of how a new individual is
created from the old one using the graphical edge-set directed mutation algorithm, and it
also shows how its degree is better than or the same as that of the old individual. The
computational effort for the procedure is O(|E|).
Fig. 3. An individual is created (right) through mutation from another individual (left)
added edge. The created cycle also includes the vertices with the maximum degree or the
second maximum degree. To remove the cycle, the algorithm removes an edge which is
adjacent to a maximum degree vertex or a second maximum degree vertex. Removing such an
edge leaves a valid tree, and the degrees of the adjacent vertices remain either the same
or are reduced by one. This proves the lemma.
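A hedged sketch of this directed mutation step, using the networkx library, is shown below; the names `directed_mutation`, `tree_edges` and `candidate_edge` are illustrative, and the paper's exact tie-breaking rules are not reproduced.

```python
import networkx as nx

def directed_mutation(tree_edges, candidate_edge):
    """Add an edge to the tree, locate the cycle it creates, and delete a
    cycle edge incident to a highest-degree vertex, aiming to keep the
    maximum degree from growing."""
    G = nx.Graph(tree_edges)
    G.add_edge(*candidate_edge)
    cycle = nx.find_cycle(G)                      # edges of the unique cycle
    worst = max(G.degree, key=lambda d: d[1])[0]  # vertex of largest degree
    removable = [e for e in cycle
                 if worst in e and set(e) != set(candidate_edge)]
    if not removable:                             # fall back to any cycle edge
        removable = [e for e in cycle if set(e) != set(candidate_edge)]
    G.remove_edge(*removable[0])
    return list(G.edges)

# Star tree 0-1, 0-2, 0-3 plus candidate edge (2, 3):
# the cycle edge (0, 2) is removed and the max degree drops from 3 to 2.
print(directed_mutation([(0, 1), (0, 2), (0, 3)], (2, 3)))
```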
5 Conclusion
This paper describes an approach to solving the degree constrained spanning tree problem
using a hybrid GA. The method approaches the optimal solution by randomly generating
a series of chromosomes, each representing a spanning tree in the form of an edge set,
which are then operated on in the manner of genetic algorithms.
Our results show that this approach is competitive with existing traditional approximation
algorithms. The research work shows that it is very important and crucial to
choose the encoding and selection strategy when optimizing combinatorial problems, and
the graphical edge-set representation is a fine encoding scheme for generating trees
for this problem. Most importantly, our algorithm produces only feasible
candidate solutions, whenever possible. Initial population generation, crossover and
directed mutation are each done in O(|E|) time. After each iteration, weaker individuals
are replaced by fitter individuals, so the average fitness value of the population also increases.
This problem has great importance in wireless ad-hoc and mobile networks at the time of
routing. If it is possible to dynamically generate a spanning tree of minimum degree at
routing time, then noise and congestion will be reduced and routing will be faster.
Similarly, it can be used in traffic signalling systems, air traffic control, electrical
networks etc. We are optimistic that our approach will open up a vista of new solutions
to a range of practical problems.
References
1. Rosen, K.H.: Discrete Mathematics and its Applications. TMH Edition (2000)
2. SenSarma, S., Dey, K.N., Naskar, S., Basuli, K.: A Close Encounter with Intractability. So-
cial Science Research Network (2009)
3. Raidl, G.R.: An Efficient Evolutionary Algorithm for the Degree Constrained Minimum
Spanning Tree Problem. In: Proceedings of the 2000 Congress on Evolutionary Computa-
tion, pp. 104–111 (2000)
4. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of
NP-Completeness. W.H. Freeman and Company (1999)
5. Narula, S.C., Ho, C.A.: Degree-constrained minimum spanning tree. Computers and Opera-
tions Research 7, 239–249 (1980)
6. Zhou, G., Gen, M., Wu, T.: A New Approach to the Degree-Constrained Minimum Span-
ning Tree Problem Using Genetic Algorithm. In: IEEE International Conference on Systems,
Man and Cybernetics, pp. 2683–2688 (1996)
7. Knowles, J., Corne, D.: A New Evolutionary Approach to the Degree-Constrained Mini-
mum Spanning Tree Problem. IEEE Transaction on Evolutionary Computation 4, 125–134
(2000)
8. Zeng, Y., Wang, Y.: A New Genetic Algorithm with Local Search Method for Degree-
Constrained Minimum Spanning Tree problem. In: IEEE International Conference on Com-
putational Intelligence and Multimedia Applications, pp. 218–222 (2003)
9. Zhou, G., Meng, Z., Cao, Z., Cao, J.: A New Tree Encoding for the Degree-Constrained
Spanning Tree Problem. In: IEEE International Conference on Computational Intelligence
and Security, pp. 85–90 (2007)
Side Lobe Reduction and Beamwidth Control
of Amplitude Taper Beam Steered Linear Array
Using Tschebyscheff Polynomial and Particle Swarm
Optimization
Abstract. The present paper examines various aspects of a beam steered linear
array of isotropic radiators with uniform inter-element spacing. In order to
control beam broadening and to achieve side lobe level (SLL) reduction in a
beam steered array, a novel method is proposed that primarily modifies the
search space definition for Particle Swarm Optimization (PSO). The Tschebyscheff
polynomial and PSO have been used to develop the proposed method: the search
space for PSO is defined using the Tschebyscheff polynomial for an amplitude
taper beam steered linear array. With this information about where to search,
PSO finds the optimum excitation amplitudes of the beam steered linear array to
achieve a reduced side lobe level, and optionally a narrow beamwidth, within
the beam steering range.
1 Introduction
The synthesis problem of reduced SLL and narrow beamwidth for linear arrays has been
the subject of many investigations to date. Conventional analytical methods such as
the Tschebyscheff or Taylor method are widely used [1],[2] for solving synthesis problems
related to uniformly spaced linear arrays of isotropic elements. For linear arrays, Dolph
used the Tschebyscheff polynomial to obtain a radiation pattern with equal SLL.
Another synthesis problem related to linear arrays is that of beam steering or beam
scanning, widely used in Radio Detection and Ranging (RADAR) applications. Beam
steering in a linear array can be achieved by either varying the phase excitations or
varying the frequency of operation [3],[4].
Conventional techniques cannot be used to solve multi-objective synthesis
problems. This led to the introduction of evolutionary computing techniques [5],[7].
These techniques are used for achieving a single objective, like SLL reduction, or
multiple objectives, like SLL reduction and null placement. One such evolutionary
algorithm that is extensively used for solving different synthesis problems related to
linear arrays is Particle Swarm Optimization (PSO) [6],[7]. PSO has advantages
$$AF(\theta) = \sum_{n=1}^{N} a_n \cos\!\left[\frac{2n-1}{2}\,kd\,(\cos\theta - \cos\theta_s)\right] \qquad (1)$$
From (1) it can be said that the array factor of an even-numbered linear array is a
summation of cosine terms. This form is the same as that of the Tschebyscheff
polynomials [1]. The unknown coefficients of the array factor can be determined by
equating the series representing the cosine terms of the array factor to the appropriate
Tschebyscheff polynomial [2]. The order of the polynomial should be one less than
the total number of elements of the array.
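For reference, a minimal numerical sketch of the array factor in (1) is given below; the helper name `array_factor` and the parameter choices (element spacing expressed in wavelengths, steering angle in degrees) are assumptions of this sketch, not values fixed by the paper.

```python
import numpy as np

def array_factor(theta_deg, amplitudes, d_lambda=0.5, steer_deg=90.0):
    """Array factor of a symmetric 2N-element linear array per (1):
    AF(theta) = sum_n a_n cos(((2n-1)/2) k d (cos(theta) - cos(theta_s)))."""
    theta = np.radians(theta_deg)
    k = 2 * np.pi                        # wavenumber, with d given in wavelengths
    psi = k * d_lambda * (np.cos(theta) - np.cos(np.radians(steer_deg)))
    n = np.arange(1, len(amplitudes) + 1)
    return sum(a * np.cos((2 * m - 1) / 2.0 * psi) for m, a in zip(n, amplitudes))

# 12-element (N = 6) uniformly excited array steered to 80 degrees
theta = np.linspace(0, 180, 721)
af = array_factor(theta, [1.0] * 6, steer_deg=80.0)
af_db = 20 * np.log10(np.abs(af) / np.max(np.abs(af)) + 1e-12)
```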
Fig. 1. Geometry of the 2N-isotropic element symmetric linear array placed along the y-axis
PSO is a stochastic optimization technique that has been effectively used to solve
multidimensional discontinuous optimization problems in a variety of fields [8],
[9],[10]. Let the optimization objective be to achieve the desired SLL reduction by
varying the excitation amplitudes. In PSO, a swarm of bees searches for the location
where the desired SLL is present; the swarm of bees here represents the possible
solution sets. The location of each solution set is defined by the values of the
excitation amplitudes. During the search process each solution set updates its location
and velocity based on two pieces of information. The first is its ability to remember
the previous location where it found the desired SLL (particle best, pbest). The second
is the location where an SLL closest to the desired SLL has been found by all the
solution sets (bees) of the swarm (global best, gbest) at the present instant of time.
This process of continuously updating velocity and position continues until one of the
solution sets (bees) finds the location of the desired SLL within the defined search space.
This location gives the value of the optimum excitation amplitudes. In the present
application of PSO, the particle's velocity is manipulated according to the following equation:
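The update referred to here follows the standard PSO form introduced by Kennedy and Eberhart [8] (with an inertia weight w as in later variants); in the usual notation it can be written as:

```latex
v_{id}(t+1) = w\,v_{id}(t) + c_1 r_1\bigl(p_{id} - x_{id}(t)\bigr) + c_2 r_2\bigl(g_{d} - x_{id}(t)\bigr),
\qquad x_{id}(t+1) = x_{id}(t) + v_{id}(t+1)
```

where p_id is the particle best, g_d the global best, and r1, r2 are uniform random numbers in [0, 1].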
The criterion for termination depends upon either of the two end results stated: the
iteration continues until the pre-defined number of iterations is reached or until the
desired value is acquired, whichever occurs first.
4 Proposed Method
Conventional PSO algorithm initially does not know where to search for the optimum
values. The search space has to be defined randomly in order to begin the optimization
process. In the proposed method the excitation amplitude of each array element is
calculated using the method developed by Dolph [2]. From the solution upper and
lower limit of search space is defined for the optimization process. This facilitates PSO
with the knowledge of where to search. In PSO concept of fitness function guides the
bees during their search for the optimum position within the search space. If it is
required to obtain only the desired SLL (SLL (dB)d) then the following function is used
to assess the fitness.
$$f = \mathrm{SLL(dB)_d} - \max\!\left(20\log_{10}\frac{|AF(\theta)|}{|AF(\theta)|_{\max}}\right) \qquad (4)$$
The above fitness function has two parts: the first part is the desired SLL and the second
part is the calculated SLL obtained from the PSO algorithm. At each iteration, by varying
the excitation amplitudes, PSO tries to reduce this difference until it reaches 0. To
obtain both the desired SLL (SLL(dB)d) and the desired beamwidth, the following function
is used to assess the fitness.
$$f = \mathrm{SLL(dB)_d} - \max\!\left(20\log_{10}\frac{|AF(\theta)|}{|AF(\theta)|_{\max}}\right) + \beta\,\bigl|\mathrm{FNBW_c} - \mathrm{FNBW_d}\bigr| \qquad (5)$$
where FNL and FNR are the left and right first nulls around the broadside main beam. For
a symmetrically specified beamwidth (BW), these values are defined from the null positions
as follows: null_left represents the set of null positions (values of 0) in the array factor
expression less than θs, and null_right represents the set of null positions in the array
factor expression greater than θs.
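A minimal sketch of how fitness functions (4) and (5) could be evaluated numerically is given below; the `sidelobe_mask` argument, which flags the angular samples outside the main beam, and the default values are assumptions of this sketch.

```python
import numpy as np

def fitness_sll(af_db, sidelobe_mask, sll_desired_db=-20.0,
                fnbw_current=None, fnbw_desired=None, beta=1.0):
    """Fitness per (4): gap between the desired SLL and the highest side lobe
    of the normalized pattern (in dB); optionally add the beamwidth-mismatch
    penalty of (5)."""
    sll_calculated = np.max(af_db[sidelobe_mask])      # highest side-lobe level
    f = sll_desired_db - sll_calculated                 # the (4) term
    if fnbw_current is not None and fnbw_desired is not None:
        f += beta * abs(fnbw_current - fnbw_desired)    # extra term of (5)
    return f
```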
In this section the capability of the proposed method is demonstrated for checking beam
broadening and achieving the desired SLL in the case of a beam steered uniformly spaced
linear array. If the optimization objective is to achieve only the desired SLL, then the
fitness function defined in (4) is used. Using this fitness function and
considering a linear array of 12 isotropic elements, PSO finds the optimum excitations
for different main beam positions within the defined scan angle range for a desired
SLL of -20 dB. Fig. 2 shows the radiation pattern of an optimized 12-element linear
array and a conventional Dolph-Tschebyscheff array. The main beam is positioned at
80 degrees. The corresponding convergence curve is shown in Fig. 3. The number of
iterations in the convergence curves (Fig. 3, Fig. 5) is the number of iteration cycles
in the best run.
Different parameters for this array with different main beam positions are
summarized in Table 1. From Table 1 it is observed that the desired objective is met
for most values of the beam steering position. Only for values of beam steering close
to the end fire positions (0° or 180°) is the desired SLL not achieved, although it
remains close to the desired value.
Fig. 2. Radiation pattern of the 12-element Dolph-Tschebyscheff array and the optimized linear
array with desired SLL of -20 dB and main beam at 80 degrees
Fig. 3. Convergence curve for the 12-element optimized linear array with main beam placed at 80°
and SLL of -20 dB
Using the proposed method, both SLL reduction and beamwidth control are achieved
for the beam steered linear array. A 12-element linear array of isotropic
elements with uniform spacing of 0.5λ is considered as an example. The desired SLL
is set at -20 dB and, for different main beam positions, the broadside FNBW is taken
as the desired FNBW. The radiation patterns of the conventional Dolph-Tschebyscheff array
and the optimized array are given in Fig. 4. The number of elements in both cases is 12 and
the main beam position is at 120°.
Table 1. Parameters of the uniformly spaced 12-element optimized array for different main
beam positions and a desired SLL of -20 dB
Fig. 4. Radiation pattern of the 12-element Dolph-Tschebyscheff array and the optimized linear array
at desired SLL of -20 dB with main beam at 120 degrees
Fig. 5. Convergence curve for 12 elements linear array with main beam placed at 120° and SLL
of -20 dB
Table 2. Excitation amplitudes and directivity of the 12-element optimized array for different main
beam positions and SLL of -20 dB
θs (deg.)   Directivity (dB)   Optimized array (excitation amplitudes)
80 10.6364 2.0871 1.9724 1.7778 1.4491 1.1959 1.4473
70 10.6302 2.0239 1.9277 1.7185 1.3622 1.2313 1.3079
60 10.6969 1.9053 1.8228 1.6585 1.3679 1.2277 1.6694
50 10.6638 1.7067 1.7487 1.4227 1.4226 1.3871 2.1995
40 10.4833 1.3391 1.1998 1.3032 1.1896 1.8507 2.3195
100 10.6091 2.0547 1.8565 1.7426 1.3091 1.2216 1.2516
110 10.6264 2.0493 1.8972 1.8010 1.3254 1.2642 1.3103
120 10.7001 1.8695 1.8233 1.4441 1.5305 1.1920 1.5459
130 10.6842 1.7738 1.5386 1.3630 1.5058 1.3100 2.0448
140 10.599 1.1952 1.3872 1.2437 1.4313 1.8598 2.0847
6 Conclusion
In this paper, various aspects of a non-uniformly excited beam steered symmetric linear
array with equal SLL have been described elaborately. Simulated results reveal that, for a
particular linear array configuration, there is a limit to the beam steering positions. It has
been shown that in a beam steered linear array the main beam is not symmetrical around
the main beam position, and that beam steering introduces beam broadening in a linear
array. In order to address the problem of beam broadening along with reduced SLL, a
new method has been proposed, developed using the Tschebyscheff polynomial and PSO.
Simulated results show that the proposed method can handle the single objective of SLL
reduction as well as the multiple objectives of SLL reduction and beamwidth control for a
linear array within the beam steering range. In the multiple-objective scenario, however,
the proposed method only partially achieves the desired objectives and needs further
investigation for beam steered linear arrays with reduced SLL and narrow beamwidth.
References
1. Taylor, T.T.: One Parameter Family of Line Sources Producing Modified Sin(πu)/πu
Patterns. Hughes Aircraft Co. Tech. Mem. 324, Culver City, Calif. Contract AF 19(604)-
262-F-14 (1953)
2. Dolph, C.: A Current Distribution for Broadside Arrays which Optimizes the Relationship
between Beamwidth and Side Lobe Level. Proceedings of the IRE 34(5), 335–348 (1946)
3. Balanis, C.A.: Antenna Theory: Analysis and Design, 3rd edn. John Wiley, New York
(2005)
4. Mailloux, R.J.: Phased Array Antenna Handbook, 2nd edn. Artech House Inc. (2005)
5. Yan, K.K., Lu, Y.: Sidelobe reduction in array pattern synthesis using genetic algorithm.
IEEE Transactions on Antennas and Propagation 45(7), 1117–1121 (1997)
6. Khodier, M.M., Christodoulou, C.G.: Linear array geometry synthesis with minimum
sidelobe level and null control using particle swarm optimization. IEEE Transactions on
Antennas and Propagation 53(8), 2674–2679 (2005)
7. Chatterjee, S., Chatterjee, S., Poddar, D.R.: Side Lobe Level Reduction of a Linear Array
using Tschebyscheff Polynomial and Particle Swarm Optimization. In: International
Conference on Communications, Circuits and Systems, KIIT University (2012)
8. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of IEEE
International Conference Neural Networks, vol. IV, pp. 1942–1948 (1995)
9. Ratnaweera, A., Halgamuge, S.K., Watson, H.C.: Self-Organizing Hierarchical Particle
Swarm Optimizer with Time-Varying Acceleration Coefficients. IEEE Transactions on
Evolutionary Computation 8(3), 240–255 (2004)
10. Venter, G., Sobieszczanski-Sobieski, J.: Particle Swarm Optimization. AIAA
Journal 41(8), 1583–1589 (2003)
Pervasive Diary in Music Rhythm Education: A Context-
Aware Learning Tool Using Genetic Algorithm
1 Introduction
identify the learning objectives that the user is really interested in; to propose learning
activities to the user; and to lead the user around a learning environment consisting of
various computing devices like PDAs, wireless sensors, and servers.
A Genetic Algorithm can be considered a multi-objective optimization technique
that performs optimization by simulating the biological natural law of evolution. A
population of arbitrarily generated solutions is initialized, and each solution is
evaluated to determine its fitness value. A terminating condition is then tested: if the
solutions are good enough, the algorithm terminates; otherwise the solutions have to be
optimized again. The best solutions are picked from the initial set of chromosomes, and
the chromosomes having higher fitness values exchange their information to acquire
better solutions. Some small percentage of the solutions obtained after crossover may
then be randomly mutated. Each solution is then evaluated again and the termination
condition is re-checked. Music rhythmology is a context-aware learning process and is
applicable to pervasive education, as a huge number of students are interested in
gathering musical knowledge.
The idea of the tool is to introduce an XML schema for the sensor data collected through
sensor devices. A context rhythm editor is also designed that employs this schema to
create versatile rhythms using a Genetic Algorithm, supporting intelligent context mapping
between the sensor devices and the context model.
2 Related Works
There are some previous research works that inspired these experiments and
contribute a lot to automated musical research. Some work has been done on the use of
pervasive computing in teaching and learning environments. Paper [1] presents two
approaches for the scalable tracking of mobile object trajectories and the efficient
processing of continuous spatial range queries. At ASU, a smart classroom [2-3] was
built that used pervasive computing technology to enhance collaborative learning;
pervasive computing devices enable children to utilize a pen reader to store text from a
book into their PDAs, and users like to employ the same device in diverse fields.
Paper [4] examines numerous perceptual concerns in the machine recognition of musical
patterns; it also proposes several measures of rhythm complexity and a novel technique
to resolve a restricted tonal context. Further, some other papers [5], [9] deal with the
implementation of musical pattern recognition by mathematical expressions. Other papers
are based on the creation of music using Genetic Algorithm techniques: these
contributions describe studies of the usefulness of Genetic Algorithms [6-7] for music
composition. The paper [8] produces a new rhythm from a pre-defined rhythm applied
to the initial population using modified Genetic Algorithm operators. Another paper
introduces a new concept of automatically generating realistic drum-set rhythms using a
Genetic Algorithm [10], [14]. Some researchers introduce a system for recognizing the
patterns of Indian music by using key sequences with a Median Filter, and another effort
suggests three measurement techniques of rhythm complexity in a system for machine
recognition of music patterns, which also finds pitch and rhythm errors [13].
Additionally, in several efforts the authors have described musical pattern
recognition and the exposure of rhythmic features in an object-oriented manner [11-12].
A lot of discussion is available [15-16] about feature classification using Petri Nets
and the implementation of complete musical compositions of vocal and rhythmic cycles
for percussion-based tempo.
Rhythm plays a very important role in performing music. Music is the systematic
combination of tempo and rhythm. To generate offspring rhythms, some common
terminologies and their meanings in rhythm have to be known. They are given in Table 1.
4 Experimental Details
Some applications of the Music Rhythm Selector Diary in the field of pervasive music
education and learning are: raga identification, pattern identification of music
rhythms, versatile rhythm beat creation and musicology.
The proposed algorithm is also applicable to social network sites. The flowchart of the
proposed work is depicted in Fig. 2 and the overall architecture of this anticipated
concept is demonstrated in Fig. 3.
Initial rhythms:
Rhythm 1: 1 0 1 0 0
Rhythm 2: 1 1 0 0 0 1
Rhythm 3: 1 0 1 0 1
Rhythm 4: 1 0 1 0
Rhythm matrix (first four beats of each rhythm):
Rhythm 1: 1 0 1 0
Rhythm 2: 1 1 0 0
Rhythm 3: 1 0 1 0
Rhythm 4: 1 0 1 0
In Table 2 above, Rhythm 4 has the lowest rhythm length, so the first 4 beats have
been taken from each rhythm to generate the rhythm matrix. The creation of the rhythm
matrix from the initial rhythms is described in Table 3.
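A small sketch of this truncation step, using the rhythm values reconstructed above, could look as follows (the helper name `rhythm_matrix` is illustrative):

```python
def rhythm_matrix(rhythms):
    """Truncate every initial rhythm to the length of the shortest one
    (here Rhythm 4, 4 beats) to form the rhythm matrix."""
    beats = min(len(r) for r in rhythms)
    return [r[:beats] for r in rhythms]

initial = [
    [1, 0, 1, 0, 0],        # Rhythm 1
    [1, 1, 0, 0, 0, 1],     # Rhythm 2
    [1, 0, 1, 0, 1],        # Rhythm 3
    [1, 0, 1, 0],           # Rhythm 4 (shortest)
]
print(rhythm_matrix(initial))   # first four beats of each rhythm
```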
4.5 Parent Music Rhythm Selection Strategy Using Linear Rank Method
Selection is the first most important operator of Genetic Algorithm that applied on the
initial population or chromosomes. Chromosomes are selected in the population to be
parents to crossover and produce offspring. Linear Rank Selection mechanism is
being utilized for selection process.
$$\text{Scaled Rank} = SP - 2\times\frac{(\text{Rank}-1)(SP-1)}{(N-1)} \qquad (1)$$
Step 6: Choose two strings that have lowest Scaled Rank values in both SP = 1.9 and
SP = 1.1.
The following six individual chromosomes are created as the initial population. After
obtaining all the initial chromosomes, the Linear Rank Selection mechanism is applied
for parent selection. The six chromosomes are:
String 1 = 1111111110000111
String 2 = 1000100000000000
String 3 = 1001100100000001
String 4 = 1001001111001100
String 5 = 1100110000001100
String 6 = 1001001100001100
From Table 4, the two fittest chromosomes for the crossover operation are those with the
first and second lowest Scaled Rank. In Table 4, the two parent rhythms are Chromosome 1
and Chromosome 5. Therefore String 1 and String 5 have been chosen as Parent 1 and
Parent 2 to produce a better offspring rhythm, as they have the lowest Scaled Rank values
for both Selective Pressure factors.
The design of the Pervasive Diary is depicted in Fig. 4. It consists of three buttons for
operating this work. Fig. 5 depicts the inputs taken from the user in the Pervasive Diary
software. Fig. 6 represents the parent rhythm selection for rhythm generation using the
Linear Rank selection mechanism.
Fig. 4. Designing interface of the pervasive diary    Fig. 5. User interface for taking inputs
Fig. 6. User interface of parent rhythm selection using the rank selection algorithm
6 Conclusion
References
1. Kurt, R., Stephan, S., Ralph, L., Frank, D., Tobiasm, F.: Context-aware and quality-aware
algorithms for efficient mobile object management. Elsevier Journal of Pervasive and
Mobile Computing, 131–146 (2012)
2. Bonwell, C., Eison, J.: Active learning: Creating excitement in the classroom. ASHE-
ERIC Higher Education Report No. 1, George Washington University, Washington, DC
(1991)
3. Yau, S.S., Gupta, S.K.S., Karim, F., Ahamed, S.I., Wang, Y., Wang, B.: Smart Classroom:
Enhancing Collaborative Learning Using Pervasive Computing Technology. In:
Proceedings of Annual Conference of American Society of Engineering Education (2003)
4. Shmulevich, I., Yli-Harja, O., Coyle, E.J., Povel, D., Lemström, K.: Perceptual Issues in
Music Pattern Recognition: Complexity of Rhythm and Key Finding. Computers and
the Humanities, Kluwer Academic Publishers, pp. 23–35 (2001)
5. Chakraborty, S., De, D.: Pattern Classification of Indian Classical Ragas based on Object
Oriented Concepts. International Journal of Advanced Computer engineering &
Architecture 2, 285–294 (2012)
6. Gartland-Jones, A., Copley, P.: The Suitability of Genetic Algorithms for Musical
Composition. Contemporary Music Review 22(3), 43–55 (2003)
7. Matic, D.: A Genetic Algorithm for composing Music. Proceedings of the Yugoslav
Journal of Operations Research 20(1), 157–177 (2010)
8. Dostal, M.: Genetic Algorithms as a model of musical creativity – on generating of a
human-like rhythmic accompaniment. Computing and Informatics 22, 321–340 (2005)
9. Chakraborty, S., De, D.: Object Oriented Classification and Pattern Recognition of Indian
Classical Ragas. In: Proceedings of the 1st International Conference on Recent Advances
in Information Technology (RAIT), pp. 505–510. IEEE (2012)
10. Alfonceca, M., Cebrian, M., Ortega, A.: A Fitness Function for Computer-Generated
Music using Genetic Algorithms. WSEAS Trans. on Information Science &
Applications 3(3), 518–525 (2006)
11. De, D., Roy, S.: Polymorphism in Indian Classical Music: A pattern recognition approach.
In: Proceedings of International Conference on communications, Devices and Intelligence
System (CODIS), pp. 632–635. IEEE (2012)
12. De, D., Roy, S.: Inheritance in Indian Classical Music: An object-oriented analysis and
pattern recognition approach. In: Proceedings of International Conference on RADAR,
Communications and Computing (ICRCC), pp. 296–301. IEEE (2012)
13. Bhattacharyya, M., De, D.: An approach to identify thaat of Indian Classical Music. In:
Proceedings of International Conference of Communications, Devices and Intelligence
System (CODIS), pp. 592–595. IEEE (2012)
14. Chakrabarty, S., De, D.: Quality Measure Model of Music Rhythm Using genetic
Algorithm. In: Proceedings of the International Conference on RADAR, Communications
and Computing (ICRCC), pp. 203–208. IEEE (2012)
15. Roy, S., Chakrabarty, S., Bhakta, P., De, D.: Modelling High Performing Music
Computing using Petri Nets. In: Proceedings of International Conference on Control,
Instrumentation, Energy and Communication, pp. 757–761. IEEE (2013)
16. Roy, S., Chakrabarty, S., De, D.: A Framework of Musical Pattern Recognition using Petri
Nets. Accepted In: Emerging Trends in Computing and Communication 2014, Springer-
Link Digital Library, Submission No- 60 (2013)
An Elitist Binary PSO Algorithm for Selecting
Features in High Dimensional Data
1 Introduction
Many high dimensional datasets, such as microarray gene expression data, text documents,
digital images, and clinical data, have thousands of features, many of which are
irrelevant or redundant. Unnecessary features increase the size of the search space and
make generalization more difficult. This curse of dimensionality is a major obstacle in
machine learning and data mining; when data is high dimensional, the process of learning
is very hard [1]. Hence, feature reduction techniques are an important tool to reduce and
select useful feature subsets that maximize prediction or classification accuracy.
There are two main reasons to keep the dimensionality (i.e., the number of features) as
small as possible: measurement cost and classification accuracy. A limited yet salient
feature set simplifies both the pattern representation
and the classifiers that are built on the selected representation. Consequently,
the resulting classifier will be faster and will use less memory. Moreover, as stated
earlier, a small number of features can alleviate the curse of dimensionality when
the number of training samples is limited. On the other hand, a reduction in the
number of features may lead to a loss in the discrimination power and thereby
lower the accuracy of the resulting recognition system.
There are two main dimension reduction methods: (i) feature extraction and (ii) feature
selection. Feature extraction methods determine an appropriate subspace of
dimensionality n (either in a linear or a nonlinear way) in the original
2 Preliminaries
This section formally describes the basics of binary particle swarm optimization,
the dominance criteria and the non-dominated sorting algorithm, along with the k-NN
classifier, for the continuity and understanding of the present work.
$$V_{id}(t+1) = w\,V_{id}(t) + c_1\rho_1\bigl(P_{id}(t) - X_{id}(t)\bigr) + c_2\rho_2\bigl(P_g(t) - X_{id}(t)\bigr) \qquad (1)$$
The above update process is applied to all dimensions and all particles.
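A minimal sketch of one binary-PSO step is given below: velocities follow (1) and the bits are then resampled through a sigmoid transfer function, in the spirit of the discrete binary PSO of [8]; the paper's own binarization equation is not reproduced in this excerpt, so the sigmoid rule here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def bpso_step(X, V, P, G, w=0.7, c1=2.0, c2=2.0, v_clip=4.0):
    """One binary-PSO step: update velocities per (1), clip to the velocity
    limits, then resample each bit with probability sigmoid(velocity)."""
    r1, r2 = rng.random(X.shape), rng.random(X.shape)
    V = w * V + c1 * r1 * (P - X) + c2 * r2 * (G - X)
    V = np.clip(V, -v_clip, v_clip)                    # velocity limits [-4, 4]
    X = (rng.random(X.shape) < 1.0 / (1.0 + np.exp(-V))).astype(int)
    return X, V
```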
The concept of optimality behind multi-objective optimization [9] deals with
a set of solutions. The conditions for a solution to be dominated with respect
to the other solutions are given as follows. If there are M objective functions, a
solution s1 is said to dominate another solution s2 if both of the following conditions
are true: 1) the solution s1 is no worse than s2 in all the M objective functions;
2) the solution s1 is strictly better than s2 in at least one of the M objective
functions. Otherwise, the two solutions are non-dominating to each other. When
a solution i dominates a solution j, then rank ri < rj . The major steps for finding
the non-dominated set in a population P of size |P | are outlined as follows in
Algorithm 1.
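Algorithm 1 itself is not reproduced here; a minimal sketch of the dominance test and of extracting the non-dominated set from a list of objective vectors (maximization assumed) could look like this:

```python
def dominates(s1, s2):
    """s1 dominates s2 if it is no worse in every objective and strictly
    better in at least one (maximization assumed)."""
    return all(a >= b for a, b in zip(s1, s2)) and any(a > b for a, b in zip(s1, s2))

def non_dominated(population):
    """Return the solutions not dominated by any other member."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q is not p)]

print(non_dominated([(0.9, 0.2), (0.8, 0.5), (0.7, 0.5), (0.6, 0.9)]))
```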
3 Proposed Approach
3.1 Preprocessing Data
Preprocessing aims at eliminating ambiguously expressed genes as well as the
constantly expressed genes across the tissue classes. Normalization is performed on
each of the attributes so that it falls between 0.0 and 1.0. This helps us to give equal
priority to each of the attributes, as there is no way of knowing which
attributes/features are important or unimportant. The gene data set, i.e., the
continuous attribute value table, is normalized to the range (0, 1). Then we choose
thresholds Thi and Thf based on the idea of quartiles [10], and convert the feature
values to 0, 1 or '*' as follows:
If a ≤ Thi, put '0'; if a ≥ Thf, put '1'; else put '*' (don't care).
We find the average number of '*' occurrences over the feature table and choose this as
the threshold Thd. Those attributes whose number of '*'s is ≥ Thd are removed from the
table. The distinction table is prepared accordingly, as done in [10].
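The discretization and filtering rules above can be sketched as follows; the thresholds are passed in explicitly, and the quartile-based choice of Thi and Thf from [10] is not reproduced.

```python
import numpy as np

def discretize(column, th_i, th_f):
    """Binarization rule: values at or below Th_i become '0', values at or
    above Th_f become '1', the rest become the don't-care symbol '*'."""
    return ['0' if a <= th_i else '1' if a >= th_f else '*' for a in column]

def drop_noisy_features(table, th_i, th_f):
    """Discretize every normalized attribute and drop those whose count of
    '*'s is at least the average '*' count Th_d."""
    coded = [discretize(col, th_i, th_f) for col in table]
    th_d = np.mean([c.count('*') for c in coded])
    return [c for c in coded if c.count('*') < th_d]
```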
The distinction table consists of N columns, with rows corresponding only to those
object pairs that need to be discerned, where '1' signifies the presence of a gene and
'0' its absence. If a dataset contains two classes, then the number of rows in the
distinction table becomes (C1 × C2) < C(C−1)/2, where C1 + C2 = C. A sample
distinction table is shown in Table 1.
Here, assume that there are five conditional features {f1, f2, f3, f4, f5}, so the
length of a vector is N = 5. In a vector v, the binary value '1' indicates that the
corresponding feature is present, and '0' indicates its absence. The two classes
are C1 (C11, C12) and C2 (C21, C22, C23). The rows represent the object pairs and the
columns represent the features or attributes. The objective is to choose a minimal
number of columns (features) from Table 1 that covers all the rows (i.e., the object
pairs in the table), as defined by the fitness functions.
Table 1. A sample distinction table
          f1 f2 f3 f4 f5
(C11 , C21 ) 1 0 1 0 1
(C11 , C22 ) 0 0 0 1 0
(C11 , C23 ) 1 0 1 0 0
(C12 , C21 ) 0 0 1 0 1
(C12 , C22 ) 0 1 0 1 0
(C12 , C23 ) 1 0 0 0 1
4.2 Results
We have implemented the elitist BPSO algorithm to find minimal feature subsets
on high dimensional cancer datasets, i.e., colon, lymphoma, and leukemia. We
set the two acceleration coefficient parameters (c1 and c2) to 2, and the minimum and
maximum velocities were set to -4 and 4, respectively. The inertia weight (w)
is one of the most important parameters in BPSO, which can improve performance
by properly balancing local and global search [11]; it was set between 0.4 and 0.9.
Varied population sizes (10, 20, 30 and 50) were tested, with the maximum number of
runs set to 50.
To check the efficiency of the proposed algorithm, the k-NN classifier is used as a
validation tool. The results are reported in the form of correct classification accuracy.
The experimental results are carried out on three benchmark datasets, as summarized in
Table 3. Note that k is chosen to be an odd number to avoid ties. The correct
classification accuracies are reported to be 93.54%, 95.85% and 94.74% for the three
datasets with varying swarm sizes and k values.
Table 4 depicts the comparative performance studies with a simple GA and NSGA-II [10]
for the same benchmark datasets, i.e., colon, leukemia, and lymphoma data. Using GA,
a 15-gene set produces a classification accuracy of 77.42% on colon data; for lymphoma
data, an 18-gene subset produces 93.76%; and for leukemia data, a 19-gene subset gives
a classification score of 73.53%. The NSGA-II based feature selection method using the
k-NN classifier reported 90.3% on colon data with a 9-gene set, 95.8% for a 2-gene
subset on lymphoma, and 91.2% on a 3-gene subset for leukemia. We obtained a 100%
classification score for all gene sets using one neighbor, and 93.54%, 95.85% and
94.74% for the three datasets using 9, 20, and 14 gene subsets respectively, with
k = 3, 5, 7.
5 Conclusion
In this paper, we proposed an elitist-model BPSO for feature subset selection
in gene expression microarray data. Non-dominated sorting helps to preserve
Pareto-front solutions. Our preprocessing aids faster convergence through the search
space and is successfully employed to eliminate redundant and irrelevant features.
References
1. Lazar, C., et al.: A survey on filter techniques for feature selection in gene expres-
sion microarray analysis. IEEE/ACM Transactions on Computational Biology and
Bioinformatics 9(4), 1106–1119 (2012)
2. Saeys, Y., Inza, I., Larrañaga, P.: A review of feature selection techniques in bioinformatics.
Bioinformatics 23(19), 2507–2517 (2007)
3. ElAlami, M.E.: A filter model for feature subset selection based on genetic algo-
rithm. Knowledge-Based Systems 22(5), 356–362 (2009)
4. Sainin, M.S., Alfred, R.: A genetic based wrapper feature selection approach using
nearest neighbour distance matrix. In: 2011 3rd Conference on Data Mining and
Optimization (DMO), pp. 237–242 (2011)
5. Wahid, C.M.M., Ali, A.B.M.S., Tickle, K.S.: A novel hybrid approach of feature
selection through feature clustering using microarray gene expression data. In: 2011
11th International Conference on Hybrid Intelligent Systems (HIS), pp. 121–126
(2011)
6. Nagi, S., Bhattacharyya, D.: Classification of microarray cancer data using ensem-
ble approach. Network Modeling Analysis in Health Informatics and Bioinformat-
ics, 1–15 (2013)
7. Mladenič, D.: Feature selection for dimensionality reduction. In: Saunders, C., Gro-
belnik, M., Gunn, S., Shawe-Taylor, J. (eds.) SLSFS 2005. LNCS, vol. 3940, pp.
84–102. Springer, Heidelberg (2006)
8. Kennedy, J., Eberhart, R.C.: A discrete binary version of the particle swarm algo-
rithm. In: IEEE International Conference on Systems, Man, and Cybernetics, 1997
Computational Cybernetics and Simulation, vol. 5, pp. 4104–4108 (1997)
9. Deb, K.: Multi-objective optimization. Multi-objective Optimization Using Evolu-
tionary Algorithms, 13–46 (2001)
10. Banerjee, M., Mitra, S., Banka, H.: Evolutionary rough feature selection in gene
expression data. IEEE Transactions on Systems, Man, and Cybernetics, Part C:
Applications and Reviews 37(4), 622–632 (2007)
11. Shi, Y., Eberhart, R.: Empirical study of particle swarm optimization. In: Proc.
IEEE Congress, vol. 3, pp. 1945–1950 (1999)
An Immune System Inspired Algorithm
for Protein Function Prediction
1 Introduction
The main problem in molecular biology is to understand the function of a protein, as
the function of most proteins is unknown. It has been observed that even the
most studied species, Saccharomyces cerevisiae, is reported to have more than 26
percent of its proteins with unknown molecular functions [1]. Huge amounts of data
continue to accumulate due to the application of high throughput technologies in
various genome projects. Protein-Protein Interaction (PPI) data is an important source
of information among these databases. The introduction of high-throughput techniques
has resulted in an amazing number of new proteins being identified. However, the
function of a large number of these proteins still remains unknown.
Several algorithms have been developed to predict protein functions, on the basic
assumption that proteins with similar functions are more likely to interact. Among
them, Deng proposed the Markov random field (MRF) model, which predicts protein
functions based on the annotated proteins and the structure of the PPI network [2],
and Schwikowski proposed the neighbor counting approach [3]. In recent years, more
and more research has turned to predicting protein functions semantically by combining
the inter-relationships of function annotation terms with the topological structure
information in the PPI network. To predict protein functions semantically, various
methods were proposed to calculate functional similarities between annotation terms.
Lord et al. [4] were the first to apply a measure of semantic similarity to GO
annotations. Resnik [5] used the concept of information content to calculate the
semantic similarity between two GO terms.
In this paper, we aim to predict the function of an unannotated protein by using the
topological information of the PPI network and the functions of annotated proteins. The
similarity of the GO terms used to annotate proteins is measured using the information
content of the respective terms as well as of the terms that are common in the paths
from the root to the GO terms. For this task we have employed an immune
system-inspired meta-heuristic algorithm known as the Clonal Selection Algorithm
(CSA). The proposed method uses a hypermutation strategy which provides exploration
capability to each individual clone within the search space.
The rest of this paper is organized as follows: Section 2 gives a brief idea about the
definition and formulation of the problem as well as the scheme for solution
representation. Section 3 provides an overview of the proposed method. Experiments
and results are provided in Section 4. Section 5 concludes the paper.
A PPI network with N proteins is considered. The PPI network can be represented in the
form of a binary data matrix KN×N, where kij = kji = 1 and kij = kji = 0 denote the presence
and absence, respectively, of an interaction between proteins pi and pj. The set of all
functions of each protein p is denoted as F(p); thus, the set of all possible functions
in the network is defined as F = F(p1) ∪ F(p2) ∪ … ∪ F(pN), with the number of all
possible functions in the network |F| = D.
Given the PPI data matrix KN×N, a protein function prediction algorithm tries to find
a set of possible functions F(p) of an unannotated protein p based on the functions
F(p′) of all annotated proteins p′ in the PPI network. Since functions can be assigned
to the unannotated protein p in a number of ways, a fitness function (measuring the
accuracy of the function prediction) must be defined. The problem then turns into an
optimization problem of finding a set of functions F(p) of optimal adequacy compared
to all other feasible sets of functions for the unannotated protein p.
The effectiveness of protein function prediction can be improved by taking the composite
benefit of the topological configuration of the PPI network and the functional
categories of annotated proteins through Gene Ontology (GO) [8], [9]. The protein
functions are annotated using GO terms. GO is basically represented as a directed
acyclic hierarchical structure in which a GO term may have multiple parent/ancestor
GO terms. The probability, prob(t), of each GO term t in the GO tree is the frequency of
occurrence of the term and its children divided by the maximum number of terms in
the GO tree. Thus the probability of each node/GO term increases as we move up
towards the root. The information content of a GO term in the GO tree is based on the
prob(t) value and is given by

$$IC(t) = -\log prob(t) \qquad (1)$$

Equation (1) shows that the lesser the probability of a GO term, the more information
content is associated with it. The similarity between two GO terms will be
high if they share more information. The similarity between two terms, which is
captured by the set of common ancestors, is the ratio of the probability of the common
terms between them to the probabilities of the individual terms. It can be represented
as follows:
$$S(t_1, t_2) = \max_{t \in C(t_1, t_2)} \frac{2\log prob(t)}{\log prob(t_1) + \log prob(t_2)} \qquad (2)$$
where C(t1, t2) denotes the set of common ancestors of terms t1 and t2, and S(t1, t2)
measures the similarity with respect to information content in terms of the common
ancestors of t1 and t2. The value of the above similarity measure ranges between 0 and 1.
With this representation scheme of protein functions, the similarity between a predicted
function f ∈ F(p) of an unannotated protein p and a real function f′ ∈ F of the PPI
network can be computed as follows.
$$sim(f, f') = S(t_1, t_2) \qquad (3)$$
where function f is annotated by t1 and f′ is annotated by t2. Next, the score of the
unannotated protein being annotated by the predicted function f is evaluated by (4).
$$score(p, f) = \sum_{j=1}^{N} \sum_{f' \in F(p_j)} \frac{1}{dist(p, p_j)} \times sim(f, f') \qquad (4)$$
Here dist(p, pj) represents the minimum number of edges between proteins p and pj.
Two important facts are captured in (4). First, it assigns function f to protein p based
on the similarity between f and all other protein functions available in the given
network, which conforms to the fact that proteins with similar functions interact more
frequently to construct the PPI network. Second, the term 1/dist(p, pj) is included
because proteins far away from p contribute less functional information than those
having a direct interaction with p; this is accomplished by assigning less weight to the
proteins far away from p than to its close neighbors. From (4), it is apparent that a
0 1 1 0 1 0 1 0
Fig. 1. Solution encoding scheme in the proposed method
In order to judge the quality of such a solution Xi for function prediction, the
contribution of the entire set of predicted functions (denoted by the set
F(p) = {fj | xi,j = 1 for j = [1, D]}) to annotating protein p is used for the fitness
function evaluation. Symbolically,

$$fit(X_i) = \sum_{\forall f \in F(p)} score(p, f) \qquad (6)$$
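To make the chain from the term similarity (2) through the score (4) to the fitness (6) concrete, a minimal sketch is given below; all container names (prob, ancestors, annotations, term_of, dist) are illustrative data structures assumed by this sketch, not structures defined in the paper.

```python
import math

def term_similarity(t1, t2, prob, ancestors):
    """Similarity (2): information content shared through the best common
    ancestor, normalized by the terms' own information contents."""
    common = ancestors[t1] & ancestors[t2]
    if not common:
        return 0.0
    return max(2 * math.log(prob[t]) / (math.log(prob[t1]) + math.log(prob[t2]))
               for t in common)

def score(p, f, annotations, term_of, dist, prob, ancestors):
    """Score (4): every annotated protein contributes the GO-term similarity
    of its functions to f, weighted by 1/dist(p, p_j)."""
    total = 0.0
    for pj, funcs in annotations.items():
        if pj == p:
            continue
        for f_prime in funcs:
            total += term_similarity(term_of[f], term_of[f_prime],
                                     prob, ancestors) / dist[(p, pj)]
    return total

def fitness(x, p, functions, annotations, term_of, dist, prob, ancestors):
    """Fitness (6): summed score over the functions switched on in the binary
    solution vector x."""
    return sum(score(p, fj, annotations, term_of, dist, prob, ancestors)
               for fj, bit in zip(functions, x) if bit)
```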
D. Hypermutation
Each clone of Xk(t), denoted as Xkl(t) for j = [1, D], k = [1, n] and l = [1, ck], undergoes
the static hypermutation process using (9):

$$x^l_{k,j}(t+1) = x^l_{k,j}(t) + \alpha \times x^l_{k,j}(t) \times \bigl(x^{max}_j - x^{min}_j\bigr) \times G(0, \sigma) \qquad (9)$$

Here α is a small constant and G(0, σ) is a random Gaussian variable with zero mean
and standard deviation σ. Usually, σ is taken as 1 [7]. The fitness fit(Xkl(t)) is
evaluated for l = [1, ck].
E. Clonal Selection
Let the set of matured clones (after hypermutation) corresponding to the k-th antibody,
including the antibody itself, be denoted as Sk = {Xk(t), Xk1(t+1), Xk2(t+1), ..., Xkck(t+1)}
for k = [1, n]. The best antibody in Sk, i.e., the one with the highest fitness, is allowed
to pass to the next generation. Symbolically,

$$X_k(t+1) \leftarrow \arg\max_{\forall X \in S_k} fit(X) \qquad (10)$$
F. Replacement
The NP−n antibodies not selected for the cloning operation are randomly re-initialized
as in (7).
After each evolution, we repeat from step B until one of the following conditions
for convergence is satisfied: reaching the maximum number of iterations, satisfying
the error limits, or both, whichever occurs earlier.
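Putting the steps above together, the overall clonal-selection loop can be sketched as follows; the callables `fit`, `hypermutate` and `reinit`, and the list `clone_counts`, are placeholders for the operations defined above.

```python
def clonal_selection(init_pop, fit, clone_counts, hypermutate, reinit,
                     n_select, max_iter=100):
    """Select the n best antibodies, clone each, hypermutate the clones,
    keep the best of each clone family (clonal selection, eq. (10)), and
    randomly re-initialize the remaining antibodies (replacement)."""
    pop = list(init_pop)
    for _ in range(max_iter):
        pop.sort(key=fit, reverse=True)
        selected, rest = pop[:n_select], pop[n_select:]
        next_pop = []
        for k, antibody in enumerate(selected):
            family = [antibody] + [hypermutate(antibody)
                                   for _ in range(clone_counts[k])]
            next_pop.append(max(family, key=fit))
        next_pop.extend(reinit() for _ in rest)
        pop = next_pop
    return max(pop, key=fit)
```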
The GO terms [8] and the GO annotation dataset [9] used in the experiments were
downloaded from the Saccharomyces Genome Database (SGD). We filtered out all regulatory
relationships and maintained only the relationships resulting in the 15 main functional
categories for Saccharomyces cerevisiae as given in [10]. Protein-protein interaction data
for Saccharomyces cerevisiae were obtained from the BioGRID [11] database
(http://thebiogrid.org/). To reduce the effect of noise, duplicated interactions and
self-interactions were removed. The final dataset consists of 69,331 interacting protein
pairs involving 5386 annotated proteins. Let {fr1, fr2, …, frn} be the set of n real
functions of protein p and {fp1, fp2, …, fpm} denote the set of m functions predicted by
the protein function assignment scheme. It is obvious that 1 ≤ m, n ≤ D. The three
performance metrics used to evaluate the effectiveness of our proposed method are:
$$\text{Precision} = \frac{\sum_{j=1}^{m} \max_{i=1}^{n} sim(f_{r_i}, f_{p_j})}{m} \qquad (11)$$

$$\text{Recall} = \frac{\sum_{i=1}^{n} \max_{j=1}^{m} sim(f_{r_i}, f_{p_j})}{n} \qquad (12)$$

$$F\text{-value} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (13)$$
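A direct transcription of (11)-(13) into code, with `sim` standing for the GO-term similarity of (2)-(3), might look like this:

```python
def precision_recall_f(real, predicted, sim):
    """Precision averages, over the m predicted functions, the best similarity
    to any real function; recall averages, over the n real functions, the best
    similarity to any predicted function; F-value is their harmonic mean."""
    precision = sum(max(sim(fr, fp) for fr in real) for fp in predicted) / len(predicted)
    recall = sum(max(sim(fr, fp) for fp in predicted) for fr in real) / len(real)
    f_value = 2 * precision * recall / (precision + recall)
    return precision, recall, f_value
```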
An algorithm having higher values for the above metrics supersedes the others. The
evaluation of these metrics was conducted on test datasets by varying the number of
proteins in the network, N = [10, 200], for a particular unannotated protein. We have
used only the biological process category for our experiment, as assigning a biological
process to an unannotated protein involves biological experiments which are very costly.
In our study, we have compared the relative performance of the proposed scheme with the
Firefly Algorithm (FA) [12], Particle Swarm Optimization (PSO) [13], and also with
existing methods like the Indirect Neighbor Method (INM) [14] and Neighbor Counting
(NC) [3] in Table 1 and Figs. 2-4 for predicting the functions of protein YBL068W.
We report results for only the above mentioned protein in order to save space; the
omitted results for other proteins follow a similar trend. The proposed approach was
applied to annotated proteins of the network, as the real functions of these proteins
are known to us. For CSA, cmin and cmax are set to 2 and 10 respectively. For all the
evolutionary/swarm algorithm-based prediction schemes, the population size is kept at 50,
the maximum number of function evaluations (FEs) is set to 300,000, and the best
parametric set-up for each existing method is used. It is evident from Table 1 and
Figs. 2-4 that our algorithm outperforms the others with respect to the aforementioned
performance metrics irrespective of the number of proteins in the network.
Fig. 2. Precision vs. number of proteins (20-100) for NC, INM, PSO, FA and CSA
Fig. 3. Recall vs. number of proteins (20-100) for NC, INM, PSO, FA and CSA
Fig. 4. F-value vs. number of proteins (20-100) for NC, INM, PSO, FA and CSA
5 Conclusion
References
1. Breitkreutz, B.J., Stark, C., Reguly, T., Boucher, L., Breitkreutz, A., Livstone, M.,
Oughtred, R., Lackner, D.H., Bähler, J., Wood, V., Dolinski, K., Tyers, M.: The BioGRID
Interaction Database: 2008 Update. Nucleic Acids Research 36, D637– D640(2008)
2. Deng, M.H., Zhang, K., Mehta, S., Chen, T., Sun, F.Z.: Prediction of protein function us-
ing protein-protein interaction data. Journal of Computational Biology 10(6), 947–960
(2003)
3. Schwikowski, B., Uetz, P., Fields, S.: A network of protein-protein interactions in yeast.
Nature Biotechnology 18, 1257–1261 (2000)
4. Lord, P.W., Stevens, R.D., Brass, A., Goble, C.A.: Investigating semantic similarity meas-
ures across the Gene Ontology: the relationship between sequence and annotation. Bioin-
formatics 19(10), 1275–1283 (2003)
5. Resnik, P.: Using information content to evaluate semantic similarity in a taxonomy. In:
Proceedings of International Joint Conference for Artificial Intelligence, pp. 448–453
(1995)
6. de Castro, L.N., Von Zuben, F.J.: The clonal selection algorithm with engineering
applications. In: Proceedings of GECCO, pp. 36–39 (2000)
7. Felipe, C., Guimarães, F.G., Igarashi, H., Ramírez, J.A.: A clonal selection algorithm for
optimization in electromagnetics. IEEE Transactions on Magnetics 41(5), 1736–1739
(2005)
8. Ashburner, M., Ball, C., Blake, J., Botstein, D., Butler, H., Cherry, J., Davis, A., Dolinski,
K., Dwight, S., Eppig, J.: Gene ontology: tool for the unification of biology. Nature Genet-
ics 25, 25–29 (2000)
9. Dwight, S., Harris, M., Dolinski, K., Ball, C., Binkley, G., Christie, K., Fisk, D.,
Issel-Tarver, L., Schroeder, M., Sherlock, G.: Saccharomyces Genome Database (SGD)
provides secondary gene annotation using the Gene Ontology (GO). Nucleic Acids
Research 30, 69–72 (2012)
10. Chowdhury, A., Konar, A., Rakshit, P., Janarthanan, R.: Protein Function Prediction Using
Adaptive Swarm Based Algorithm. SEMCCO 2, 55–68 (2013)
11. Stark, C., Breitkreutz, B.J., Reguly, T., Boucher, L., Breitkreutz, A., Tyers, M.: BioGRID:
a general repository for interaction datasets. Nucleic Acids Res. 34, D535–D539 (2006)
12. Yang, X.-S.: Firefly algorithms for multimodal optimization. In: Watanabe, O., Zeugmann,
T. (eds.) SAGA 2009. LNCS, vol. 5792, pp. 169–178. Springer, Heidelberg (2009)
13. Kennedy, J.: Particle swarm optimization. In: Encyclopedia of Machine Learning, pp.
760–766 (2010)
14. Chua, H.N., Sung, W.-K., Wong, L.: Exploiting indirect neighbours and topological weight
to predict protein function from protein-protein interactions. Bioinformatics 22(13),
1623–1630 (2006)
Fast Approximate Eyelid Detection
for Periocular Localization
1 Introduction
Biometrics, as the name suggests, refers to certain characteristics of the human
body. These characteristics are unique to every individual. The primary use
of these traits or characteristics is in the identification and authentication of
individuals [3]. In the field of computer science, biometrics is used as a means
to maintain the identification process and allow access control. Biometrics is
also an important tool for identifying individuals that are under surveillance.
The biometric identifiers are often categorized into physiological and behavioral
features. Physiological features are the features that are involved with the shape
and size of the human body. Some examples include the face, fingerprint, DNA, iris etc.
The behavioral features, on the other hand, are related to the pattern of behavior of a
person, and include the person's speech, voice texture, gait etc. A good biometric trait
is characterized by the use of features that are highly unique, stable, easy to capture
and that prevent circumvention.
Out of all the biometric features available, the iris is considered to be the most
reliable. The reason is that reliability depends particularly on the ability to easily
detect unique features that can be extracted and that remain invariant over a period
of time. However, capturing an image of the iris takes place in a constrained
environment. The various constraints include looking into the scanner of the iris
camera, capturing images from a distance, non-cooperative subjects etc. Also, one of
the major problems is that the image of the iris is captured in the near infrared (NIR)
region. This limits the merits of the iris recognition method in the sense that the NIR
wavelength does not allow an image to be captured in an outdoor environment. This
leads to the possibility that a very good quality image of the iris might not be captured,
due to the occurrence of occlusion, i.e. some parts of the required features might be
hidden by the presence of spectacles or caps, or by other noise factors. In such
situations it might still be possible to capture and recognize an individual through an
image of the region around the eye.
2 Related Works
To counter the challenges posed for unconstrained iris detection for identifica-
tion of an individual, the region around the iris (periocular region) has been
considered as an alternative trait. Features from regions outside the iris like
blood vessels, eyelids, eyelashes, skin textures thus contribute as an integral
part of periocular identification process. Various researchers have proposed sev-
eral methods for the purpose of separating the eyelids from the rest of the image
so that it can be used as a biometric feature. An overview of the related works
by other researchers have been tabulated in Table 1. Masek [8] has proposed a
method for the eyelid segmentation in which the iris and the eyelids have been
separated through Hough transformation. In [14], Xu et al. have segmented out
the upper and lower eyelid candidate region into 8 sub-blocks. Out of these 8
sub-blocks, the proper eyelid-eyelash model has been chosen based on maximum
deviation from each block. This approach however bears a disadvantage. If the
occluded region is wider than the region defined as eyelid-eyelash region in each
block, then this approach produces wrong position of the features leading to
mislocalization. In contrast to [8], Mahlouji and Noruzi [7] have used Hough
transform to localize all the boundaries between the upper and lower eyelid re-
gions. This method in [7] has a relatively lower mislocalization rate than [8],
while the processing time of the former is also of the same order as the latter.
He et al. in their work [5], have proposed a novel method for the eyelash, eyelid,
and shadow localization. In [12], Thalji and Alsmadi have proposed an eyelid
segmentation method wherein the pupil is detected first, and the limbic bound-
ary is subsequently localized. After the localization of the limbic boundary, this
algorithm successfully extracts the iris region excluding the eyelid and eyelashes
and thus avoids occlusion. Adam et al. in [1] have presented two cumulative
distribution functions to estimate the performance of the upper and lower eyelid
segmentations, which yields < 10% mislocalization for 90% of the upper eyelids
and 98% of the lower eyelids. Cui et al. [4] have proposed two algorithms for the detection of upper and lower eyelids and tested them on the CASIA dataset; the technique successfully detects the edges of the upper and lower eyelids after the eyelashes and eyelids are segmented. Ling et al. proposed an algorithm
capable of segmenting the iris from images of very low quality [6] through eyelid
detection. Being unsupervised, the algorithm described in [6] does not require a training phase. Radman et al. have used the live-wire technique to lo-
calize the eyelid boundaries based on the intersection points between the eyelids
and the outer iris boundary [11].
Table 1. Overview of related works on eyelid detection

Year | Author(s) | Summary
2012 | Mahmoud Mahlouji and Ali Noruzi [7] | The proposed method not only has relatively higher precision but also compares well with popular available methods in terms of processing time. (Database used: CASIA)
2008 | Zhaofeng He et al. [5] | An accurate and fast method for eyelash, eyelid and shadow localization is implemented, which provides a novel prediction model for determining the proper threshold for eyelash and shadow detection.
2013 | Zyad Thalji and Mutasem Alsmadi [12] | The proposed iris image segmentation algorithm consists of separate modules for pupil detection, limbic boundary localization and eyelid/eyelash detection. The algorithm successfully extracts the iris print from the images while avoiding the eyelashes and eyelids, which constitute noise.
2008 | Mathieu Adam et al. [1] | Two cumulative distribution functions are presented to estimate the performance of upper and lower eyelid segmentation.
2004 | Jiali Cui et al. [4] | Two algorithms are proposed, for upper eyelid localization and lower eyelid detection. The edges of the upper and lower eyelids are detected after the eyelids and eyelashes are segmented. (Database used: CASIA Iris database, version 1.0)
2010 | Lee Luang Ling and Daniel Felix de Brito [6] | The algorithm is capable of segmenting iris images of very low quality. The major stages of this iris segmentation approach are pupillary detection, limbic boundary detection and eyelid/eyelash detection. An interesting feature of the algorithm is that no training is required.
2013 | Abduljalil Radman et al. [11] | The live-wire technique is utilized to localize the eyelid boundaries based on the intersection points between the eyelids and the outer iris boundary.
separately. In the first step, the eye image is smoothed to suppress the low-gradient edges. A Wiener low-pass filter applied over the 3 × 3 neighborhood of every pixel is employed to smooth the image; the window size is chosen empirically, as it suppresses negligible edges and retains the strong edges required by the proposed system. The smoothed image is then subjected to an edge detection technique to find the edge map of the image. Sobel, Prewitt, Roberts, Laplacian of Gaussian, zero-cross, Canny or any other suitable edge detection technique that efficiently detects edges can be used; in our implementation we have used the Canny edge detector with the standard deviation of the underlying Gaussian filter set to 1. Subsequently, for every detected edge a horizontality factor is calculated. The horizontality factor (denoted by hf) indicates how horizontal an edge is through a fractional value lying within [0, 1]; hf for the i-th edge (denoted by hf_i) is calculated through Equation 1.

hf_i = (maximum x coordinate of the i-th edge − minimum x coordinate of the i-th edge) / (number of unique pixels belonging to the i-th edge)    (1)
The horizontality factor of edges that are mostly horizontal will be high (close to 1), whereas it will be low (close to 0) for edges that are mostly vertical. A threshold value is then calculated from the horizontality factors by Equation 2.
thresh = min(hf ) + 0.1 × (max(hf ) − min(hf )) (2)
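As a concrete illustration of this step (not the authors' code), the smoothing, edge detection and horizontality thresholding of Equations 1 and 2 can be sketched in Python as follows; the SciPy/scikit-image calls, the function name and the fraction 0.1 exposed as a parameter are assumptions.

# Sketch of eyelid-candidate edge selection: 3x3 Wiener smoothing, Canny edge
# detection (sigma = 1), horizontality factor of Eq. (1) and threshold of Eq. (2).
import numpy as np
from scipy.signal import wiener
from skimage import io, color, feature, measure

def horizontal_edges(image_path, frac=0.1):
    gray = color.rgb2gray(io.imread(image_path))      # coloured input -> grayscale
    smoothed = wiener(gray, mysize=3)                 # 3 x 3 Wiener low-pass filter
    edges = feature.canny(smoothed, sigma=1)          # edge map of the smoothed image

    labels = measure.label(edges, connectivity=2)     # group edge pixels into edges
    regions = measure.regionprops(labels)
    # Eq. (1): (max x - min x) / number of unique pixels, per edge
    hf = np.array([(r.coords[:, 1].max() - r.coords[:, 1].min()) / len(r.coords)
                   for r in regions])
    thresh = hf.min() + frac * (hf.max() - hf.min())  # Eq. (2)
    keep = [r.label for r, h in zip(regions, hf) if h >= thresh]
    return np.isin(labels, keep)                      # mask of mostly-horizontal edges

The retained edges correspond to stage (f) of Fig. 2 and serve as eyelid candidates for the subsequent steps.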
4 Experimental Results
Fig. 2. Proposed periocular localization algorithm : (a) Coloured input image; (b)
Noisy image of the iris; (c) Grayscale image of the iris; (d) Smoothed image of the iris;
(e) Detection of edges of the image; (f) Horizontal edges retained; (g) Upper and lower
eyelids detected through checking hf ; (h) 2-means clustering performed on detected
eyelid points; (i) Periocular region localized
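The final stages of the pipeline in Fig. 2, steps (g)-(i), can be sketched as below. The use of scikit-learn's KMeans for the 2-means clustering and a padded bounding box taken as the periocular region are assumptions drawn from the caption, not the authors' exact procedure; the margin parameter is illustrative.

# Sketch of steps (g)-(i): 2-means clustering of detected eyelid points and an
# approximate periocular bounding box around them.
import numpy as np
from sklearn.cluster import KMeans

def localize_periocular(eyelid_mask, margin=20):
    pts = np.column_stack(np.nonzero(eyelid_mask))         # (row, col) eyelid edge points
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(pts)
    top = labels[pts[:, 0].argmin()]                       # cluster holding the topmost point
    upper, lower = pts[labels == top], pts[labels != top]  # upper / lower eyelid points
    r0, c0 = np.maximum(pts.min(axis=0) - margin, 0)       # expand the box slightly to
    r1, c1 = pts.max(axis=0) + margin                      # cover the periocular skin
    return (r0, c0, r1, c1), upper, lower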
5 Conclusion
The iris is considered one of the most reliable biometric traits, which makes it one of the most widely used in the present biometric scenario. However, when unconstrained images are acquired, iris recognition fails to deliver the desired accuracy. Hence, to achieve recognition from unconstrained images, the periocular region is considered as an alternative. Upon implementing the proposed eyelid detection algorithm on the publicly available visual spectrum (VS) image databases UBIRISv1 and UBIRISv2, it is observed that both the upper and lower eyelids show partial localization for most of the images. Even with this partial success in detecting the eyelids, we are able to approximately localize the periocular region.
References
1. Adam, M., Rossant, F., Amiel, F., Mikovikova, B., Ea, T.: Eyelid Localization for
Iris Identification. Radioengineering 17(4), 82–85 (2008)
2. Bakshi, S., Sa, P.K., Majhi, B.: Optimised periocular template selection for human
recognition. BioMed Research International 2013, 1–14 (2013)
3. Bakshi, S., Tuglular, T.: Security through human-factors and biometrics. In: 6th
International Conference on Security of Information and Networks, pp. 463–463
(2013)
4. Cui, J., Wang, Y., Tan, T., Ma, L., Sun, Z.: A Fast and Robust Iris Localization
Method Based on Texture Segmentation. In: Biometric Authentication and Testing,
National Laboratory of Pattern Recognition, Chinese Academy of Sciences (2004)
5. He, Z., Tan, T., Sun, Z., Qiu, X.: Robust eyelid eyelash and shadow localization
for iris recognition. In: 15th IEEE International Conference on Image Processing,
pp. 265–268 (2008)
6. Ling, L.L., de Brito, D.F.: Fast and efficient iris image segmentation. Journal of
Medical and Biological Engineering 30(6), 381–392 (2010)
7. Mahlouji, M., Noruzi, A.: Human Iris Segmentation for Iris Recognition in Un-
constrained Environments. IJCSI International Journal of Computer Science Is-
sues 9(1), 3, 149–155 (2012)
8. Masek, L.: Recognition of Human Iris Patterns for Biometric Identification. In:
Bachelor of Engineering Thesis at The University of Western Australia (2003)
9. Proença, H., Alexandre, L.A.: UBIRIS: A noisy iris image database. In: Roli, F.,
Vitulano, S. (eds.) ICIAP 2005. LNCS, vol. 3617, pp. 970–977. Springer, Heidelberg
(2005)
10. Proença, H., Filipe, S., Santos, R., Oliveira, J., Alexandre, L.: The UBIRISv2: A
database of visible wavelength iris images captured on-the-move and at-a-distance.
IEEE Transactions on Pattern Analysis and Machine Intelligence 32(8), 1529–1535
(2010)
11. Radman, A., Zainal, N., Ismail, M.: Efficient Iris Segmentation based on eyelid
detection. Journal of Engineering Science and Technology 8(4), 399–405 (2013)
12. Thalji, Z., Alsmadi, M.: Iris Recognition Using Robust Algorithm for Eyelid, Eye-
lash and Shadow Avoiding. World Applied Sciences Journal 25(6), 858–865 (2013)
13. Valentina, C., Hartono, R.N., Tjahja, T.V., Nugroho, A.S.: Iris Localization using
Circular Hough Transform and Horizontal Projection Folding. In: Proceedings of
International Conference on Information Technology and Applied Mathematics
(2012)
14. Xu, G., Zhang, Z., Ma, Y.: Improving the Performance of Iris Recognition System
Using Eyelids and Eyelashes Detection and Iris Image Enhancement. In: 5th IEEE
conference on Cognitive Informatics, pp. 871–876 (2006)
Prediction of an Optimum Parametric Combination
for Minimum Thrust Force in Bone Drilling:
A Simulated Annealing Approach
R.K. Pandey and S.S. Panda
1 Introduction
thrust force generated during bone drilling will result in better fixation of the broken
bones and their quick recovery postoperatively.
Previously, many researchers have investigated the bone drilling process to study the effect of the various drilling parameters on the thrust force produced [1-4]. The earliest research in this area was reported in the late 1950s [3]. Spindle speed and feed rate were the main drilling parameters analyzed in most of the studies [3-4, 10-14]. There is a consensus among researchers that the thrust force decreases with an increase in spindle speed [3-4, 10-14]; however, drilling of bone at higher spindle speeds has been reported to cause increased trauma [3, 14]. Analyses of the effect of the feed rate showed that an increase in feed rate increases the thrust force induced in bone drilling [1-2, 4, 13]. Despite the above-mentioned studies, there is a lack of a clear suggestion on the optimal settings of the feed rate and spindle speed for minimum thrust force generation during bone drilling.
In this work, a statistical model of the bone drilling process that predicts the thrust force as a function of feed rate (mm/min) and spindle speed (rpm) is developed using response surface methodology (RSM). Next, the model is used as the fitness function in the SA algorithm to determine the optimal setting of feed rate and spindle speed for minimum thrust force during bone drilling. The adopted approach is then validated through a confirmation experiment.
RSM is a collection of mathematical and statistical tools that provide an easy, quick and effective means of modeling processes in which several variables influence the response of interest [15-16]. In most real problems the relationship between the response and the independent variables is unknown. In RSM, the relationship between the response and the independent process variables is represented as (1):

Y = f(A, B, C) + ε    (1)

where Y is the desired response, f is the response function and ε represents the error observed in the response. A second-order model is generally employed if the response function is nonlinear or unknown, as shown in (2) [15-16]:
Y = β0 + Σ_{i=1}^{k} βi xi + Σ_{i=1}^{k} βii xi² + Σ_{i<j} βij xi xj + ε    (2)

where β0 is the coefficient of the constant term, and βi, βii and βij are the coefficients of the linear, square and interaction terms, respectively.
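As an illustration only (the paper itself relies on the RSM design/analysis software cited in [24]), the second-order model of Eq. (2) for the two factors considered here, A = feed rate and B = spindle speed, can be fitted by ordinary least squares; the function and argument names below are assumed, and the measured thrust-force data of Table 2 must be supplied.

# Least-squares fit of Y = b0 + bA*A + bB*B + bAB*A*B + bAA*A^2 + bBB*B^2 + error
import numpy as np

def fit_quadratic_model(A, B, F):
    """A, B: numpy arrays of factor settings for the runs; F: measured thrust force (N)."""
    X = np.column_stack([np.ones_like(A, dtype=float), A, B, A * B, A**2, B**2])
    beta, *_ = np.linalg.lstsq(X, F, rcond=None)          # [b0, bA, bB, bAB, bAA, bBB]
    predict = lambda a, b: (beta[0] + beta[1]*a + beta[2]*b
                            + beta[3]*a*b + beta[4]*a**2 + beta[5]*b**2)
    return beta, predict

The returned predict function can then serve as the fitness function for the SA step described next.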
The simulated annealing algorithm mimics the process of annealing, which involves heating a metal to a temperature beyond its melting point followed by gradual cooling. In the molten state the particles are in random motion, and when the metal is cooled gradually the particles rearrange themselves to attain a minimum-energy state. As the cooling is done gradually, lower and lower energy states are obtained until the lowest energy state is reached [17]. In the process of annealing, the probability Pr(E) of being at energy state E is given by the Boltzmann distribution as (3):

Pr(E) = (1 / Z(T)) · exp(−E / KT)    (3)

where Z(T) is a normalization factor and K is the Boltzmann constant.
This criterion of acceptance of the new state is known as the Metropolis acceptance criterion. Kirkpatrick et al. [19] applied a sequence of Metropolis simulations, evaluated at a sequence of decreasing temperatures, to the minimization of an objective function; this procedure is simulated annealing, and the objective function corresponds to the energy function used in the Metropolis acceptance criterion. Recently, SA has been used successfully for the optimization of various engineering problems [20-21]. The algorithm uses a number of points (N) to test thermal equilibrium at a particular temperature before moving to a reduced temperature. It is stopped when the desired minimum change in the value of the objective function is obtained or the temperature becomes sufficiently small. The initial starting temperature T and the number of iterations N to be performed at each temperature are the two user-defined parameters that govern the effective working of the SA algorithm.
The step-by-step procedure of the SA algorithm is discussed below [17]; a short code sketch following these steps is given after the list.
1. Initialize the starting temperature T and the termination temperature Tmin. Randomly generate an initial starting point X. Also define the number of iterations N to be performed at each temperature. Set the iteration counter to i = 0.
2. Calculate the value of the objective function E = f(X).
3. Update Ebest = E and Xbest = X.
4. Generate a random neighborhood point X* and calculate the objective function E* = f(X*).
5. Evaluate the change in energy ΔE = E* − E.
6. If the change in energy ΔE < 0, update the point X = X*; if E < Ebest, set Ebest = E and Xbest = X. Update the iteration counter as i = i + 1 and go to the next step. Else go to step 3 and repeat the process.
7. If i > N, go to the next step.
8. Reduce the temperature by a factor α and update T = αT.
9. If T ≤ Tmin, terminate the process and print Xbest and Ebest; else, move to step 3.
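A minimal Python sketch of this procedure is given below; it is not the authors' implementation. It uses the Metropolis acceptance rule described above (improving points are always accepted, worse points with probability exp(−ΔE/KT)). The initial temperature, termination temperature, step size and use of a Gaussian neighbourhood are assumptions, while K = 1, α = 0.95 and N = 50 match the SA settings reported later in the paper.

# Simulated annealing minimization of a fitness function f over box bounds.
import math
import random

def simulated_annealing(f, bounds, T=100.0, T_min=1e-3, alpha=0.95, N=50, K=1.0, step=0.1):
    x = [random.uniform(lo, hi) for lo, hi in bounds]   # step 1: random starting point
    E = f(x)                                            # step 2: objective value at X
    x_best, E_best = list(x), E                         # step 3: best point so far
    while T > T_min:
        for _ in range(N):                              # N trials at each temperature
            # step 4: random neighbourhood point X*, kept inside the bounds
            x_new = [min(max(xi + random.gauss(0.0, step * (hi - lo)), lo), hi)
                     for xi, (lo, hi) in zip(x, bounds)]
            E_new = f(x_new)
            dE = E_new - E                              # step 5: change in energy
            # step 6: Metropolis acceptance criterion
            if dE < 0 or random.random() < math.exp(-dE / (K * T)):
                x, E = x_new, E_new
                if E < E_best:
                    x_best, E_best = list(x), E
        T *= alpha                                      # step 8: gradual cooling, T = alpha*T
    return x_best, E_best                               # step 9: best point and energy

For the present problem, f would be the fitted response surface of the thrust force and bounds would be [(30, 150), (500, 2500)] for the feed rate and spindle speed, as stated in the formulation reported later in the paper.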
2 Experimental Procedure
The bone drilling parameters considered are feed rate (mm/min) and spindle speed (rpm) (shown in Table 1), and the response taken is the thrust force (N). The central composite design (CCD) of RSM was employed to design the plan of experiments for studying the relationship between the response and the bone drilling parameters. A full factorial design for the factors at two levels, i.e. high (+1) and low (−1), corresponding to a face-centered design with thirteen runs (four factorial points, four axial points and five central points), was used as shown in Table 2. The bone drilling parameters and their levels are chosen based on the wide range of experiments reported in the literature [1-4, 10-14].
Table 1. Factors and levels considered for bone drilling.

Control factor | Low level (−1) | High level (+1)
A: Feed rate (mm/min) | 30 | 150
B: Spindle speed (rpm) | 500 | 2500
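For illustration, the thirteen-run face-centred CCD described above (four factorial points, four axial points and five centre points) can be generated as follows; this sketch is not the design software used by the authors, and the helper name is illustrative.

# Face-centred central composite design for two factors, in coded and actual units.
import numpy as np

def face_centered_ccd(low, high, n_center=5):
    factorial = [(-1, -1), (1, -1), (-1, 1), (1, 1)]           # 2^2 factorial points
    axial = [(-1, 0), (1, 0), (0, -1), (0, 1)]                 # face-centred axial points
    center = [(0, 0)] * n_center                               # centre-point replicates
    coded = np.array(factorial + axial + center, dtype=float)  # 13 x 2 coded design
    mid = (np.asarray(high, float) + np.asarray(low, float)) / 2
    half = (np.asarray(high, float) - np.asarray(low, float)) / 2
    return coded, coded * half + mid                           # coded and actual settings

coded, actual = face_centered_ccd(low=[30, 500], high=[150, 2500])   # levels of Table 1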
The work material used for conducting the bone drilling experiments was bovine femur, as human bones are not easily available and bovine femur closely resembles human bone, allowing the results to be extrapolated to real surgical situations [19-20]. The bovine femur was obtained from a local slaughterhouse immediately after slaughter, and the experiments were done within a few hours to maintain minimum loss of the thermo-physical properties of the fresh bone [22-23]. No animal was sacrificed specifically for the purpose of this research.
The experiments were carried out on a 3-axis MTAB flex mill using a 4.5 mm HSS (high-speed steel) drill bit. The drilling depth was 6 mm. The drilling thrust force signals were measured using a Kistler 9257B piezoelectric dynamometer. The signals were acquired using a 5070 multichannel charge amplifier and Dynoware software. The thrust force obtained for each experimental run is listed in the last column of Table 2. The experimental set-up is shown in Fig. 1.
An analysis of variance (ANOVA) was carried out to find the significance of the developed model and of the individual model coefficients at a 95% confidence interval, as shown in Table 4, where DF = degrees of freedom, SS = sum of squares and MS = mean square. From the ANOVA table it can be seen that the model is significant, as its p-value is less than 0.0500. In this case A, B, AB and B² are significant model terms; values greater than 0.1000 indicate that the model terms are not significant [24]. The comparison of the predicted thrust force values with the actual values is shown in Fig. 2.
Fig. 2. Comparison between the predicted and the actual thrust force
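For readers who wish to reproduce a comparable analysis outside the Design-Expert software cited in [24], an ANOVA of the quadratic model can be obtained with statsmodels as sketched below; the DataFrame column names ('feed', 'speed', 'force') are hypothetical, and the thirteen measured runs of Table 2 must be supplied.

# ANOVA of the second-order model: DF, SS, MS (= SS/DF) and p-values per term.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

def anova_table(runs: pd.DataFrame):
    model = smf.ols('force ~ feed + speed + feed:speed + I(feed**2) + I(speed**2)',
                    data=runs).fit()
    table = sm.stats.anova_lm(model, typ=2)        # type-II ANOVA of the fitted model
    table['mean_sq'] = table['sum_sq'] / table['df']
    return model, table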
The optimal setting of the spindle speed and feed rate for minimum thrust force during bone drilling is determined using SA, with the developed response surface model of the thrust force serving as the fitness function. The SA parameters used include:
Boltzmann constant K =1
Cooling rate α = 0.95
Number of cycles at each temperature N = 50
The problem formulation is subjected to the boundaries (limitations) of the drilling
parameters and is stated as follows:
30 ≤ Feed rate ≤ 150
500 ≤ Spindle speed ≤ 2500
The result obtained by SA is listed in Table 5, and Fig. 3 shows the variation of the function value with the number of iterations. From Table 5 it can be observed that the minimum thrust force is 3.6364 N, obtained at a feed rate of 30 mm/min and a spindle speed of 1820.845 rpm.
Table 5. Result obtained by SA.

Parameter | Value
Minimum fitness function: thrust force (N) | 3.6364
Optimal feed rate (mm/min) | 30
Optimal spindle speed (rpm) | 1820.845
To validate the result obtained from the above analysis, confirmation experiments were carried out; their results are shown in Table 6. Four experiments were performed within the range of the parameters studied. Three settings of feed rate and spindle speed were selected randomly, while the fourth was the optimal setting predicted by SA. From Table 6 it is clear that the predicted values and those obtained from the experiments are very close; hence, the RSM model can effectively predict the thrust force, and SA can be very useful for minimizing the thrust force during bone drilling.
5 Conclusions
• The developed response surface model can effectively predict the thrust force in
bone drilling within the range of the parameters studied.
• SA results showed that the best combination of bone drilling parameters for minimum thrust force is a feed rate of 30 mm/min and a spindle speed of about 1820 rpm.
• The results of the confirmation experiments validated that the combination of RSM and SA is suitable for optimizing the bone drilling process.
• The use of the above approach can greatly assist orthopaedic surgeons in deciding the best levels of drilling parameters for bone drilling with minimum mechanical damage.
References
1. Pandey, R.K., Panda, S.S.: Drilling of bone: A comprehensive review. Journal of Clinical
Orthopedics and Trauma 4, 15–30 (2013)
2. Lee, J., Gozen, A.B., Ozdoganlar, O.B.: Modeling and experimentation of bone drilling
forces. Journal of Biomechanics 45, 1076–1083 (2012)
3. Thompson, H.C.: Effect of drilling into bone. Journal of Oral Surgery 16, 22–30 (1958)
4. Wiggins, K.L., Malkin, S.: Drilling of bone. Journal of Biomechanics 9, 553–559 (1976)
5. Brett, P.N., Baker, D.A., Taylor, R., Griffiths, M.V.: Controlling the penetration of flexible
bone tissue using the stapedotomy micro drill. Proceedings of the Institution of Mechanical
Engineers, Part I: Journal of Systems and Control Engineering 218, 343–351 (2004)
6. Kendoff, D., Citak, M., Gardner, M.J., Stubig, T., Krettek, C., Hufner, T.: Improved accu-
racy of navigated drilling using a drill alignment device. Journal of Orthopaedic Re-
search 25, 951–957 (2007)
7. Ong, F.R., Bouazza-Marouf, K.: The detection of drill-bit break-through for the enhance-
ment of safety in mechatronic assisted orthopaedic drilling. Mechatronics 9, 565–588
(1999)
8. Price, M., Molloy, S., Solan, M., Sutton, A., Ricketts, D.M.: The rate of instrument brea-
kage during orthopaedic procedures. International Orthopedics 26, 185–187 (2002)
9. Bassi, J.L., Pankaj, M., Navdeep, S.: A technique for removal of broken cannulated drill-
bit: Bassi’s method. Journal of Orthopaedic Trauma 22, 56–58 (2008)
10. Augustin, G., Davila, S., Mihoci, K., Udiljak, T., Vedrina, D.S., Antabak, A.: Thermal os-
teonecrosis and bone drilling parameters revisited. Archives of Orthopaedic and Trauma
Surgery 128, 71–77 (2008)
11. Abouzgia, M.B., James, D.F.: Temperature rise during drilling through bone. International
Journal of Oral and Maxillofacial Implants 12, 342–353 (1997)
12. Hobkirk, J.A., Rusiniak, K.: Investigation of variable factors in drilling bone. Journal of
Oral and Maxillofacial Surgery 35, 968–973 (1977)
13. Jacobs, C.H., Berry, J.T., Pope, M.H., Hoaglund, F.T.: A study of the bone machining
process-drilling. Journal of Biomechanics 9, 343–349 (1976)
14. Albrektsson, T.: Measurements of shaft speed while drilling through bone. Journal of Oral
and Maxillofacial Surgery 53, 1315–1316 (1995)
15. Myers, R.H., Montgomery, D.C.: Response surface methodology, 2nd edn. Wiley, New
York (2002) ISBN 0-471-41255-4
16. Box, G.E.P., Hunter, J.S., Hunter, W.G.: Statistics for experimenters, 2nd edn. Wiley, New
York (2005) ISBN 13978-0471-71813-0
17. Somashekhar, K.P., Mathew, J., Ramachandran, N.: A feasibility approach by simulated
annealing on optimization of micro-wire electric discharge machining parameters. Int. J.
Adv. Manuf. Technol. 61, 1209–1213 (2012)
18. Metropolis, N., Rosenbluth, A., Rosenbluth, N., Teller, A., Teller, E.: Equation of state
calculation by fast computing machines. J. Chem. Phys. 21, 1087–1092 (1953)
19. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing.
Science 220, 671–680 (1983)
20. Glover, F., Kochenberger, G.A.: Handbook of Metaheuristics. Kluwer, London (2003)
21. Van Laarhoven, P.J.M., Aarts, E.H.L.: Simulated annealing: Theory and applications. Kluwer, London (1987)
22. Karaca, F., Aksakal, B., Kom, M.: Influence of orthopaedic drilling parameters on temper-
ature and histopathology of bovine tibia: An in vitro study. Medical Engineering & Phys-
ics 33(10), 1221–1227 (2011)
23. Lee, J., Ozdoganlar, O.B., Rabin, Y.: An experimental investigation on thermal exposure
during bone drilling. Medical Engineering & Physics 34(10), 1510–1520 (2012)
24. Design Expert, http://www.statease.com/dx8descr.html
Author Index