IJCSI Vol. 8, Issue 3, No. 1
IJCSI PUBLICATION
www.IJCSI.org
In this third issue of 2011, we present work from a range of dynamic computer science
fields, including system performance, computer vision, artificial intelligence, software
engineering, multimedia, pattern recognition, information retrieval, databases, security, and
networking, among others.
As always, we thank all our reviewers for providing constructive comments on the papers sent to
them for review. Their efforts contribute enormously to the quality of the papers published in this
issue.
Google Scholar reports a large number of citations to papers published in IJCSI. We encourage
readers, authors, reviewers, and the wider computer science community to continue citing work
published in the journal.
It was with pleasure and a sense of satisfaction that we announced our two-year Impact Factor,
evaluated at 0.242, in mid-March 2011. For more information, please see the FAQ section of the
journal website.
In addition to full-text availability from the journal website, all published papers are
deposited in open-access repositories to ease access and ensure the journal's continued
availability, free of charge, to all researchers.
We are pleased to present IJCSI Volume 8, Issue 3, No. 1, May 2011 (IJCSI Vol. 8, Issue 3, No.
1). The acceptance rate for this issue is 33.6%.
Dr Tristan Vanrullen
Chief Editor
LPL, Laboratoire Parole et Langage - CNRS - Aix-en-Provence, France
LABRI, Laboratoire Bordelais de Recherche en Informatique - INRIA - Bordeaux, France
LEEE, Laboratoire d'Esthétique et Expérimentations de l'Espace - Université d'Auvergne, France
Dr Constantino Malagón
Associate Professor
Nebrija University
Spain
Dr Mokhtar Beldjehem
Professor
Sainte-Anne University
Halifax, NS, Canada
Dr Pascal Chatonnay
Assistant Professor
Maître de Conférences
Laboratoire d'Informatique de l'Université de Franche-Comté
Université de Franche-Comté
France
Dr Yee-Ming Chen
Professor
Department of Industrial Engineering and Management
Yuan Ze University
Taiwan
Dr Vishal Goyal
Assistant Professor
Department of Computer Science
Punjabi University
Patiala, India
Dr Dalbir Singh
Faculty of Information Science And Technology
National University of Malaysia
Malaysia
Dr Natarajan Meghanathan
Assistant Professor
REU Program Director
Department of Computer Science
Jackson State University
Jackson, USA
Dr Navneet Agrawal
Assistant Professor
Department of ECE,
College of Technology & Engineering,
MPUAT, Udaipur 313001 Rajasthan, India
Dr Panagiotis Michailidis
Division of Computer Science and Mathematics,
University of Western Macedonia,
53100 Florina, Greece
Dr T. V. Prasad
Professor
Department of Computer Science and Engineering,
Lingaya's University
Faridabad, Haryana, India
Dr Shishir Kumar
Department of Computer Science and Engineering,
Jaypee University of Engineering & Technology
Raghogarh, MP, India
Dr P. K. Suri
Professor
Department of Computer Science & Applications,
Kurukshetra University,
Kurukshetra, India
Dr Paramjeet Singh
Associate Professor
GZS College of Engineering & Technology,
India
Dr Shaveta Rani
Associate Professor
GZS College of Engineering & Technology,
India
Dr G. Ganesan
Professor
Department of Mathematics,
Adikavi Nannaya University,
Rajahmundry, A.P, India
Dr A. V. Senthil Kumar
Department of MCA,
Hindusthan College of Arts and Science,
Coimbatore, Tamilnadu, India
Dr Jyoteesh Malhotra
ECE Department,
Guru Nanak Dev University,
Jalandhar, Punjab, India
Dr R. Ponnusamy
Professor
Department of Computer Science & Engineering,
Aarupadai Veedu Institute of Technology,
Vinayaga Missions University, Chennai, Tamilnadu, India.
N. Jaisankar
Assistant Professor
School of Computing Sciences,
VIT University
Vellore, Tamilnadu, India
IJCSI Reviewers Committee 2011
Mr. Markus Schatten, University of Zagreb, Faculty of Organization and Informatics, Croatia
Mr. Vassilis Papataxiarhis, Department of Informatics and Telecommunications, National and
Kapodistrian University of Athens, Athens, Greece
Dr Modestos Stavrakis, University of the Aegean, Greece
Dr Fadi KHALIL, LAAS -- CNRS Laboratory, France
Dr Dimitar Trajanov, Faculty of Electrical Engineering and Information Technologies, Ss. Cyril and
Methodius University - Skopje, Macedonia
Dr Jinping Yuan, College of Information System and Management, National Univ. of Defense Tech.,
China
Dr Alexis Lazanas, Ministry of Education, Greece
Dr Stavroula Mougiakakou, University of Bern, ARTORG Center for Biomedical Engineering
Research, Switzerland
Dr Cyril de Runz, CReSTIC-SIC, IUT de Reims, University of Reims, France
Mr. Pramodkumar P. Gupta, Dept of Bioinformatics, Dr D Y Patil University, India
Dr Alireza Fereidunian, School of ECE, University of Tehran, Iran
Mr. Fred Viezens, Otto-Von-Guericke-University Magdeburg, Germany
Dr. Richard G. Bush, Lawrence Technological University, United States
Dr. Ola Osunkoya, Information Security Architect, USA
Mr. Kotsokostas N. Antonios, TEI Piraeus, Greece
Prof Steven Totosy de Zepetnek, U of Halle-Wittenberg & Purdue U & National Sun Yat-sen U,
Germany, USA, Taiwan
Mr. M Arif Siddiqui, Najran University, Saudi Arabia
Ms. Ilknur Icke, The Graduate Center, City University of New York, USA
Prof Miroslav Baca, Faculty of Organization and Informatics, University of Zagreb, Croatia
Dr. Elvia Ruiz Beltrán, Instituto Tecnológico de Aguascalientes, Mexico
Mr. Moustafa Banbouk, Telecom Engineer, UAE
Mr. Kevin P. Monaghan, Wayne State University, Detroit, Michigan, USA
Ms. Moira Stephens, University of Sydney, Australia
Ms. Maryam Feily, National Advanced IPv6 Centre of Excellence (NAv6), Universiti Sains Malaysia
(USM), Malaysia
Dr. Constantine YIALOURIS, Informatics Laboratory Agricultural University of Athens, Greece
Mrs. Angeles Abella, U. de Montreal, Canada
Dr. Patrizio Arrigo, CNR ISMAC, Italy
Mr. Anirban Mukhopadhyay, B.P.Poddar Institute of Management & Technology, India
Mr. Dinesh Kumar, DAV Institute of Engineering & Technology, India
Mr. Jorge L. Hernandez-Ardieta, INDRA SISTEMAS / University Carlos III of Madrid, Spain
Mr. AliReza Shahrestani, University of Malaya (UM), National Advanced IPv6 Centre of Excellence
(NAv6), Malaysia
Mr. Blagoj Ristevski, Faculty of Administration and Information Systems Management - Bitola,
Republic of Macedonia
Mr. Mauricio Egidio Canto, Department of Computer Science / University of São Paulo, Brazil
Mr. Jules Ruis, Fractal Consultancy, The Netherlands
Mr. Mohammad Iftekhar Husain, University at Buffalo, USA
Dr. Deepak Laxmi Narasimha, Department of Software Engineering, Faculty of Computer Science and
Information Technology, University of Malaya, Malaysia
Dr. Paola Di Maio, DMEM University of Strathclyde, UK
Dr. Bhanu Pratap Singh, Institute of Instrumentation Engineering, Kurukshetra University
Kurukshetra, India
Mr. Sana Ullah, Inha University, South Korea
Mr. Cornelis Pieter Pieters, Condast, The Netherlands
Dr. Amogh Kavimandan, The MathWorks Inc., USA
Dr. Zhinan Zhou, Samsung Telecommunications America, USA
Mr. Alberto de Santos Sierra, Universidad Politécnica de Madrid, Spain
Dr. Md. Atiqur Rahman Ahad, Department of Applied Physics, Electronics & Communication
Engineering (APECE), University of Dhaka, Bangladesh
Dr. Charalampos Bratsas, Lab of Medical Informatics, Medical Faculty, Aristotle University,
Thessaloniki, Greece
Ms. Alexia Dini Kounoudes, Cyprus University of Technology, Cyprus
Dr. Jorge A. Ruiz-Vanoye, Universidad Juárez Autónoma de Tabasco, Mexico
Dr. Alejandro Fuentes Penna, Universidad Popular Autónoma del Estado de Puebla, Mexico
Dr. Ocotlán Díaz-Parra, Universidad Juárez Autónoma de Tabasco, Mexico
Mrs. Nantia Iakovidou, Aristotle University of Thessaloniki, Greece
Mr. Vinay Chopra, DAV Institute of Engineering & Technology, Jalandhar
Ms. Carmen Lastres, Universidad Politécnica de Madrid - Centre for Smart Environments, Spain
Dr. Sanja Lazarova-Molnar, United Arab Emirates University, UAE
Mr. Srikrishna Nudurumati, Imaging & Printing Group R&D Hub, Hewlett-Packard, India
Dr. Olivier Nocent, CReSTIC/SIC, University of Reims, France
Mr. Burak Cizmeci, Isik University, Turkey
Dr. Carlos Jaime Barrios Hernandez, LIG (Laboratory Of Informatics of Grenoble), France
Mr. Md. Rabiul Islam, Rajshahi university of Engineering & Technology (RUET), Bangladesh
Dr. LAKHOUA Mohamed Najeh, ISSAT - Laboratory of Analysis and Control of Systems, Tunisia
Dr. Alessandro Lavacchi, Department of Chemistry - University of Firenze, Italy
Mr. Mungwe, University of Oldenburg, Germany
Mr. Somnath Tagore, Dr D Y Patil University, India
Ms. Xueqin Wang, ATCS, USA
Dr. Borislav D Dimitrov, Department of General Practice, Royal College of Surgeons in Ireland,
Dublin, Ireland
Dr. Fondjo Fotou Franklin, Langston University, USA
Dr. Vishal Goyal, Department of Computer Science, Punjabi University, Patiala, India
Mr. Thomas J. Clancy, ACM, United States
Dr. Ahmed Nabih Zaki Rashed, Electronics and Electrical Communication Engineering Department,
Faculty of Electronic Engineering, Menoufia University, Menouf 32951, Egypt
Dr. Rushed Kanawati, LIPN, France
Mr. Koteshwar Rao, K G Reddy College of Engineering & Technology, Chilkur, RR Dist., AP, India
Mr. M. Nagesh Kumar, Department of Electronics and Communication, J.S.S. research foundation,
Mysore University, Mysore-6, India
Dr. Ibrahim Noha, Grenoble Informatics Laboratory, France
Mr. Muhammad Yasir Qadri, University of Essex, UK
Mr. Annadurai P., KMCPGS, Lawspet, Pondicherry, India (Aff. Pondicherry University, India)
Mr. E Munivel , CEDTI (Govt. of India), India
Dr. Chitra Ganesh Desai, University of Pune, India
Mr. Syed, Analytical Services & Materials, Inc., USA
Mrs. Payal N. Raj, Veer South Gujarat University, India
Mrs. Priti Maheshwary, Maulana Azad National Institute of Technology, Bhopal, India
Mr. Mahesh Goyani, S.P. University, India
Mr. Vinay Verma, Defence Avionics Research Establishment, DRDO, India
Dr. George A. Papakostas, Democritus University of Thrace, Greece
Mr. Abhijit Sanjiv Kulkarni, DARE, DRDO, India
Mr. Kavi Kumar Khedo, University of Mauritius, Mauritius
Dr. B. Sivaselvan, Indian Institute of Information Technology, Design & Manufacturing,
Kancheepuram, IIT Madras Campus, India
Dr. Partha Pratim Bhattacharya, Greater Kolkata College of Engineering and Management, West
Bengal University of Technology, India
Mr. Manish Maheshwari, Makhanlal C University of Journalism & Communication, India
Dr. Siddhartha Kumar Khaitan, Iowa State University, USA
Dr. Mandhapati Raju, General Motors Inc, USA
Dr. M.Iqbal Saripan, Universiti Putra Malaysia, Malaysia
Mr. Ahmad Shukri Mohd Noor, University Malaysia Terengganu, Malaysia
Mr. Selvakuberan K, TATA Consultancy Services, India
Dr. Smita Rajpal, Institute of Technology and Management, Gurgaon, India
Mr. Rakesh Kachroo, Tata Consultancy Services, India
Mr. Raman Kumar, National Institute of Technology, Jalandhar, Punjab., India
Mr. Nitesh Sureja, S.P.University, India
Dr. M. Emre Celebi, Louisiana State University, Shreveport, USA
Dr. Aung Kyaw Oo, Defence Services Academy, Myanmar
Mr. Sanjay P. Patel, Sankalchand Patel College of Engineering, Visnagar, Gujarat, India
Dr. Pascal Fallavollita, Queen's University, Canada
Mr. Jitendra Agrawal, Rajiv Gandhi Technological University, Bhopal, MP, India
Mr. Ismael Rafael Ponce Medellín, Cenidet (Centro Nacional de Investigación y Desarrollo
Tecnológico), Mexico
Mr. Supheakmungkol SARIN, Waseda University, Japan
Mr. Shoukat Ullah, Govt. Post Graduate College Bannu, Pakistan
Dr. Vivian Augustine, Telecom Zimbabwe, Zimbabwe
Mrs. Mutalli Vatila, Offshore Business Philippines, Philippines
Mr. Pankaj Kumar, SAMA, India
Dr. Himanshu Aggarwal, Punjabi University,Patiala, India
Dr. Vauvert Guillaume, Europages, France
Prof Yee Ming Chen, Department of Industrial Engineering and Management, Yuan Ze University,
Taiwan
Dr. Constantino Malagón, Nebrija University, Spain
Prof Kanwalvir Singh Dhindsa, B.B.S.B.Engg.College, Fatehgarh Sahib (Punjab), India
Mr. Angkoon Phinyomark, Prince of Songkla University, Thailand
Ms. Nital H. Mistry, Veer Narmad South Gujarat University, Surat, India
Dr. M.R.Sumalatha, Anna University, India
Mr. Somesh Kumar Dewangan, Disha Institute of Management and Technology, India
Mr. Raman Maini, Punjabi University, Patiala(Punjab)-147002, India
Dr. Abdelkader Outtagarts, Alcatel-Lucent Bell-Labs, France
Prof Dr. Abdul Wahid, AKG Engg. College, Ghaziabad, India
Mr. Prabu Mohandas, Anna University/Adhiyamaan College of Engineering, India
Dr. Manish Kumar Jindal, Panjab University Regional Centre, Muktsar, India
Prof Mydhili K Nair, M S Ramaiah Institute of Technnology, Bangalore, India
Dr. C. Suresh Gnana Dhas, VelTech MultiTech Dr. Rangarajan Dr. Sagunthala Engineering College,
Chennai, Tamilnadu, India
Prof Akash Rajak, Krishna Institute of Engineering and Technology, Ghaziabad, India
Mr. Ajay Kumar Shrivastava, Krishna Institute of Engineering & Technology, Ghaziabad, India
Mr. Deo Prakash, SMVD University, Kakryal(J&K), India
Dr. Vu Thanh Nguyen, University of Information Technology HoChiMinh City, VietNam
Dr. Navneet Agrawal, Dept. of ECE, College of Technology & Engineering, MPUAT, Udaipur 313001
Rajasthan, India
Mr. Sufal Das, Sikkim Manipal Institute of Technology, India
Mr. Anil Kumar, Sikkim Manipal Institute of Technology, India
Dr. B. Prasanalakshmi, King Saud University, Saudi Arabia.
Dr. K D Verma, S.V. (P.G.) College, Aligarh, India
Mr. Mohd Nazri Ismail, System and Networking Department, University of Kuala Lumpur (UniKL),
Malaysia
Dr. Nguyen Tuan Dang, University of Information Technology, Vietnam National University Ho Chi
Minh city, Vietnam
Dr. Abdul Aziz, University of Central Punjab, Pakistan
Dr. P. Vasudeva Reddy, Andhra University, India
Mrs. Savvas A. Chatzichristofis, Democritus University of Thrace, Greece
Mr. Marcio Dorn, Federal University of Rio Grande do Sul - UFRGS Institute of Informatics, Brazil
Mr. Luca Mazzola, University of Lugano, Switzerland
Mr. Nadeem Mahmood, Department of Computer Science, University of Karachi, Pakistan
Mr. Hafeez Ullah Amin, Kohat University of Science & Technology, Pakistan
Prof. Vikram Singh, Ch. Devi Lal University, Sirsa (Haryana), India
Mr. M. Azath, Calicut/Mets School of Engineering, India
Dr. J. Hanumanthappa, DoS in CS, University of Mysore, India
Dr. Shahanawaj Ahamad, Department of Computer Science, King Saud University, Saudi Arabia
Dr. K. Duraiswamy, K. S. Rangasamy College of Technology, India
Prof. Dr Mazlina Esa, Universiti Teknologi Malaysia, Malaysia
Dr. P. Vasant, Power Control Optimization (Global), Malaysia
Dr. Taner Tuncer, Firat University, Turkey
Dr. Norrozila Sulaiman, University Malaysia Pahang, Malaysia
Prof. S K Gupta, BCET, Guradspur, India
Dr. Latha Parameswaran, Amrita Vishwa Vidyapeetham, India
Mr. M. Azath, Anna University, India
Dr. P. Suresh Varma, Adikavi Nannaya University, India
Prof. V. N. Kamalesh, JSS Academy of Technical Education, India
Dr. D Gunaseelan, Ibri College of Technology, Oman
Mr. Sanjay Kumar Anand, CDAC, India
Mr. Akshat Verma, CDAC, India
Mrs. Fazeela Tunnisa, Najran University, Kingdom of Saudi Arabia
Mr. Hasan Asil, Islamic Azad University Tabriz Branch (Azarshahr), Iran
Prof. Dr Sajal Kabiraj, Fr. C Rodrigues Institute of Management Studies (Affiliated to University of
Mumbai, India), India
Mr. Syed Fawad Mustafa, GAC Center, Shandong University, China
Dr. Natarajan Meghanathan, Jackson State University, Jackson, MS, USA
Prof. Selvakani Kandeeban, Francis Xavier Engineering College, India
Mr. Tohid Sedghi, Urmia University, Iran
Dr. S. Sasikumar, PSNA College of Engg and Tech, Dindigul, India
Dr. Anupam Shukla, Indian Institute of Information Technology and Management Gwalior, India
Mr. Rahul Kala, Indian Institute of Information Technology and Management Gwalior, India
Dr. A V Nikolov, National University of Lesotho, Lesotho
Mr. Kamal Sarkar, Department of Computer Science and Engineering, Jadavpur University, India
Dr. Mokhled S. AlTarawneh, Computer Engineering Dept., Faculty of Engineering, Mutah University,
Jordan
Prof. Sattar J Aboud, Iraqi Council of Representatives, Iraq-Baghdad
Dr. Prasant Kumar Pattnaik, Department of CSE, KIST, India
Dr. Mohammed Amoon, King Saud University, Saudi Arabia
Dr. Tsvetanka Georgieva, Department of Information Technologies, St. Cyril and St. Methodius
University of Veliko Tarnovo, Bulgaria
Dr. Eva Volna, University of Ostrava, Czech Republic
Mr. Ujjal Marjit, University of Kalyani, West-Bengal, India
Dr. Guezouri Mustapha, Department of Electronics, Faculty of Electrical Engineering, University of
Science and Technology (USTO), Oran, Algeria
Mr. Maniyar Shiraz Ahmed, Najran University, Najran, Saudi Arabia
Dr. Sreedhar Reddy, JNTU, SSIETW, Hyderabad, India
Mr. Bala Dhandayuthapani Veerasamy, Mekelle University, Ethiopia
Mr. Arash Habibi Lashkari, University of Malaya (UM), Malaysia
Mr. Rajesh Prasad, LDC Institute of Technical Studies, Allahabad, India
Ms. Habib Izadkhah, Tabriz University, Iran
Dr. Lokesh Kumar Sharma, Chhattisgarh Swami Vivekanand Technical University Bhilai, India
Mr. Kuldeep Yadav, IIIT Delhi, India
Dr. Naoufel Kraiem, Institut Supérieur d'Informatique, Tunisia
Prof. Frank Ortmeier, Otto-von-Guericke-Universitaet Magdeburg, Germany
Mr. Ashraf Aljammal, USM, Malaysia
Mrs. Amandeep Kaur, Department of Computer Science, Punjabi University, Patiala, Punjab, India
Mr. Babak Basharirad, University Technology of Malaysia, Malaysia
Mr. Avinash Singh, KIET Ghaziabad, India
Dr. Miguel Vargas-Lombardo, Technological University of Panama, Panama
Dr. Tuncay Sevindik, Firat University, Turkey
Ms. Pavai Kandavelu, Anna University Chennai, India
Mr. Ravish Khichar, Global Institute of Technology, India
Mr Aos Alaa Zaidan Ansaef, Multimedia University, Cyberjaya, Malaysia
Dr. Awadhesh Kumar Sharma, Dept. of CSE, MMM Engg College, Gorakhpur-273010, UP, India
Mr. Qasim Siddique, FUIEMS, Pakistan
Dr. Le Hoang Thai, University of Science, Vietnam National University - Ho Chi Minh City, Vietnam
Dr. Saravanan C, NIT, Durgapur, India
Dr. Vijay Kumar Mago, DAV College, Jalandhar, India
Dr. Do Van Nhon, University of Information Technology, Vietnam
Mr. Georgios Kioumourtzis, University of Patras, Greece
Mr. Amol D.Potgantwar, SITRC Nasik, India
Mr. Lesedi Melton Masisi, Council for Scientific and Industrial Research, South Africa
Dr. Karthik.S, Department of Computer Science & Engineering, SNS College of Technology, India
Mr. Nafiz Imtiaz Bin Hamid, Department of Electrical and Electronic Engineering, Islamic University
of Technology (IUT), Bangladesh
Mr. Muhammad Imran Khan, Universiti Teknologi PETRONAS, Malaysia
Dr. Abdul Kareem M. Radhi, Information Engineering - Nahrin University, Iraq
Dr. Manuj Darbari, BBDNITM, Institute of Technology, A-649, Indira Nagar, Lucknow 226016, India
Ms. Izerrouken, INP-IRIT, France
Mr. Nitin Ashokrao Naik, Dept. of Computer Science, Yeshwant Mahavidyalaya, Nanded, India
Mr. Nikhil Raj, National Institute of Technology, Kurukshetra, India
Prof. Maher Ben Jemaa, National School of Engineers of Sfax, Tunisia
Prof. Rajeshwar Singh, BRCM College of Engineering and Technology, Bahal Bhiwani, Haryana,
India
Mr. Gaurav Kumar, Department of Computer Applications, Chitkara Institute of Engineering and
Technology, Rajpura, Punjab, India
Mr. Ajeet Kumar Pandey, Indian Institute of Technology, Kharagpur, India
Mr. Rajiv Phougat, IBM Corporation, USA
Mrs. Aysha V, College of Applied Science Pattuvam affiliated with Kannur University, India
Dr. Debotosh Bhattacharjee, Department of Computer Science and Engineering, Jadavpur University,
Kolkata-700032, India
Dr. Neelam Srivastava, Institute of engineering & Technology, Lucknow, India
Prof. Sweta Verma, Galgotia's College of Engineering & Technology, Greater Noida, India
Mr. Harminder Singh Bindra, MIMIT, India
Mr. Tarun Kumar, U.P. Technical University/Radha Govinend Engg. College, India
Mr. Tirthraj Rai, Jawahar Lal Nehru University, New Delhi, India
Mr. Akhilesh Tiwari, Madhav Institute of Technology & Science, India
Mr. Dakshina Ranjan Kisku, Dr. B. C. Roy Engineering College, WBUT, India
Ms. Anu Suneja, Maharshi Markandeshwar University, Mullana, Haryana, India
Mr. Munish Kumar Jindal, Punjabi University Regional Centre, Jaito (Faridkot), India
Dr. Ashraf Bany Mohammed, Management Information Systems Department, Faculty of
Administrative and Financial Sciences, Petra University, Jordan
Mrs. Jyoti Jain, R.G.P.V. Bhopal, India
Dr. Lamia Chaari, SFAX University, Tunisia
Mr. Akhter Raza Syed, Department of Computer Science, University of Karachi, Pakistan
Prof. Khubaib Ahmed Qureshi, Information Technology Department, HIMS, Hamdard University,
Pakistan
Prof. Boubker Sbihi, École des Sciences de l'Information, Morocco
Dr. S. M. Riazul Islam, Inha University, South Korea
Prof. Lokhande S.N., S.R.T.M. University, Nanded (MH), India
Dr. Vijay H Mankar, Dept. of Electronics, Govt. Polytechnic, Nagpur, India
Mr. Ojesanmi Olusegun, Ajayi Crowther University, Oyo, Nigeria
Ms. Mamta Juneja, RBIEBT, PTU, India
Dr. Ekta Walia Bhullar, Maharishi Markandeshwar University, Mullana Ambala (Haryana), India
Prof. Chandra Mohan, John Bosco Engineering College, India
Mr. Nitin A. Naik, Yeshwant Mahavidyalaya, Nanded, India
Mr. Sunil Kashibarao Nayak, Bahirji Smarak Mahavidyalaya, Basmathnagar Dist-Hingoli., India
Prof. Rakesh.L, Vijetha Institute of Technology, Bangalore, India
Mr B. M. Patil, Indian Institute of Technology, Roorkee, Uttarakhand, India
Mr. Thipendra Pal Singh, Sharda University, K.P. III, Greater Noida, Uttar Pradesh, India
Mr. Hadi Saboohi, University of Malaya - Faculty of Computer Science and Information Technology,
Malaysia
Dr. R. Baskaran, Anna University, India
Dr. Wichian Sittiprapaporn, Mahasarakham University College of Music, Thailand
Mr. Lai Khin Wee, Universiti Teknologi Malaysia, Malaysia
Dr. Kamaljit I. Lakhtaria, Atmiya Institute of Technology, India
Mrs. Inderpreet Kaur, PTU, Jalandhar, India
Mr. Iqbaldeep Kaur, PTU / RBIEBT, India
Mrs. Vasudha Bahl, Maharaja Agrasen Institute of Technology, Delhi, India
Prof. Vinay Uttamrao Kale, P.R.M. Institute of Technology & Research, Badnera, Amravati,
Maharashtra, India
Mr. Suhas J Manangi, Microsoft, India
Ms. Anna Kuzio, Adam Mickiewicz University, School of English, Poland
Mr. Vikas Singla, Malout Institute of Management & Information Technology, Malout, Punjab, India
Dr. Dalbir Singh, Faculty of Information Science And Technology, National University of Malaysia,
Malaysia
Dr. Saurabh Mukherjee, PIM, Jiwaji University, Gwalior, M.P, India
Dr. Debojyoti Mitra, Sir Padampat Singhania University, India
Prof. Rachit Garg, Department of Computer Science, L K College, India
Dr. Arun Kumar Gupta, M.S. College, Saharanpur, India
Dr. Todor Todorov, Institute of Mathematics and Informatics, Bulgarian Academy of Sciences,
Bulgaria
Mrs. Manjula K A, Kannur University, India
Prof. M. Saleem Babu, Department of Computer Science and Engineering, Vel Tech University,
Chennai, India
Dr. Rajesh Kumar Tiwari, GLA Institute of Technology, India
Dr. V. Nagarajan, SMVEC, Pondicherry university, India
Mr. Rakesh Kumar, Indian Institute of Technology Roorkee, India
Prof. Amit Verma, PTU/RBIEBT, India
Mr. Sohan Purohit, University of Massachusetts Lowell, USA
Mr. Anand Kumar, AMC Engineering College, Bangalore, India
Dr. Samir Abdelrahman, Computer Science Department, Cairo University, Egypt
Dr. Rama Prasad V Vaddella, Sree Vidyanikethan Engineering College, India
Prof. Jyoti Prakash Singh, Academy of Technology, India
Mr. Peyman Taher, Oklahoma State University, USA
Dr. S Srinivasan, PDM College of Engineering, India
Mr. Muhammad Zakarya, CIIT, Pakistan
Mr. Williamjeet Singh, Chitkara Institute of Engineering and Technology, India
Mr. G.Jeyakumar, Amrita School of Engineering, India
Mr. Harmunish Taneja, Maharishi Markandeshwar University, Mullana, Ambala, Haryana, India
Dr. Sin-Ban Ho, Faculty of IT, Multimedia University, Malaysia
Mrs. Doreen Hephzibah Miriam, Anna University, Chennai, India
Mrs. Mitu Dhull, GNKITMS Yamuna Nagar Haryana, India
Mr. Neetesh Gupta, Technocrats Inst. of Technology, Bhopal, India
Ms. A. Lavanya, Manipal University, Karnataka, India
Ms. D. Pravallika, Manipal University, Karnataka, India
Prof. Ashutosh Kumar Dubey, Assistant Professor, India
Mr. Ranjit Singh, Apeejay Institute of Management, Jalandhar, India
Mr. Prasad S.Halgaonkar, MIT, Pune University, India
Mr. Anand Sharma, MITS, Lakshmangarh, Sikar (Rajasthan), India
Mr. Amit Kumar, Jaypee University of Engineering and Technology, India
Prof. Vasavi Bande, Computer Science and Engineering, Hyderabad Institute of Technology and
Management, India
Dr. Jagdish Lal Raheja, Central Electronics Engineering Research Institute, India
Mr G. Appasami, Dept. of CSE, Dr. Pauls Engineering College, Anna University - Chennai, India
Mr Vimal Mishra, U.P. Technical Education, Allahabad, India
Dr. Arti Arya, PES School of Engineering, Bangalore (under VTU, Belgaum, Karnataka), India
Mr. Pawan Jindal, J.U.E.T. Guna, M.P., India
Prof. Santhosh.P.Mathew, Saintgits College of Engineering, Kottayam, India
Dr. P. K. Suri, Department of Computer Science & Applications, Kurukshetra University, Kurukshetra,
India
Dr. Syed Akhter Hossain, Daffodil International University, Bangladesh
Mr. Nasim Qaisar, Federal Urdu University of Arts, Science and Technology, Pakistan
Mr. Mohit Jain, Maharaja Surajmal Institute of Technology (Affiliated to Guru Gobind Singh
Indraprastha University, New Delhi), India
Dr. Shaveta Rani, GZS College of Engineering & Technology, India
Dr. Paramjeet Singh, GZS College of Engineering & Technology, India
Prof. T Venkat Narayana Rao, Department of CSE, Hyderabad Institute of Technology and
Management, India
Mr. Vikas Gupta, CDLM Government Engineering College, Panniwala Mota, India
Dr Juan José Martínez Castillo, University of Yacambu, Venezuela
Mr Kunwar S. Vaisla, Department of Computer Science & Engineering, BCT Kumaon Engineering
College, India
Prof. Manpreet Singh, M. M. Engg. College, M. M. University, Haryana, India
Mr. Syed Imran, University College Cork, Ireland
Dr. Namfon Assawamekin, University of the Thai Chamber of Commerce, Thailand
Dr. Shahaboddin Shamshirband, Islamic Azad University, Iran
Dr. Mohamed Ali Mahjoub, University of Monastir, Tunisia
Mr. Adis Medic, Infosys ltd, Bosnia and Herzegovina
Mr Swarup Roy, Department of Information Technology, North Eastern Hill University, Umshing,
Shillong 793022, Meghalaya, India
Mr. Suresh Kallam, East China University of Technology, Nanchang, China
Dr. Mohammed Ali Hussain, Sai Madhavi Institute of Science & Technology, Rajahmundry, India
Mr. Vikas Gupta, Adesh Institute of Engineering & Technology, India
Dr. Anuraag Awasthi, JV Womens University, Jaipur, India
Dr. Mathura Prasad Thapliyal, Department of Computer Science, HNB Garhwal University (Central
University), Srinagar (Garhwal), India
Mr. Md. Rajibul Islam, Ibnu Sina Institute, University Technology Malaysia, Malaysia
Mr. Adnan Qureshi, University of Jinan, Shandong, P.R. China
Dr. Jatinderkumar R. Saini, Narmada College of Computer Application, India
Mr. Mueen Uddin, Universiti Teknologi Malaysia, Malaysia
Mr. S. Albert Alexander, Kongu Engineering College, India
Dr. Shaidah Jusoh, Zarqa Private University, Jordan
Dr. Dushmanta Mallick, KMBB College of Engineering and Technology, India
Mr. Santhosh Krishna B.V, Hindustan University, India
Dr. Tariq Ahamad Ahanger, Kausar College Of Computer Sciences, India
Dr. Chi Lin, Dalian University of Technology, China
Prof. Vijendra Babu D., ECE Department, Aarupadai Veedu Institute of Technology, Vinayaka
Missions University, India
Mr. Raj Gaurang Tiwari, Gautam Budh Technical University, India
Mrs. Jeysree J, SRM University, India
Dr. C S Reddy, VIT University, India
Mr. Amit Wason, Rayat-Bahra Institute of Engineering & Bio-Technology, Kharar, India
Mr. Yousef Naeemi, Mehr Alborz University, Iran
Mr. Muhammad Shuaib Qureshi, Iqra National University, Peshawar, Pakistan
Dr Pranam Paul, Narula Institute of Technology, Agarpara, Kolkata 700109, West Bengal, India
Dr. G. M. Nasira, Sasurie College of Engineering (Affiliated to Anna University of Technology
Coimbatore), India
Dr. Manasawee Kaenampornpan, Mahasarakham University, Thailand
Mrs. Iti Mathur, Banasthali University, India
Mr. Avanish Kumar Singh, RRIMT, NH-24, B.K.T., Lucknow, U.P., India
Dr. Panagiotis Michailidis, University of Western Macedonia, Greece
Mr. Amir Seyed Danesh, University of Malaya, Malaysia
Dr. Terry Walcott, E-Promag Consultancy Group, United Kingdom
Mr. Farhat Amine, High Institute of Management of Tunis, Tunisia
Mr. Ali Waqar Azim, COMSATS Institute of Information Technology, Pakistan
Mr. Zeeshan Qamar, COMSATS Institute of Information Technology, Pakistan
Dr. Samsudin Wahab, MARA University of Technology, Malaysia
Mr. Ashikali M. Hasan, CelNet Security, India
TABLE OF CONTENTS
1. The Use of Design Patterns in a Location-Based GPS Application 1-6
David Gillibrand and Khawar Hameed
2. An Agent-based Strategy for Deploying Analysis Models into Specification and 7-18
Design for Distributed APS Systems
Luis Antonio de Santa-Eulalia, Sophie D'Amours and Jean-Marc Frayret
3. Facial Expression Classification Based on Multi Artificial Neural Network and Two 19-26
Dimensional Principal Component Analysis
Le Hoang Thai, Tat Quang Phat and Tran Son Hai
4. Withdrawn
8. Higher Order Programming to Mine Knowledge for a Modern Medical Expert 64-72
System
Nittaya Kerdprasop and Kittisak Kerdprasop
11. Active Fault Tolerant Control-FTC-Design for Takagi-Sugeno Fuzzy Systems with 88-96
Weighting Functions Depending on the FTC
Atef Khedher, Kamel Ben Othman and Mohamed Benrejeb
12. Efficient Spatial Data mining using Integrated Genetic Algorithm and ACO 97-105
K Sankar and V Venkatachalam
14. Arithmetic and Frequency Filtering Methods of Pixel-Based Image Fusion 113-122
Techniques
Firouz Abdullah Al-Wassai, N. V. Kalyankar and Ali A Al-Zuky
15. Using Fuzzy Decision-Making in E-tourism Industry: A Case Study of Shiraz city 123-127
E-tourism
Zohreh Hamedi and Shahram Jafari
16. A Reliable routing algorithm for Mobile Adhoc Networks based on fuzzy logic 128-133
Arash Dana, Golnoosh Ghalavand, Azadeh Ghalavand and Fardad Farokhi
18. A Frame Work for Frequent Pattern Mining Using Dynamic Function 141-146
Sunil Joshi, R S Jadon and R C Jain
19. Decision Support System for Medical Diagnosis Using Data Mining 147-153
D Senthil Kumar, G Sathyadevi and S Sivanesh
23. A Thought Structure for Complex Systems Modeling Based on Modern Cognitive 182-187
Perspectives
Kamal Mirzaie, Mehdi N Fesharaki and Amir Daneshgar
26. Normalized Distance Measure-A Measure for Evaluating MLIR Merging 209-214
Mechanisms
Chetana Sidige, Sujatha Pothula, Raju Korra, Madarapu Naresh Kumar and Mukesh
Kumar
27. Brain Extraction and Fuzzy Tissue Segmentation in Cerebral 2D T1-Weighted 215-223
Magnetic Resonance Images
Bouchaib Cherradi, Omar Bouattane, Mohamed Youssfi and Abdelhadi Raihani
28. A New Round Robin Based Scheduling Algorithm for Operating Systems-Dynamic 224-229
Quantum Using the Mean Average
Abbas Noon, Ali Kalakech and Seifedine Kadry
29. A Multi-Modal Recognition System Using Face and Speech 230-236
Samir Akrouf, Belayadi Yahia, Mostefai Messaoud and Youssef Chahir
33. Semantic annotation of requirements for automatic UML class diagram generation 259-264
Soumaya Amdouni, Wahiba Ben Abdessalem Karaa and Sondes
Bouabid
35. A Neural Network Model for Construction Projects Site Overhead Cost Estimating 273-283
in Egypt
Ismaail ElSawy, Hossam Hosny and Mohammed Abdel Razek
42. Data Structure and Algorithm for Combination Tree To Generate Test Case 330-333
Ravi Prakash Verma, Bal Gopal and Md Rizwan Beg
43. Generation of test cases from software requirements using combination trees 334-340
Ravi Prakash Verma, Bal Gopal and Md Rizwan Beg
45. Transmission Power Level Selection Method Based On Binary Search Algorithm 348-353
for HiLOW
Lingeswari V Chandra, Selvakumar Manickam, Kok-Soon Chai and Sureswaran
Ramadass
49. Power Efficient Higher Order Sliding Mode Control of SR Motor for Speed 378-387
Control Applications
Muhammad Rafiq, Saeed-ur-Rehman, Fazal-ur-Rehman and Qarab Raza
50. Semantic Search in Wiki using HTML5 Microdata for Semantic Annotation 388-394
P Pabitha, K R Vignesh Nandha Kumar, N Pandurangan, R Vijayakumar and M
Rajaram
51. Formal Verification of Finger Print ATM Transaction through Real Time 395-400
Constraint Notation RTCN
Vivek Kumar Singh, Tripathi S.P, R P Agarwal and Singh J.B.
54. A Novel Feature Selection method for Fault Detection and Diagnosis of Control 415-421
Valves
Binoy B Nair, M T Vamsi Preetam, Vandana R Panicker, V Grishma Kumar and A
Tharanya
55. A Survey on Data Mining and Pattern Recognition Techniques for Soil Data 422-428
Mining
D Ashok Kumar and N Kannathasan
56. Markov Model for Reliable Packet Delivery in Wireless Sensor Networks 429-432
Vijay Kumar, R B Patel, Manpreet Singh and Rohit Vaid
58. IBook-Interactive and Semantic Multimedia Content Generation for eLearning 438-443
Arjumand Younus, M Atif Qureshi, Muhammad Saeed, Syed Asim Ali, Nasir Touheed
and M Shahid Qureshi
60. Image Compression Using Wavelet Transform Based on the Lifting Scheme and its 449-453
Implementation
A Alice Blessie, J Nalini and S C Ramesh
61. Incorporating Agent Technology for Enhancing the Effectiveness of E-learning 454-461
System
N Sivakumar, K Vivekanandan, B Arthi, S Sandhya and Veenas Katta
62. Linear Network Coding on Multi-Mesh of Trees using All to All Broadcast 462-471
Nitin Rakesh and Vipin Tyagi
65. Enhanced Stereo Matching Technique using Image Gradient for Improved Search 483-486
Time
Pratibha Vellanki and Madhuri Khambete
66. Analyzing the Impact of Scalability on QoS-aware Routing for MANETs 487-495
Rajneesh Kumar Gujral and Manpreet Singh
67. Improving Data Association Based on Finding Optimum Innovation Applied to 496-507
Nearest Neighbor for Multi-Target Tracking in Dense Clutter Environment
E M Saad, El Bardawiny, H I Ali and N M Shawky
68. An Efficient Quality of Service Based Routing Protocol for Mobile Ad Hoc 508-514
Networks
Tapan Kumar Godder, M. M Hossain, M Mahbubur Rahman and Md. Sipon Mia
69. SEWOS-Bringing Semantics into Web operating System 515-521
A.M. Riad, Hamdy K Elminir, Mohamed Abu ElSoud and Sahar F Sabbeh
70. Segmenting and Hiding Data Randomly Based on Index Channel 522-529
Emad T Khalaf and Norrozila Sulaiman
71. Data-Acquisition Data Analysis and Prediction Model for Share Market 530-534
Harsh Shah and Sukhada Bhingarkar
72. Fast Handoff Implementation by using Curve Fitting Equation With Help of GPS 535-542
Debabrata Sarddar, Shubhajeet Chatterjee, Ramesh Jana, Shaik Sahil Babu, Hari
Narayan Khan, Utpal Biswas and Mrinal Kanti Naskar
73. Visual Cryptography Scheme for Color Image Using Random Number with 543-549
Enveloping by Digital Watermarking
Shyamalendu Kandar, Arnab Maiti and Bibhas Chandra Dhara
74. Computation of Multiple Paths in MANETs Using Node Disjoint Method 550-554
M Nagaratna, P V S Srinivas, V Kamakshi Prasad and C Raghavendra Rao
78. Enhancing the Capability of N-Dimension Self-Organizing Petrinet using Neuro- 569-571
Genetic Approach
Manuj Darbari, Rishi Asthana, Hasan Ahmed and Neelu Jyoti Ahuja
82. Simulation and Optimization of MQW based optical modulator for on chip optical 592-596
interconnect
Sumita Mishra, Naresh K Chaudhary and Kalyan Singh
83. Determination of the Complex Dielectric Permittivity Industrial Materials of the 597-601
Adhesive Products for the Modeling of an Electromagnetic Field at the Level of a Glue
Joint
Mahmoud Abbas and Mohammad Ayache
Staffordshire University,
The Octagon, Beaconside, Stafford, ST18 0AD
value-added information and enhanced experience to mobile consumers, and through emerging demand levels, typical services and associated business models [5]. Location-based systems are strongly coupled to the concept of context within mobile computing systems and form a special class of context-aware systems [6]. Location is a determinant in that it contributes significantly to the universe of discourse created; essentially, all activities are hosted within a particular environmental location and context. A key characteristic of location-based systems is the changing physical location of the mobile user, which may be continual, such as when in a moving vehicle or walking, or periodic, where there are periods of short-term or transient residency of the user in a location.

Kakihara and Sorensen [7] discuss the view of spatial mobility as one dimension of mobility, that is, the most immediate aspect of mobility in that the physical locational space provides the immersive context for objects within that space. This discussion further articulates three composite aspects: the mobility of objects, the mobility of symbols, and the mobility of space itself. The dimensional aspects of location lead to a potentially more complex universe of discourse comprising the determination of object positioning within space (for example using co-ordinate geometry), where location identification is based not only on triangulation of co-ordinates, with each co-ordinate representing a particular dimension, but also on time, where objects move through space and time. In this case, objects can be deemed to possess an orthogonal property where different locations in time exist for those objects.

The transformation of these spatial concepts into real-world deployments of location-based systems is also evident. Within the public sector there is acknowledgement of the unique features and advantages of mobile technologies to enhance engagement between governmental institutions and the citizens they serve through the development of innovative location-based services and new methods of interaction [8]. Specific examples of mobile location-based applications include those concerned with supporting front-line emergency services for public security and safety [6], with a range of associated improvement and efficiency gains being reported.

Private sector interest in mobile location-based systems is underpinned by new commercial and revenue-generating opportunities, evidenced primarily by numerous consumer-oriented applications in a number of categories including navigation and travel, social networking, leisure and entertainment. For example, in September 2009 Apple, through its mobile application distribution channel, the 'App Store', had twenty different categories of mobile applications, with 268 applications within the 'travel' category and a large number of applications across all categories that exploited the user's location profile as part of the application configuration.

Supporting technology for location-based systems typically includes mobile network platforms for determining location, including wide-area systems such as Cell Identification in mobile radio networks, Global Positioning Systems (GPS), and broadband satellites [5], and more localised sensor technologies such as WiFi (802.11), Bluetooth, and Radio Frequency Identification (RFID) [9]. Spatial databases provide the core repository infrastructure to host multi-dimensional data, with associated data models and query capabilities that enable location-based queries to be satisfied. The end-to-end delivery of mobile location-based systems includes a number of stakeholders, each of which is critical to the operation of the complete system. These include mobile network operators, content providers and aggregators, technology infrastructure providers, application service providers, and device manufacturers. As the potential scope and opportunities offered by mobile location-based systems increase, there is a risk of increasing complexity, leading to the evaluation of suitable business models, frameworks and components that address the overall aggregation of services [10][11]. Whilst appreciating this increasing complexity at higher levels of abstraction in location-based systems, we posit that an equal focus and effort on the use of design patterns to formalize and structure the lower-level construction of such systems is of merit.

The remainder of this section introduces the example of a GPS application as a component of a location-based system. This illustration is subsequently developed and serves as a vehicle for the articulation of the associated design patterns.

The GPS application consists of reading data from a GPS receiver, which constantly sends a stream of $GPRMC sentences to a GPS class. An example of a sentence is: $GPRMC,140036,A,5226.5059,N,00207.6806,W,2.0,064.64,120710,001.0,E*34, where 140036 is the time of fix (14:00:36 UTC), A is a navigation receiver warning (A = OK, V = warning), 5226.5059,N is latitude (52 deg. 26.5059 min North), 00207.6806,W is longitude (002 deg. 07.6806 min West), 2.0 is speed over ground (knots), 064.64 is course made good (degrees), 120710 is date of fix (12 July 2010), 001.0,E is the magnetic variation (1.0 deg East), and *34 is the mandatory checksum.
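As a sketch of how such a sentence might be parsed (the helper names are illustrative and not the GPS class of the paper), the fields can be split on commas and the ddmm.mmmm co-ordinates converted to signed decimal degrees; the NMEA checksum is the XOR of every character between '$' and '*':

```python
def nmea_checksum(sentence: str) -> int:
    """XOR of every character between '$' and '*' (NMEA 0183 convention)."""
    body = sentence[1:sentence.index('*')]
    checksum = 0
    for ch in body:
        checksum ^= ord(ch)
    return checksum

def parse_gprmc(sentence: str) -> dict:
    """Parse a $GPRMC sentence into a dictionary of named fields."""
    # Drop the trailing checksum, split on commas, tolerate stray spaces.
    fields = [f.strip() for f in sentence.split('*')[0].split(',')]
    # Latitude is ddmm.mmmm, longitude is dddmm.mmmm; South/West are negative.
    lat = float(fields[3][:2]) + float(fields[3][2:]) / 60.0
    lon = float(fields[5][:3]) + float(fields[5][3:]) / 60.0
    if fields[4] == 'S':
        lat = -lat
    if fields[6] == 'W':
        lon = -lon
    return {
        'time_utc': fields[1],        # hhmmss
        'status': fields[2],          # A = OK, V = warning
        'latitude': lat,
        'longitude': lon,
        'speed_knots': float(fields[7]),
        'course_deg': float(fields[8]),
        'date': fields[9],            # ddmmyy
    }

fix = parse_gprmc('$GPRMC,140036,A,5226.5059,N,00207.6806,W,2.0,064.64,120710,001.0,E*34')
```

For the example sentence above, this yields a latitude of about 52.4418 degrees and a longitude of about -2.1280 degrees (West being negative).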
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 3
has also been incorporated into a Masters-level course in Mobile Applications and Systems, where students are taught the conceptual basis of mobility and location-based systems, and the associated development of software underpinned by design patterns. The approach to design patterns in this paper can be adopted both by scholars and by practitioners in industry. Our future work in this area is concerned with the development of schematic constructs to model the multi-dimensional aspects of mobility, those of time, space, and context, and to position these at different levels of abstraction within the system development process, such as design patterns at lower levels of abstraction and enterprise architecture-based constructs at higher levels of abstraction. In doing so, we aim to focus on a systemic and structured approach to the development of mobile applications and systems.

David Gillibrand is a Senior Lecturer in the Faculty of Computing, Engineering & Technology at Staffordshire University. His research is in the area of object-oriented technologies, design patterns, enterprise applications, mobile programming, databases, and system methods. He has had publications in object-oriented journals and delivered courses in system design to industry.

Khawar Hameed is a Principal Lecturer in the Faculty of Computing, Engineering & Technology at Staffordshire University. His research is in the area of mobile and remote working, enterprise mobility, and mobile learning. He has been a key driver in the adoption of mobile computing and technology within the Faculty's portfolio and has helped drive the development of undergraduate and postgraduate degrees in this technology area. He has contributed extensively to the development and delivery of externally funded projects and academic-industrial collaborations in mobile/wireless technology that aim to develop and enhance the collective intellectual capital that supports the growth of mobile and wireless systems as a discipline, both within academia and in industry.
References
[1] C. Alexander, A Pattern Language: Towns, Buildings, Construction, Oxford University Press, 1977.
[2] E. Gamma, R. Helm, R. Johnson & J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1994.
[3] Duri, S., Cole, A., Munson, J. & Christensen, J., 2001, An approach to providing a seamless end-user experience for location-aware applications, WMC '01: Proceedings of the 1st international workshop on Mobile commerce, ACM, pp. 20-5.
[4] D'Roza, T. & Bilchev, G., 2003, An overview of location-based services, BT Technology Journal, 21(1), pp. 20-7.
[5] Rao, B. & Minakakis, L., 2003, Evolution of mobile location-based services, Commun. ACM, 46(12), pp. 61-5.
[6] Streefkerk, J.W., van Esch-Bussemakers, M.P. & Neerincx, M.A., 2008, Field evaluation of a mobile location-based notification system for police officers, MobileHCI '08: Proceedings of the 10th international conference on Human computer interaction with mobile devices and services, ACM, pp. 101-8.
[7] Kakihara, M. & Sørensen, C., 2001, Expanding the 'mobility' concept, SIGGROUP Bull., 22(3), pp. 33-7.
[8] Trimi, S. & Sheng, H., 2008, Emerging trends in M-government, Commun. ACM, 51(5), pp. 53-8.
[9] Johnson, S., 2007, A framework for mobile context-aware applications, BT Technology Journal, 25(2), pp. 106-11.
[10] Tsalgatidou, A. & Pitoura, E., 2001, Business models and transactions in mobile electronic commerce: requirements and properties, Computer Networks, 37(2), pp. 221-36.
[11] de Reuver, M. & Haaker, T., 2009, Designing viable business models for context-aware mobile services, Telematics and Informatics, 26(3), pp. 240-8.
[12] E. Freeman, E. Freeman, K. Sierra & B. Bates, Head First Design Patterns, O'Reilly, 2004.
2 Université Laval, Québec City, Québec, Canada
3 École Polytechnique de Montréal, Montréal, Québec, Canada
Advanced Planning and Scheduling (APS) systems comprise a set of techniques for supply chain planning over short, intermediate, and long-term time periods. They employ advanced mathematical algorithms or logic to perform optimization or simulation on finite capacity scheduling, sourcing, capital planning, resource planning, forecasting, demand management, and others. APS simultaneously considers a range of constraints and business rules to provide real-time planning and …

To cope with this problem, recent advances in supply chain planning have arisen in the area of agent technology. This technology is able to capture the distributed nature of supply chain entities (e.g. customers, manufacturers, logistics operators, etc.) and mimic their business behaviours (e.g. making advanced production decisions and negotiating with other supply chain members), thus supporting their collaborative planning process. Because of these abilities, among several others described in the
literature, agent-based supply chain systems have great potential for simulating complex and realistic scenarios [7, 4, 9, 10, 11]. APS systems employing agent-based technology are referred to in this paper as distributed APS systems [12].

Distributed APS systems are normally developed through the use of modelling and simulation frameworks; usually, these frameworks provide principles, steps, methods and tools for creating a model. They help people understand the simulation problem to be modelled and translate it into a computing model normally used in simulation experiments in the supply chain planning area.

In order to create such models, these frameworks guide simulation modellers through one or several development steps [13]. The first modelling step is analysis, where one defines an abstract description of the modelled supply chain planning system containing functional and non-functional requirements. Next, during specification, the information derived from the analysis is translated into a formal model. As the analysis phase does not necessarily allow obtaining a formal model, the specification examines the analysis requirements and builds a model based on a formal approach. Afterwards, in the design phase, one creates a data-processing model that describes the specification model in more detail. In the case of an agent-based system, design models are close to how agents operate. Finally, during implementation, the design model is translated into a specific software platform or it is programmed [13].

The problem behind these modelling frameworks is that simulation systems are normally implemented as directed by pre-stated requirements, with little explicit focus on system analysis, specification, design and implementation in an integrated manner [14]. According to a recent literature review [15], to the best of our knowledge there are no integrated modelling approaches covering the whole development process in this area. Moreover, there is only one analysis modelling framework dedicated to the distributed APS domain, FAMASS (FORAC Architecture for Modelling Agent-based Simulations for Supply chain planning), which we proposed recently [21, 22, 23].

Despite its contribution to the literature, FAMASS is limited to the identification and mapping of functional requirements of distributed APS simulations, i.e. the analysis phase only. If simulation analysts desire to go further in the modelling process, they have to employ another specification and design methodology. This can be laborious, since analysts need to thoroughly master both FAMASS and another methodology.

In order to facilitate FAMASS analysts in converting their analysis models into specification and design models, this paper proposes an agent-based deployment strategy. This strategy enlarges the FAMASS scope to the other modelling phases, thus covering the entire modelling cycle. By doing so, analysts can move more smoothly and quickly through this cycle.

To do so, we were inspired by the specification and design principles of the Labarthe et al. [9] framework, a recent and largely cited development in the field of methodological agent-oriented frameworks for supply chain management simulation. Since the focus of this framework is on supply chain management as a general concept (and not specialized in APS systems), we had to perform some minor adaptations to this approach. Despite these adaptations, the main ideas of Labarthe et al. [9] are explicitly considered in the deployment strategy. The Labarthe et al. framework is adopted here because it covers the specification and design phases properly at the business and agent levels, just as FAMASS does, which facilitates the deployment process.

This deployment strategy demonstrates that the analysis phase of FAMASS can be integrated with other existing approaches specialized in specification and design modelling. Furthermore, it allows us to avoid the research effort needed to develop a totally new specification and design methodology for the domain, although that would be suitable (and even desirable) for future research initiatives.

This paper is organized as follows: a literature review in modelling and simulation for distributed APS systems is presented in Section 2. Section 3 introduces the FAMASS approach, while Section 4 summarizes the Labarthe et al. [9] framework. Next, the deployment process is explained in Section 5. Finally, Section 6 outlines some final remarks and suggests future work.

2. Modelling and Simulation Frameworks for distributed APS

The use of agent technology in Supply Chain Management is a fruitful field. From the inaugural work of Fox et al. [16] until today, a large variety of works have appeared to propose different ways of encapsulating supply chain entities and performing simulation experiments.

Two types of modelling approaches can be identified in the literature. The first type proposes generic approaches for modelling agent-based supply chain systems in general terms, while the second type proposes a modelling framework that specifically takes into consideration Advanced Planning and Scheduling (APS) tools when
planning, i.e. they incorporate optimization procedures or finite capacity planning models when performing supply chain planning. APS systems emerged in the last decade to provide a suite of planning and scheduling modules for the firm's internal supply chain, from the raw materials source to the consumers, covering decisions ranging from the strategic to the operational level [17].

In the first type of approach (general agent-based models), examples of relevant contributions include Labarthe et al. [9], Van der Zee and Van der Vorst [18], and MaMA-S [13]. One of the most cited works in the domain is Labarthe et al. [9], who propose a methodological framework for modelling customer-centric supply chains in the context of mass customization. They define a conceptual model for supply chain modelling and show how the multi-agent system can be implemented using predefined agent platforms. Van der Zee and Van der Vorst [18] propose an agent framework derived from an object-oriented approach to explicitly model control structures of supply chains. MaMA-S [13] provides a multi-agent methodology for distributed industrial systems, which is divided into five main phases and two support phases. The authors propose formal methods for the specification, design and implementation phases, but the analysis phase is not tackled by them.

The second type of modelling approach provides more sophisticated models of supply chains by incorporating Advanced Planning and Scheduling routines [12]. These approaches, sometimes called d-APS systems (for distributed APS), are composed of semi-autonomous APS tools, each dedicated to a specialized planning area, that can act together in a collaborative manner employing sophisticated interaction schemas.

Examples of this kind of work are Egri et al. [19], Lendermann et al. [20] and Swaminathan et al. [11]. Egri et al. [19] is a Gaia-based approach for modelling advanced distributed supply chain planning for mass customization. They develop a model for representing roles and interactions of agents based on the SCOR (Supply-Chain Operations Reference) model. Lendermann et al. [20] developed an approach to couple discrete-event simulation and APS for collaborative supply chain optimization, based on the HLA (High Level Architecture) technology for distributed simulation synchronization. Swaminathan et al. [11] provide a supply chain modelling framework containing a library of modular and reusable … their interaction protocols.

Despite these advances, there exists a relevant gap in this field related to the initial development step of such simulation systems, the analysis phase [12]. Most of the research works in the literature suggest approaches for specification and design, and some for implementation, but the analysis phase is not explicitly treated [12, 13, 14, 21]. Most of these works suppose that the analysis phase furnishes the necessary information and concentrate their discussions on further phases, mainly specification and design. The first work dedicated to the analysis of distributed APS systems using the agent-based paradigm is FAMASS [21]. Despite its contribution to the agent-based modelling of distributed APS systems, FAMASS does not cover the specification and design phases of the development process. This is an interesting research gap in the literature. Section 3 details the FAMASS approach for the analysis phase, while Section 4 presents a frequently cited method for the specification and design of agent-based supply chain systems from Labarthe et al. [9]. Next, Section 5 combines these two approaches in order to create a deployment strategy to translate analysis models into specification and design.

3. The FAMASS Approach

The FAMASS (FORAC Architecture for Modelling Agent-based Simulation for Supply chain planning) is the first and only modelling approach dedicated to the analysis phase of distributed APS simulations [21, 22, 23]. This approach was recently tested in Santa-Eulalia et al. [24].

It is organized into two abstraction levels. Supply chain: refers to the supply chain planning problem, i.e. the business viewpoint. Agent: the supply chain domain problem is translated into an agent-based view (Figure 1). At these two abstraction levels, four modelling approaches are proposed, namely the General Problem Analysis (GPA), the Distributed Planning Analysis (DPA), the Social Agent Organization Analysis (SAOA) and the Individual Agent Organization Analysis (IAOA), as schematized in Fig. 1.

[Figure 1: the two abstraction levels (Supply Chain, Agent) and the four FAMASS modelling approaches: General Problem Analysis (GPA), Distributed Planning Analysis (DPA), Social Agent Organization Analysis (SAOA) and Individual Agent Organization Analysis (IAOA).]
These four modelling approaches are explained in the following subsections.

3.1 General Problem Analysis (GPA)

… are proposed in the framework of Fig. 2, which is called the supply chain planning cube.

[Figure 2: the supply chain planning cube (labels include Strategic, Manufacturing, Distribution and Sales).]

However, in some situations a Supply Chain Block can be transformed into more than one agent, for example when specialization is required, in which case a planning agent can be specialized according to certain generic responsibility orientations, such as products, processors, processes or projects, to obtain faster or more precise responses for certain given situations. In other situations, apart from agents proceeding from the supply chain planning cube, different intermediary agents can be created to perform activities related to, e.g., the coordination of the agents' society. In addition, the agentification process can also include the representation of information sources, interfaces and other services.

The importance of this discussion relies on the notion that agentification is the basis for two mutually dependent aspects in agent-based systems which define the metamodel for the SAOA:

- Social structures: represent the agent system architecture [24], characterizing the blueprint of relationships and giving a high-level view of how groups solve problems and the role each agent plays within the structure. There are diverse types of social structures, such as hierarchical, federated and autonomous.

- Social protocols: are agents' abilities concerning social aspects, normally related to cooperation principles (i.e. agents have to cooperate in order to plan the entire supply chain). Diverse abilities can be considered, like communication, grouping and multiplication, coordination, collaboration by sharing tasks and resources, and conflict resolution through negotiation and arbitration.

Different social structures and protocols are provided in Santa-Eulalia [22].

Similar to the DPA, these two aspects of the SAOA serve as a metamodel to help simulation analysts identify their requirements for the simulation model. For example, different social protocols can be tested in the simulation. Then, requirements can be organized through agent-based use cases from AUML (Agent Unified Modelling Language) and requirements diagrams from SysML. An example of the SAOA is provided in Santa-Eulalia et al. [23].

3.4 Individual Agent Organization Analysis (IAOA)

At the individual level, agents can be organized according to different internal architectures, but there is little consensus in the literature on how to conceive the internal architectures of agents [30]. In order to cope with this, the metamodel for the IAOA proposes that whatever the state of mind of an agent is (cognitive, reactive or hybrid), and whatever internal architecture an agent employs, an agent can be described simply according to its abilities. This is the central point when performing the simulation. An ability can be defined as the quality of being able to perform an action, or to facilitate the accomplishment of actions. Abilities allow for the implementation of actions and the determination of the system's behaviour, as well as the determination of its related performance.

Based on this notion, the metamodel defines two elements:

- The Response Space: stands for a collection of general abilities available to the agents, including very simple reactive abilities or sophisticated cognitive ones. For example, one agent can have a simple ability to monitor the inventory levels of the supply chain, or a complex ability to perform production planning employing an optimization method.

- Capacity to Produce an Adapted Response: represents the aptitude to choose which abilities have to be transformed into actions at a given time to respond to a given situation. This capacity can vary from elementary to complex. The simplest possible capacity is related to a reactive if-then mechanism, where no cognition is necessary. For example, if the inventory level drops to a given threshold, the agent uses its procurement ability to start a procurement action. As the agent becomes more intelligent, more complex responses can be made for some given situations. For example, the linear if-then logic can be substituted by more complex approaches based on action optimization and learning.

Based on these two elements of the metamodel, one can carry out requirements determination for the simulation model, selecting the desired requirements in terms of agents' abilities. Similar to the SAOA, the IAOA's requirements are organized through agent-based use cases from AUML and requirements diagrams from SysML [23].

FAMASS is detailed in Santa-Eulalia et al. [21, 22, 23]. An application of this approach is presented in Santa-Eulalia et al. [24].
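The simplest form of the Capacity to Produce an Adapted Response, the reactive if-then mechanism coupling a monitoring ability to a procurement ability, can be sketched in a few lines. The class and method names here are purely illustrative and are not part of FAMASS:

```python
class ProcurementAgent:
    """Toy reactive agent: a monitoring ability plus a procurement ability."""

    def __init__(self, reorder_threshold: int):
        self.reorder_threshold = reorder_threshold
        self.orders_placed = []

    def perceive(self, inventory_level: int) -> str:
        # Simplest Capacity to Produce an Adapted Response: a purely
        # reactive if-then rule, with no cognition involved.
        if inventory_level <= self.reorder_threshold:
            return self.procure(self.reorder_threshold - inventory_level + 1)
        return 'idle'

    def procure(self, quantity: int) -> str:
        # Procurement ability: the chosen ability is turned into an action.
        self.orders_placed.append(quantity)
        return f'ordered {quantity}'

agent = ProcurementAgent(reorder_threshold=10)
actions = [agent.perceive(level) for level in (25, 12, 9)]  # acts only on the last reading
```

A more "intelligent" agent would replace the body of `perceive` with an optimizing or learning policy, which is exactly the elementary-to-complex spectrum described above.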
As mentioned by Ferber [29], the task of assigning roles to every individual agent is normally regarded as the last phase in constructing an organization. The logic is that as soon as one knows what the functions to be assigned are, one defines individual specializations. These local assignments influence the functioning of social protocols inside their respective social structures. In addition, they also influence the local performance of the supply chain planning entities. This is the main idea of the IAOA.

4. Labarthe et al.'s Methodological Framework

The Labarthe et al. [9] framework is schematized in Fig. 3 and is briefly described afterwards.
,"&/4%=05"&'">'?$=$4/@*'*/'$+A'BCDDEF'
!"#$%&'(")*+' proposes the Operational Agent Model (OAM).
,"&-*./0$+'
(")*++%&2'
,"&-*./0$+'12*&/'(")*+'
4.3 Operational Agent Model (OAM)
systems have to be identified in regard to the supply chain The following subsection discusses the Domain Model
cube introduced in subsection 3.2. If we did not consider generation.
entities of the Operating System at this step, the Domain
Model would be incomplete for a distributed APS, 5.1 Domain Model (DM)
according to the definition of the supply chain cube.
The objective of the Domain Model is to identify what is
!"#"$%&' to be modelled in the supply chain. As seen in Fig. 4, the
($)*&"+',#%&-./.'
0!(,1' Distributed Problem Analysis (DPA) can be translated
Supply Chain
2/.3$/*43"5'
directly into the Domain Model.
(&%##/#6',#%&-./.' !"#$%&'(")*+'
02(,1'
Table 1 and Table 2 provide a translation strategy to create
7)8/%&',6"#3'
9$6%#/:%;)#'
<#5/=/54%&'
,6"#3'
/"&0*123$+'4-*&2'
(")*+' FAMASS Structural and Dynamic Models based on
9$6%#/:%;)#'
Labarthe et al. [9].
Agent
,#%&-./.'
,#%&-./.'
07,9,1' 51*6$7"&$+'4-*&2'
0<,9,1'
(")*+'
Modelling formalism.
Labarthe et al.: Class diagrams and class tables (AUML). All flows are represented by arrows. The decoupling point is represented in the class name. Centre models are represented by arrows as well. Stock holding (raw material, work-in-process or final products) is represented in the operations of each class.
FAMASS counterpart: The NetMan [31, 32] approach plus a representation of the decoupling point position. The decoupling point position is mentioned here because it is an important issue in the Labarthe et al. [9] framework.

Modelling process.
Labarthe et al.: Apart from the physical flow identified previously, the modelling process describes the informational flow exchanged according to the dynamics of the environment. Four informational flow types for coordination are identified: i) needs expression; ii) offers expression; iii) information about coordination; and iv) information sharing by models exchanges. In addition, the decoupling point is positioned and inventories are mapped (raw material, work-in-process and final product). It identifies two models (for models exchange): the network model and the centre model.
FAMASS counterpart: The same flows are identified, as well as inventory positions and the decoupling point position. They are described in the class tables.

The Conceptual Agent Model represents the agentification process of the Labarthe et al. [9] approach. The agentification process defines the agent society based on the Domain Model, i.e. which agents are created from the centres (in our case, Supply Chain Blocks) and how they are organized. Labarthe et al. [9] propose rules for creating agents (i.e., each centre becomes an actor-agent and each centre activity becomes an activity-agent). As discussed before, FAMASS converts each Supply Chain Block into an agent. It also verifies whether some agents are extinguished (e.g. merged with another agent) or whether new agents are introduced (e.g. a mediator). This information is obtained during the Social Agent Organization Analysis (SAOA).

As indicated in Fig. 4, the Conceptual Agent Model is generated from the Domain Model and the SAOA (in this case, the social structures). Using the Labarthe et al. [9] rules, the Domain Model provides the basic class definitions and, using the SAOA, it can be verified whether new agent classes are derived from the Domain Model and whether different social structures have to be tested and considered in the Conceptual Agent Model. Social Protocols from the SAOA are not used in Conceptual Agent Modelling. The strategy for creating a Conceptual Agent Model is shown in Table 3.
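The two agentification rules discussed above can be sketched in code. This is an illustrative sketch only, not part of FAMASS or of Labarthe et al. [9]; all class and function names are hypothetical.

```python
# Sketch of the two agentification rules discussed above:
# Labarthe et al.: each centre -> actor-agent; each centre activity -> activity-agent.
# FAMASS: each Supply Chain Block -> one agent; the SAOA may then merge agents
# away or introduce new ones (e.g. a mediator).
from dataclasses import dataclass, field

@dataclass
class Centre:                       # a centre of the Domain Model
    name: str
    activities: list = field(default_factory=list)

def agentify_labarthe(centres):
    """Labarthe et al. rule: one actor-agent per centre, one activity-agent per activity."""
    agents = []
    for c in centres:
        agents.append(("actor-agent", c.name))
        agents.extend(("activity-agent", f"{c.name}/{a}") for a in c.activities)
    return agents

def agentify_famass(blocks, merged=(), mediators=()):
    """FAMASS rule: one agent per Supply Chain Block, adjusted by the SAOA results."""
    agents = [("agent", b) for b in blocks if b not in merged]
    agents.extend(("mediator-agent", m) for m in mediators)
    return agents

centres = [Centre("Sawing", ["plan", "execute"]), Centre("Drying", ["plan"])]
print(len(agentify_labarthe(centres)))   # 2 actor-agents + 3 activity-agents -> 5
print(agentify_famass(["Sawing", "Drying"], mediators=["Mediator"]))
```

The FAMASS variant produces fewer, coarser agents than the Labarthe et al. rule, which is consistent with the text's choice of one agent per Supply Chain Block.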
Table 3: Conceptual Agent Models.

Modelling formalism.
Labarthe et al.: A graphical modelling formalism [34] that models the two types of agents and their interactions. The CAM model is derived from the DM model.
FAMASS counterpart: Adapted class diagrams, tables and package diagrams. The adaptation of the class diagrams refers to the insertion of objects (products), represented by simple square boxes in the link between two classes.

Modelling process.
Labarthe et al.: 1. From centre to actor-agent: each centre creates an actor-agent. 2. Physical interactions between actor-agents: the physical flow is specified by an arrow linking agents and indicating their respective exchanged objects. 3. Informational interactions between actor-agents: similar to 2, but for the information flow. 4. Organizational frontiers definition: establishes the organization frontiers for the actor-agents and places the physical flows between the organizations. 5. Definition of the activity-agents: each activity of a centre is transformed into an activity-agent. 6. Physical interactions between activity-agents: specify the physical flow between the activity-agents and their related exchanged objects. 7. Informational interactions between activity-agents: same as 6, plus the interactions between actor-agents and activity-agents.
FAMASS counterpart: Similar process, with the following differences: in the classes, role definitions are used to indicate whether an agent is an actor-agent or an activity-agent; interactions are links between classes.

It is important to note that an actor-agent coordinates a population of activity-agents in the Labarthe et al. [9] approach. In the case of FAMASS, we decided to use the notion of actor-agent only as an aggregation of agents inside the same organization, using a package diagram.

The next sub-section transforms the Conceptual Agent Model into an Operational Agent Model.

As we believe that the agents from the decision system can also assume reactive behaviours (see subsection 3.2), we prefer not to use this agent architecture notation for the Operational Agent Model. Instead, we create two societies (decision agents and execution agents) from the Conceptual Agent Model and start to define all agent behaviours and agent protocols in detail, as done by Labarthe et al. [9], which is not contradictory to Labarthe et al.'s [9] work. As explained before, instead of separating into decision and execution societies at the Operational Agent Model, our approach does it at the beginning of the specification phase, i.e. at the Domain Model.

In sum, our Operational Agent Model is generated from the Conceptual Agent Model, the Social Agent Organization Analysis and the Internal Agent Organization Analysis, as illustrated in Fig. 5.

Fig. 5: Creating an Operational Agent Model (labels: Conceptual Agent Model (CAM), Social Agent Organization Analysis (SAOA), Individual Agent Organization Analysis (IAOA), Operational Agent Model (OAM)).

From the Conceptual Agent Model we represent two societies, the decision agents and the execution agents. This is the starting point of the Operational Agent Model. Afterwards, we obtain requirements about agent protocols from the Social Agent Organization Analysis, and requirements about agent abilities from the Internal Agent Organization Analysis.
Table 4 summarizes the deployment strategy for the Operational Agent Model.

Table 4: Operational Agent Models.

Central elements.
Labarthe et al.: Multi-agent system architecture: a cognitive and a reactive agent society are represented. A cognitive agent, together with its corresponding reactive agent, forms the agent-actor. It is a generic architecture to represent entities capable of taking their own decisions and acting accordingly. Specification of the software agent: the knowledge, behaviour and interactions of each agent are defined. For the behaviours, the following entities are defined: a) external event: concerning the communication aspect with external entities of the multi-agent system; b) internal event: concerning internal activities of an agent; c) passive state: a waiting state; d) active state: an elementary action or a composite action.
FAMASS counterpart: Multi-agent system architecture: cognitive agents are seen as decision agents (from the decision system); reactive agents are represented by execution agents (from the execution system). Specification of the software agent: same elements, i.e. knowledge, behaviour and interactions.

Modelling formalism.
Labarthe et al.: For the multi-agent system architecture, Labarthe [34] proposes his own graphical modelling formalism. For the specification of the software agent behaviours and knowledge, the Agent Behaviour Representation (ABR) formalism [37] is used for cognitive behaviours. For reactive agent behaviours, AUML formalisms are used, specifically state charts. For interactions, protocol diagrams from AUML are used.
FAMASS counterpart: We used only adapted diagrams from AUML. For representation, we employ Activity Diagrams. For interactions, we use Protocol Diagrams.

Modelling process.
Labarthe et al.: 1. Create a society of cognitive agents; incorporate the informational flow. 2. Create a society of reactive agents; incorporate the physical flow and the related exchanged objects (products). 3. Define the responsibility links between cognitive and reactive agents. 5. Specify the agent behaviour of the cognitive society using the Agent Behaviour Representation (ABR) formalism. 6. Specify the agent behaviour of the reactive society using statecharts. 7. Specify agent interactions through protocol diagrams.
FAMASS counterpart: Same process, but with different formalisms from AUML.

The next sub-section provides some final remarks and conclusions about the proposed deployment strategy.

6. Final Remarks and Future Works

This paper presents a conversion strategy from the FAMASS analysis models into specification and design models inspired by the methodological agent-based framework of Labarthe et al. [9]. This strategy facilitates the FAMASS analysts in converting their models and going faster and smoother through the whole modelling process.

In addition, this deployment strategy demonstrates that the analysis phase of FAMASS can be integrated with other existing approaches specialized in specification and design modelling. With this as an impetus, other methodological frameworks could be inspected in the future so as to verify that FAMASS concepts adhere to other frameworks.

Furthermore, the proposed strategy allows us to avoid the research effort needed to develop a totally new specification and design methodology for the domain, although that would be suitable and desirable for future research initiatives. With regard to this, a forthcoming research effort will work on extending the FAMASS analysis approach so as to cover the whole FAMASS life-cycle, from analysis to simulation. In this way the proposed deployment strategy lays the basis for a FAMASS-extended version of a complete architecture to deal with agent-based simulations in the context of distributed APS systems. Future versions of the FAMASS approach are to be published shortly.

References
[1] APICS, The Association of Operations Management, Online Dictionary, retrieved June 2008, from www.apics.org.
[2] K. Kumar, "Technology for supporting supply chain management", Communications of the ACM, Vol. 44, No. 6, 2001, pp. 58-61.
[3] M. Van Eck, "Advanced planning and scheduling: is logistics everything?", Working Paper, Vrije Universiteit Amsterdam, Amsterdam, 2003.
[4] J.-M. Frayret, S. D'Amours, A. Rousseau, S. Harvey, and J. Gaudreault, "Agent-based supply chain planning in the forest products industry", International Journal of Flexible Manufacturing Systems, Vol. 19, No. 4, 2007, pp. 358-391.
[5] A. L. Azevedo, C. Toscano, J. P. Sousa, and A. L. Soares, "An advanced agent-based order planning system for dynamic networked enterprises", Production Planning & Control, Vol. 15, No. 2, 2004, pp. 133-144.
[6] L. Cecere, "A changing technology landscape", Supply Chain Management Review, Vol. 10, No. 1, 2006.
[7] J.-H. Lee, and C.-O. Kim, "Multi-agent systems applications in manufacturing systems and supply chain management: a review paper", International Journal of Production Research, Vol. 46, No. 1, 2008, pp. 233-265.
[8] W. Shen, Q. Hao, H. L. Yoon, and D. H. Norrie, "Applications of agent-based systems in intelligent manufacturing: An updated review", Advanced Engineering Informatics, Vol. 20, No. 4, 2006, pp. 415-431.
[9] O. Labarthe, B. Espinasse, A. Ferrarini, and B. Montreuil, "Toward a methodological framework for agent-based modelling and simulation of supply chain in a mass customization context", Simulation Modelling Practice and Theory, Vol. 15, No. 2, 2007, pp. 113-136.
[10] H. Baumgaertel, and U. John, "Combining agent-based supply net simulation and constraint technology for highly efficient simulation of supply networks using APS systems", in 2003 Winter Simulation Conference, 2003.
[11] J. M. Swaminathan, S. F. Smith, and N. M. Sadeh, "Modeling supply chain dynamics: a multiagent approach", Decision Sciences, Vol. 29, No. 3, 1998, pp. 607-632.
[12] L. A. Santa-Eulalia, S. D'Amours, and J.-M. Frayret, "Essay on conceptual modeling, analysis and illustration of agent-based simulations for distributed supply chain planning", INFOR Information Systems and Operations Research Journal, Vol. 46, No. 2, 2008, pp. 97-116.
[13] S. Galland, F. Grimaud, P. Beaune, and J. P. Campagne, "MAMA-S: an introduction to a methodological approach for the simulation of distributed industrial systems", International Journal of Production Economics, No. 85, 2003, pp. 11-31.
[14] R. Govindu, and R. B. Chinnam, "A software agent-component based framework for multi-agent supply chain modelling and simulation", International Journal of Modelling and Simulation, Vol. 30, No. 2, 2010.
[15] L. A. Santa-Eulalia, G. Halladjian, S. D'Amours, and J.-M. Frayret, "Integrated methodological frameworks for modelling agent-based APS systems: a systematic literature review", Working Paper CIRRELT-2011-50, CIRRELT Interuniversity Research Centre on Enterprise Networks, Logistics and Transportation, 2011, available at www.cirrelt.ca.
[16] M. S. Fox, J. F. Chionglo, and M. Barbuceanu, "The integrated supply chain management system", Internal Report, Department of Industrial Engineering, University of Toronto, Canada, from www.eil.utoronto.ca/iscm-descr.html, 1993.
[17] H. Stadtler, and C. Kilger, Supply chain management and advanced planning: concepts, models, software and case studies, Berlin: Springer, 2004.
[18] D. J. Van der Zee, and J. G. A. J. Van der Vorst, "A modeling framework for supply chain simulation: opportunities for improved decision making", Decision Sciences, Vol. 36, No. 1, 2005, pp. 65-95.
[19] P. Egri, and J. Vancza, "Cooperative planning in the supply network: a multiagent organization model", in CEEMAS 2005, 4th International Central and Eastern European Conference on Multi-Agent Systems, Budapest, Hungary, Springer Verlag, 2005.
[20] P. Lendermann, B. P. Gan, and L. F. McGinnis, "Distributed simulation with incorporated APS procedures for high-fidelity supply chain optimization", in 2001 Winter Simulation Conference, Arlington, 2001.
[21] L. A. Santa-Eulalia, S. D'Amours, and J.-M. Frayret, "Modeling Agent-Based Simulations for Supply Chain Planning: the FAMASS Methodological Framework", in 2010 IEEE International Conference on Systems, Man, and Cybernetics, Special Session on Collaborative Manufacturing and Supply Chains, Istanbul, 10-13 October 2010.
[22] L. A. Santa-Eulalia, "Agent-based simulations for advanced supply chain planning: a methodological framework for requirements analysis and deployment", Ph.D. Thesis, Faculté des Sciences et Génie, Université Laval, Canada, 2009, 387p.
[23] L. A. Santa-Eulalia, S. D'Amours, and J.-M. Frayret, "Agent-Based Simulations for Advanced Supply Chain Planning: The FAMASS Methodological Framework for Requirements Analysis and Deployment", Working Paper CIRRELT-2011-22, CIRRELT Interuniversity Research Centre on Enterprise Networks, Logistics and Transportation, 2011, available at www.cirrelt.ca.
[24] L. A. Santa-Eulalia, D. Aït-Kadi, S. D'Amours, J.-M. Frayret, and S. Lemieux, "Agent-based experimental investigations about the robustness of tactical planning and control policies in a softwood lumber supply chain", Production Planning & Control, Special Issue on Applied Simulation, Planning and Scheduling Techniques in Industry, 1366-5871, first published on February 10th 2011 (iFirst).
[25] OMG, "OMG Systems Modeling Language (OMG SysML)", Object Management Group Specification Report, June 2010.
[26] J. F. Shapiro, Modeling the supply chain, Duxbury: Pacific Grove, 2000.
[27] H. Meyr, and H. Stadtler, "Types of supply chain", in Supply chain management and advanced planning: concepts, models, software and case studies, Berlin: Springer, 2004.
[28] W. Shen, F. Maturana, and D. H. Norrie, "MetaMorph II: an agent-based architecture for distributed intelligent design and manufacturing", Journal of Intelligent Manufacturing, No. 11, 2000, pp. 237-251.
[29] J. Ferber, Multi-agent systems: an introduction to distributed artificial intelligence, Harlow: Addison-Wesley, 1999.
[30] L. Sanya, and W. Hongwei, "Agent architecture for agent-based supply chain integration & coordination", Software Engineering Notes, Vol. 28, No. 4, 2003.
[31] J.-M. Frayret, S. D'Amours, B. Montreuil, and L. Cloutier, "A network approach to operate agile manufacturing systems", International Journal of Production Economics, Vol. 74, No. 1-3, 2001, pp. 239-259.
[32] B. Montreuil, J.-M. Frayret, and S. D'Amours, "A strategic framework for networked manufacturing", Computers in Industry, Vol. 42, No. 2-3, 2000, pp. 299-317.
[33] B. Montreuil, and P. Lefrançois, "Organizing factories as responsibility networks", Progress in Material Handling Research, 1996, pp. 375-411.
[34] O. Labarthe, "Modélisation et simulation orientées agents de chaînes logistiques dans un contexte de personnalisation de masse : modèles et cadre méthodologique", Ph.D. Thesis, Université Laval (Canada) and Université Paul Cézanne (France), 2006.
[35] O. Labarthe, B. Montreuil, A. Ferrarini, and B. Espinasse, "Modélisation multi-agents pour la simulation de chaînes logistiques de type personnalisation de masse", in 5e Conférence Francophone de MOdélisation et SIMulation, MOSIM'04, Nantes, 2004.
[36] O. Labarthe, E. Tranvouez, A. Ferrarini, B. Espinasse, and B. Montreuil, "A heterogeneous multi-agent modelling for […]
² Informatics Technology Department, University of Pedagogy, Ho Chi Minh City, Vietnam
Lastly, we use MANN's global frame (GF), consisting of Component Neural Networks (CNNs), to compose the classified results of all SNNs. The weights of the CNNs evaluate the importance of the SNNs, like reliability coefficients. Our model combines many Neural Networks and is called Multi Artificial Neural Network (MANN). In this paper, we used 2D-PCA (row-, column- and block-based) and DiaPCA (diagonal-based) for extracting facial features to be the input of the Neural Network.

Fig. 7: PCA and MANN combination (column-based 2D-PCA feature vector V1, ..., diagonal-based DiaPCA feature vector V4, classification decision).

[…] the pattern X's final classified result in the i-th class. Clearly, this method is subjective and omits information. The average combination method [4] uses the average function over the classified results of all SNNs:

P(i | X) = (1/m) Σ_{k=1..m} P_k(i | X)    (2)

This method is not subjective, but it gives equal importance to all image features. Another approach is to build reliability coefficients attached to each SNN's output [4], [5]. We can use fuzzy logic, SVM or Hidden Markov Models (HMM) [6] to build these coefficients:

P(i | X) = Σ_{k=1..m} r_k P_k(i | X)    (3)

where r_k is the reliability coefficient of the k-th Sub Neural Network. For example, the following model uses a Genetic Algorithm to create these reliability coefficients.

3. Image Feature Extraction using 2D-PCA

3.1 Two Dimensional Principal Component Analysis (2D-PCA)

Assume that the training data set consists of N face images of size m × n. X_1, X_2, ..., X_N are the matrices of the sample images. The 2D-PCA proposed by Yang et al. [2] is as follows:

Step 1. Obtain the average image X̄ of all training samples:

X̄ = (1/N) Σ_{i=1..N} X_i    (4)
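As a concrete illustration of equations (2), (3) and (4), the sketch below implements them in plain Python. The data values and function names are illustrative assumptions, not code from the paper.

```python
def average_image(samples):
    # Eq. (4): element-wise mean of the N training image matrices (Step 1 of 2D-PCA).
    n = len(samples)
    rows, cols = len(samples[0]), len(samples[0][0])
    return [[sum(s[r][c] for s in samples) / n for c in range(cols)]
            for r in range(rows)]

def average_combination(outputs):
    # Eq. (2): P(i|X) = (1/m) * sum_k P_k(i|X), equal weight for every feature.
    m = len(outputs)
    return [sum(col) / m for col in zip(*outputs)]

def weighted_combination(outputs, r):
    # Eq. (3): P(i|X) = sum_k r_k * P_k(i|X), with reliability coefficients r_k.
    return [sum(rk * pk for rk, pk in zip(r, col)) for col in zip(*outputs)]

P = [[0.2, 0.5, 0.3],   # SNN 1 output over L = 3 classes
     [0.4, 0.4, 0.2]]   # SNN 2 output
print(average_combination(P))
print(weighted_combination(P, [0.7, 0.3]))
```

With r_k learned (e.g. by a Genetic Algorithm, as the text suggests), the weighted rule reduces to the average rule when every r_k = 1/m.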
Fig. 3: Facial Expression Classification Result.

This is a small experiment to check the MANN model, and our experimental system needs to be improved. Although the classification result is not high, the improvement of the combined result shows the feasibility of MANN as a new combination method. We need to integrate another facial feature extraction system to increase the classification precision.

5. Conclusion

In this paper, we explain 2D-PCA and DiaPCA for facial feature extraction. These features are the input of our proposed model, the Multi Artificial Neural Network (MANN), with parameters (m, L). In particular, m is the number of image feature vectors and L is the number of classes. The MANN model has m Sub-Neural Networks SNN_i (i = 1..m) and a Global Frame (GF) consisting of L Component Neural Networks CNN_j (j = 1..L). Each SNN is used to process the corresponding feature vector. Each CNN is used to combine the corresponding elements of the SNNs' output vectors. The weight coefficients in CNN_j act as the reliability coefficients of the SNNs' j-th outputs. This means that the importance of every feature vector is determined after the training process; in other words, it depends on the image database and the desired classification. This MANN model applies to image classification.

To test the feasibility of the MANN model, in this research we propose a MANN with parameters (m = 4, L = 3), apply it to six basic facial expressions and test it on the JAFFE database. The experimental results show that the proposed model improves the classification result compared with the selection and average combination methods.

References
[1] S. Tong, and E. Chang, "Support vector machine active learning for image retrieval", in the Ninth ACM International Conference on Multimedia, 2001, pp. 107-118.
[2] R. Brown, and B. Pham, "Image Mining and Retrieval Using Hierarchical Support Vector Machines", in the 11th International Multimedia Modelling Conference (MMM'05), 2005, Vol. 00, pp. 446-451.
[3] M. A. Turk, and A. P. Pentland, "Face recognition using eigenfaces", in IEEE Int. Conf. on Computer Vision and Pattern Recognition, 1991, pp. 586-591.
[4] H. T. Le, "Building, Development and Application of Some Combination Models of Neural Network (NN), Fuzzy Logic (FL) and Genetics Algorithm (GA)", PhD Mathematics Thesis, University of Science, Ho Chi Minh City, Vietnam, 2004.
[5] H. B. Le, and H. T. Le, "The GA_NN_FL associated model for authenticating finger printer", in Knowledge-Based Intelligent Information & Engineering Systems, Wellington Institute of Technology, New Zealand, 2004.
[6] A. Ghoshal, P. Ircing, and S. Khudanpur, "Hidden Markov models for automatic annotation and content-based retrieval of images and video", in the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005, pp. 544-551.
[7] Y. Chen, and J. Z. Wang, "A region-based fuzzy feature matching approach to content-based image retrieval", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, pp. 1252-1267.
[8] D. Hoiem, R. Sukthankar, H. Schneiderman, and L. Huston, "Object-based image retrieval using the statistical structure of images", in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004, Vol. 2, pp. II-490-II-497.
[9] S. Y. Cho, and Z. Chi, "Genetic Evolution Processing of Data Structure for Image Classification", IEEE Transactions on Knowledge and Data Engineering, 2005, Vol. 17, No. 2, pp. 216-231.
[10] C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
[11] M. J. Lyons, J. Budynek, and S. Akamatsu, "Automatic Classification of Single Facial Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999, Vol. 21, pp. 1357-1362.
[12] J. Yang, D. Zhang, A. F. Frangi, and J.-y. Yang, "Two-dimensional PCA: a new approach to appearance-based face representation and recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, Vol. 26, pp. 131-137.
[13] D. Zhang, Z.-H. Zhou, and S. Chen, "Diagonal principal component analysis for face recognition", Pattern Recognition, 2006, Vol. 39, pp. 140-142.

Dr Le Hoang Thai received the B.S. and M.S. degrees in Computer Science from Hanoi University of Technology, Vietnam, in 1995 and 1997, and the Ph.D. degree in Computer Science from Ho Chi Minh University of Sciences, Vietnam, in 2004. Since 1999, he has been a lecturer at the Faculty of Information Technology, Ho Chi Minh University of Natural Sciences, Vietnam. His research interests include soft computing, pattern recognition, image processing, biometrics and computer vision. Dr. Le Hoang Thai is co-author of over twenty-five papers in international journals and international conferences.

Tat Quang Phat received the B.S. degree from Binh Duong University, […]
² School of Computer Engineering, University of Castilla y la Mancha, Ciudad Real, Spain
the packet to travel through the tunnel [6]. The packets that are sent through the LSP tunnel constitute a FEC.

3.2 Design Considerations

We give the design considerations for the PM2PLS architecture in this subsection.
[…] verifies whether the MN's PCoA is assigned to a FEC (i.e., whether an LSP tunnel between the LMA and the MN's MAG exists). If an entry already exists with the MN-PCoA as FEC, it does not need to set up the LSP, since an LSP tunnel already exists. If not, an RSVP Path message is generated from the LMA to the MAG to set up the LSP between the LMA and the MAG. When the LSP setup process is finished (the Path and Resv RSVP messages are received and processed) and the LMA has assigned a label to that FEC, it should have an entry in the LFIB with the FEC assigned to the tunnel between the LMA and the MAG. Periodically, the LSP capability should be evaluated in order to assure that the traffic across the LSP is being satisfied.

3.3 Architecture Components

The architecture components shown in Figure 3 are described below. Figure 4 gives the protocol stack of the PM2PLS entities, and the signaling flow between them when a handover occurs is shown in Figure 5.

- MAG/LER: an entity which has the MAG (from PMIPv6) and LER (from MPLS) functionality inside its protocol stack.
- LMA/LER: an entity which has the LMA (from PMIPv6) and LER (from MPLS) functionality inside its protocol stack.
- LSR: an MPLS router as specified in [6].
- MN: a mobile node which implements IPv6.
- CN: a mobile/fixed node which implements IPv6 or IPv4.
3.7 Example of LFIBs in PM2PLS Nodes

LFIB fragment (FEC followed by label/interface entries):
LMA-MAG1: 15, 1, -, 2
MAG1-LMA: 40, 2, 35, 1
MAG3-LMA: 60, 2, -, 1
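A label-swapping lookup against an LFIB like the fragment above can be sketched as follows. The column semantics (in-label, in-interface, out-label, out-interface) are an assumption for illustration, since the source table's headers did not survive extraction; "-" is taken to mean no outgoing label.

```python
# Toy LFIB lookup illustrating MPLS label swapping for the fragment above.
# Column meanings are assumed: (in_label, in_if, out_label, out_if);
# None stands for "-" (no outgoing label, e.g. pop and forward as IP).
LFIB = {
    "LMA-MAG1": (15, 1, None, 2),
    "MAG1-LMA": (40, 2, 35, 1),
    "MAG3-LMA": (60, 2, None, 1),
}

def forward(in_label, in_if):
    """Return (out_label, out_if) for an incoming labelled packet, or None if no match."""
    for fec, (lbl, iif, out_lbl, out_if) in LFIB.items():
        if lbl == in_label and iif == in_if:
            return (out_lbl, out_if)
    return None

print(forward(40, 2))   # label 40 arriving on interface 2 -> (35, 1): swap and forward
```

A real LSR keys its LFIB on the incoming label rather than the FEC name; the dictionary above simply mirrors the rows of the printed fragment.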
4. Performance Analysis
In this section we analyze the performance of PM2PLS on 4.1 Handover Process in 802.11
802.11 Wireless LAN (WLAN) access network based on
handover delay, attachment delay, operational overhead In order to study the handover performance of PM2PLS,
and packet loss during handover. We compared our we consider an 802.11 WLAN access to calculate the L2
proposal with single PMIPv6 and PMIPv6/MPLS in an handover delay (that is when a MN attaches to a new
encapsulated way as proposed in [8]. Access Point (AP)). During the handover at layer two, the
station cannot communicate with its current AP. The IEEE
802.11 handover procedure involves at least three entities:
the Station (MN in PM2PLS), the Old AP and the New AP.
It is executed in three phases: Scanning (Active or
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 43
[…] and it initializes the scanning phase. There are two methods in this phase: Active and Passive. In the passive method, the station only waits to hear the periodic beacons transmitted by neighbour APs on the new channel; in the active one, the station also sends a probe message on each channel in its list and receives responses from the APs in its coverage range. When the station finds a new AP, it sends an authentication message and, once authenticated, it can send the re-association message. This last phase includes the IAPP (Inter Access Point Protocol) [17] procedure to transfer context between the Old AP and the New AP.

Table 8: Parameters used in the analysis.
Parameter | Description | Value
t_x,y | Time required for a message to pass through the links from node x to node y | N/A
t_WL | Wireless link delay | 10 ms [4]
t_Scanning | Delay due to the scanning phase of 802.11 | 100 ms [16]
T_REG | Registration or binding update delay | N/A
t_PBU | Time of the Proxy Binding Update message | N/A
t_PBA | Time of the Proxy Binding Acknowledgment message | N/A
T_MD | Mobility detection delay | 0 ms
T_L3HO | L3 handover delay | N/A
T_L2HO | L2 handover delay | 115 ms [4]
PR | Sent packet ratio | 170 packets/sec [19]
T_AAA = t_AAA-Req + t_AAA-Resp + δ_AAA-Server,    (3)

and the binding update delay can be expressed as:

T_REG = t_PBU + t_PBA + δ_LMA + δ_MAG,    (4)

where

t_PBU = t_MAG,LMA + n δ_RP,    (5)

with t_MAG,LMA given by (6), and

t_PBA = t_LMA,MAG + m δ_RP,    (7)

with t_LMA,MAG given by (8). Finally,

T_REG = t_MAG,LMA + t_LMA,MAG + (n + m) δ_RP + δ_LMA + δ_MAG.    (9)

When a bidirectional LSP is not established between the MAG and the LMA, T_L3HO can be calculated as follows:

T_L3HO = T_AAA + T_REG + T_Bi-LSP-Setup + T_RA,    (10)

where T_AAA is the same as in (3), T_RA is the same as in (16), and from (9) T_REG can be expressed as:

T_REG = t_MAG,LMA + t_LMA,MAG + (n + m) δ_RP + δ_LMA + δ_MAG.    (11)

The latency introduced by the LSP setup between the LMA and the MAG and vice versa (T_Bi-LSP-Setup) in PM2PLS can be expressed as the delay of one LSP setup, since the LMA initializes the LSP setup between the LMA and the MAG after accepting the PBU and sending the PBA to the MAG (the LMA does not need to wait for anything else). When the PBA arrives at the MAG, the MAG initializes the LSP setup with the LMA. We assume that when the LSP setup between the MAG and the LMA finishes, the LSP between the LMA and the MAG is already established, since it was initialized before the MAG-to-LMA LSP:

T_Bi-LSP-Setup = t_RSVP-Resv + t_RSVP-Path,    (12)

where

t_RSVP-Resv = t_MAG,LMA + n δ_RP,    (13)
t_RSVP-Path = t_LMA,MAG + m δ_RP,    (14)

and t_MAG,LMA and t_LMA,MAG are as in (6) and (8), respectively. Finally, T_Bi-LSP-Setup can be expressed as:

T_Bi-LSP-Setup = t_MAG,LMA + t_LMA,MAG + (n + m) δ_RP.    (15)

The delay introduced by the router advertisement message can be expressed as:

T_RA = t_AP-MAG + t_WL.    (16)

The L2 handover delay in an 802.11 WLAN access network can be expressed as:

T_L2HO = t_Scanning + t_Authentication + t_Association.    (17)

T_L3HO in PMIPv6 is as in (2), with T_AAA as in (3), T_REG as in (11) and T_RA as in (16). As mentioned above, during a PMIPv6 handover neither Movement Detection (MD) nor Address Configuration (including DAD) is executed.

4.3 Packet Loss During Handover

Packet Loss (PL) is defined as the sum of the packets lost per MN during a handover. With (20) we can calculate the PL in a handover for a given MN:

PL = T × PR.    (20)

4.4 Operational Overhead

The operational overhead of PM2PLS is 4 bytes per packet (the MPLS header size). PM2PLS significantly reduces the operational overhead with respect to PMIPv6, which has an operational overhead of 40 bytes when it uses IPv4 or IPv6 in IPv6 encapsulation (over an IPv6 Transport Network), 20 bytes when it uses IPv4 or IPv6 in IPv4 encapsulation (over an IPv4 Transport Network), 44 bytes when it uses a GRE tunnel over an IPv6 TN, or 24 bytes when it uses a GRE tunnel over an IPv4 TN. A comparison of the operational overhead of the above schemes is summarized in Table 9.

Table 9: Operational Overhead.
Scheme and Tunnelling Mechanism | Overhead per Packet (bytes) | Description
PMIPv6 with IPv6-in-IPv6 tunnel | 40 | IPv6 header
PMIPv6 with IPv4-in-IPv6 tunnel | 40 | IPv6 header
PMIPv6 with IPv6-in-IPv4 tunnel | 20 | IPv4 header
PMIPv6 with IPv4-in-IPv4 tunnel | 20 | IPv4 header
PMIPv6 with GRE encapsulation (over TN IPv6) | 44 | IPv6 header + GRE header
PMIPv6 with GRE encapsulation (over TN IPv4) | 24 | IPv4 header + GRE header
PMIPv6/MPLS with VP Label (over TN IPv4 or IPv6) | 8 | 2 MPLS headers
PM2PLS (over TN IPv4 or IPv6) | 4 | MPLS header

4.5 Simulation Results

We compared PM2PLS, PMIPv6 [5] and PMIPv6/MPLS as proposed in [8]. We use typical values for the parameters involved in the above equations, as shown in Table 8. Figure 6 shows the impact of the number of hops between the MAG and the LMA on the handover delay. It can be observed that the handover delay increases with the number of hops. PMIPv6/MPLS is the scheme most affected by the number of hops because it integrates the LSP setup in an encapsulated way and does not optimize this process. PMIPv6 and PM2PLS with a bidirectional LSP established between the new MAG
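Under the typical values of Table 8, the delay expressions above can be evaluated numerically. The sketch below is illustrative only: the processing delays (the δ values) and the per-link delay are assumptions, since they are listed as N/A in the source, and the function names are hypothetical.

```python
# Illustrative evaluation of eqs. (9), (15), (16), (10) and (20).
# Assumed values (not from the paper): per-link delay 2 ms, router processing
# delay d_rp = 0.5 ms, node processing delays 1 ms, T_AAA = 10 ms.
def t_path(hops, t_link=2.0):
    """t_x,y: time for a message to cross the links between two nodes (ms)."""
    return hops * t_link

def t_l3ho(n, m, d_rp=0.5, d_lma=1.0, d_mag=1.0,
           t_aaa=10.0, t_ap_mag=1.0, t_wl=10.0):
    t_mag_lma = t_path(n)                    # assumed form of eq. (6)
    t_lma_mag = t_path(m)                    # assumed form of eq. (8)
    t_reg = t_mag_lma + t_lma_mag + (n + m) * d_rp + d_lma + d_mag   # eq. (9)
    t_bi_lsp = t_mag_lma + t_lma_mag + (n + m) * d_rp                # eq. (15)
    t_ra = t_ap_mag + t_wl                                           # eq. (16)
    return t_aaa + t_reg + t_bi_lsp + t_ra                           # eq. (10)

pr = 170 / 1000.0                 # Table 8: 170 packets/sec, converted to packets/ms
delay = t_l3ho(n=3, m=3)
print(delay, delay * pr)          # eq. (20): PL = T * PR
```

The delay grows linearly with n and m, matching the trend reported for Figure 6, and the packet loss scales with the delay, matching eq. (20).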
and the LMA show comparable performance, with a slightly better response of PM2PLS as the number of hops increases, because binding update messages (i.e., PBU and PBA) are sent through the bidirectional LSP established between the MAG and the LMA instead of using IP forwarding. Figure 7 shows the total packet loss during handover for the above schemes. Since packet loss during handover is proportional to the handover latency, PM2PLS also has the lowest packet loss ratio among the compared schemes. For the packet loss simulation we consider a VoIP flow [19].

Fig. 7 802.11 handover process

Fig. 8 Packet loss of PMIPv6, PMIPv6/MPLS, and PM2PLS during a handover (curves: PMIPv6; PMIPv6/MPLS with and without a bidirectional LSP established; PM2PLS with and without a bidirectional LSP established).

Conclusions

We proposed an integration of MPLS and PMIPv6, called PM2PLS, which optimizes the bidirectional LSP setup by integrating binding updates and the bidirectional LSP setup in an optimized sequential way; we also used the LSP established between the MAG and the LMA for sending PBU and PBA messages when it exists. We compared the performance of PM2PLS with plain PMIPv6 and with PMIPv6/MPLS as specified in [8]. We demonstrated that PM2PLS has a lower handover delay than PMIPv6/MPLS, and slightly lower than that of PMIPv6. The operational overhead of the MPLS-based schemes is lower than that of plain PMIPv6 schemes, since they use LSPs instead of IP tunnelling. With MPLS integrated in a PMIPv6 domain, the access network can use the intrinsic Quality of Service and Traffic Engineering capabilities of MPLS. It also allows the future use of DiffServ and/or IntServ in a PMIPv6/MPLS domain.

Acknowledgments

References

[1] Johnson D., Perkins C., and Arkko J., "Mobility Support in IPv6," IETF RFC 3775 (Proposed Standard), June 2004.
[2] Soliman H., Castellucia C., ElMalki K., and Bellier L., "Hierarchical Mobile IPv6 (HMIPv6) Mobility Management," IETF RFC 5380 (Proposed Standard), October 2008.
[3] Koodli R., "Mobile IPv6 Fast Handovers," IETF RFC 5568 (Proposed Standard), July 2009.
[4] Kong K.-S., Lee W., Han Y.-H., Shin M.-K., and You H., "Mobility Management for All-IP Mobile Networks: Mobile IPv6 vs. Proxy Mobile IPv6," IEEE Wireless Communications, pp. 36-45, April 2008.
[5] Gundavelli S., Leung K., Devarapalli V., and Chowdhury K., "Proxy Mobile IPv6," IETF RFC 5213 (Proposed Standard), August 2008.
[6] Rosen E., Viswanathan A., and Callon R., "Multiprotocol Label Switching Architecture," IETF RFC 3031 (Proposed Standard), January 2001.
[7] Xia F. and Sarikaya B., "MPLS Tunnel Support for Proxy Mobile IPv6," IETF Draft, October 25, 2008.
[8] Garroppo R., Giordano S., and Tavanti L., "Network-based micro-mobility in wireless mesh networks: is MPLS convenient?," in Proceedings of the Global Communications Conference (GLOBECOM), December 2009.
[9] Liang J. Z., Zhang X., and Li Q., "A Mobility Management Based on Proxy MIPv6 and MPLS in Aeronautical Telecommunications Network," in Proceedings of the 2009 First International Conference on Information Science and Engineering (ICISE), 2009, pp. 2452-2455.
[10] Carmona-Murillo J., González-Sánchez J. L., and Cortés-Polo D., "Mobility management in MPLS-based access networks. An analytical study," in Proceedings of the IX Workshop in MPLS/GMPLS Networks, July 2009.
[11] Vassiliou V., "Design Considerations for Introducing Micromobility in MPLS," in Proceedings of the 11th IEEE Symposium on Computers and Communications (ISCC'06), June 2006.
[12] Awduche D., et al., "RSVP-TE: Extensions to RSVP for LSP Tunnels," IETF RFC 3209 (Proposed Standard), December 2001.
[13] Muhanna A., Khalil M., Gundavelli S., and Leung K., "Generic Routing Encapsulation (GRE) Key Option for Proxy Mobile IPv6," IETF RFC 5845 (Proposed Standard), June 2010.
[14] Wakikawa R. and Gundavelli S., "IPv4 Support for Proxy Mobile IPv6," IETF Draft, May 2008.
[15] Andersson L. and Asati R., "Multiprotocol Label Switching (MPLS) Label Stack Entry: "EXP" Field Renamed to "Traffic Class" Field," IETF RFC 5462 (Proposed Standard), February 2009.
[16] Mishra A., Shin M., and Arbaugh W., "An Empirical Analysis of the IEEE 802.11 MAC Layer Handoff Process," ACM SIGCOMM Computer Communication Review, vol. 33, no. 2, pp. 93-102, April 2003.
[17] IEEE Trial-Use Recommended Practice for Multi-Vendor Access Point Interoperability Via an Inter-Access Point Protocol Across Distribution Systems Supporting IEEE 802.11 Operation, IEEE Std 802.11F-2003.
[18] Diab A., Mitschele-Thiel A., Getov K., and Blume O., "Analysis of Proxy MIPv6 Performance compared to Fast MIPv6," in Proceedings of the 33rd IEEE Conference on Local Computer Networks (LCN 2008), pp. 579-580, 2008.
[19] Hasib M., "Analysis of Packet Loss Probing in Packet Networks," Doctoral Thesis, Queen Mary, University of London, June 2006.

Jesús H. Ortiz received his BSc. in Mathematics from the Santiago de Cali University, Colombia, his BSc. in Electrical Engineering from the University of Valle, Colombia, and his PhD in Computer Engineering from the University of Castilla-La Mancha, Spain, in 1988, 1992 and 1998, respectively. He is currently an assistant professor at the University of Castilla-La Mancha, Spain, in the area of computer and mobile networks. He is a reviewer and/or editor of several journals, such as IAJIT, IJRRCS, IJCNIS, JSAT, and Elsevier journals.
2 Information Science and Control Engineering, Nagaoka University of Technology, Nagaoka, Niigata 940-2188, Japan
3 Foreign Language Education Center, University of Miskolc, Miskolc, Egyetemváros H-3515, Hungary
Abstract
Language identification of written text in the domain of Latin-script based languages is a well-studied research field. However, new challenges arise when it is applied to non-Latin-script based languages, especially for Asian languages' web pages. The objective of this paper is to propose and evaluate the effectiveness of adapting the Universal Declaration of Human Rights and Biblical texts as a training corpus, together with two new heuristics, to improve an n-gram based language identification algorithm for Asian languages. Extension of the training corpus produced improved accuracy. Improvement was also achieved by using a byte-sequence based HTML parser and an HTML character entities converter. The performance of the algorithm was evaluated on a written text corpus of 1,660 web pages, spanning 182 languages from Asia, Africa, the Americas, Europe and Oceania. Experimental results showed that the algorithm achieved a language identification accuracy of 94.04%.
Keywords: Asian Language, Byte-Sequences, HTML Character Entities, N-gram, Non-Latin-Script, Language Identification.

1. Introduction

With the explosion of multilingual data on the Internet, the need and demand for an effective automated language identifier for web pages has further increased. Wikipedia, a rapidly growing multilingual Web-based encyclopedia on the Internet, can serve as a measure of the multilingualism of the Internet. We can see that the number of web pages and languages (both Latin-script and non-Latin-script based) has increased tremendously in recent years, as shown in Figure 1.

Figure 1 Article count and number of languages (Latin-script and non-Latin-script based) on Wikipedia's language projects, 2001 to 2008.

1.1 Unreliable HTML and XML Language Attribute

The Hyper Text Markup Language (HTML) is the standard encoding scheme used to create and format a web page. In the latest HTML 4.01 specification, there is a lang attribute that is defined to specify the base language of text in a web page. Similarly, the Extensible Markup Language (XML) 1.0 specification includes a special attribute named xml:lang that may be inserted into documents to specify the language used in the contents. However, the reality remains that many web pages do not make use of this attribute or, even worse, use it incorrectly and provide misleading information.

Using the validation corpus in this study as a sample, we found that only 698 web pages out of 1,660 contain the lang attribute, as shown in Table 1. When the lang attribute is available, it does not always indicate the correct language of a web page. Table 1 shows that 72.49% of web pages with a lang attribute produced a correct language indication. Overall, only 30.48% of web pages in our sample
produced a correct language identification result from the lang attribute. Therefore, we are left with deducing information from the text to determine the language of a given web page. This is the domain of language identification.

Table 1 Number of web pages with the lang attribute, and percentage of correct language identification using the lang attribute as indicator, based on the validation corpus of this study.

 | Correct Pages | Total Pages | Percent Correct
Web pages with lang attribute | 506 | 698 | 72.49%
Web pages without lang attribute | 0 | 962 | 0.00%
Total | 506 | 1660 | 30.48%

1.2 Language Identification

Language identification is the fundamental requirement prior to any language-based processing. For example, in a fully automatic machine translation system, language identification is needed to detect the source language correctly before the source text can be translated into another language. Many studies of language identification on written text exist, for example [Gold 1967], [William B. Cavnar 1994], [Dunning 1994], [Clive Souter 1994], [Michael John Martino 2001], [Izumi Suzuki 2002], [LVECK 2005] and [Bruno Martins 2005], just to name a few.

A comparative study on language identification methods for written text was reported in [Lena Grothe 2008]. Their paper compares three different approaches to generating language models and five different methods for language classification.

The first approach generates a language model based on "short words". It uses only words up to a specific length to construct the language model. The idea behind this approach is that language-specific common words mostly have only marginal length. [Grefenstette 1995] tokenized and extracted all words with a length of up to five characters that occurred at least three times in one million characters of text, for ten European languages. [Prager 1999] used still shorter words, of four or fewer characters, for thirteen Western European languages.

The second approach generates a language model based on "frequent words". It uses a specified number of the most frequent words occurring in a text to construct the language model. For instance, the most frequent one hundred words were used in [Clive Souter 1994] and [Michael John Martino 2001], while [Eugene Ludovik 1999] used the most frequent one thousand words.

The third approach generates a language model based on "n-grams". An n-gram is a subsequence of N items from a given sequence. [William B. Cavnar 1994], [Grefenstette 1995] and [Prager 1999] used a character-sequence based n-gram method, while [Dunning 1994] used a byte-sequence based n-gram method.

The generated language model is used as the input for a language classification method. Many language classification methods have been proposed before; these include Ad-Hoc Ranking [William B. Cavnar 1994], Markov Chains in combination with Bayesian Decision Rules [Dunning 1994], Relative Entropy [Penelope Sibun 1996], the Vector Space Model [Prager 1999] and Monte Carlo sampling [Poutsma 2001].

Table 2 shows the information of five selected studies. Previous studies reported excellent results on a few selected Latin-script based languages. Japanese and Russian are the only two exceptions here. The Japanese language, written with the Japanese logographs and syllabaries, and Russian, written in the Cyrillic script, can be easily distinguished from the Latin-script based languages, and also from each other. However, the performance of language identification on non-Latin-script based languages remains unknown.

Most studies in Table 2 focus on plain text content. Only two previous studies evaluated their language identification algorithms against web pages. Although the proposed heuristics work well on Latin-script based web pages, they might not be able to handle non-Latin-script based web pages effectively. Usually, non-Latin scripts have different bit settings, and many non-Latin scripts in Asia are encoded in legacy fonts. Besides, none of the studies mentioned HTML entities, which are indeed commonly used in non-Latin-script based web pages.

As previous studies focus on Latin-script based languages, most of them adopted a training corpus with a limited number of Latin-script based languages only. Thus, our research aims to improve language identification on a broader range of languages, especially non-Latin-script based ones, and adds support for web page content. The initial target is set at the 185 languages given in ISO 639-1.

1.3 Hyper Text Markup Language and HTML Parser

[Penelope Sibun 1996] states that language identification is a straightforward task. We argue that their claim is only true for language identification on Latin-script based plain text documents. Web pages are different from plain text documents, since they contain the HTML tags that are used to publish the document on the Web. In order to correctly identify the language of a web page, an HTML parser is
Table 2 Five selected language identification studies on written text, with information on language coverage, training corpus, validation corpus and identification accuracy.

Research | Language Coverage | Training Corpus | Validation Corpus | Percent Correct
[William B. Cavnar 1994] | English, Portuguese, French, German, Italian, Spanish, Dutch, Polish | Unspecified | 3713 text samples from soc.culture newsgroups | 99.8%
[Dunning 1994] | Dutch, Polish | A set of text samples from the Consortium for Lexical Research | Another set of text samples from the Consortium for Lexical Research | 99.9%
[Clive Souter 1994] | Dutch/Friesian, English, French, Gaelic, German, Italian, Portuguese, Serbo-Croat, Spanish | A set of text samples from the Oxford Text Archive, each of 100 kilobytes | Another set of text samples from the Oxford Text Archive | 94.0%
[Poutsma 2001] | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Spanish, Swedish | 90% of text samples from the European Corpus Initiative Multilingual Corpus | 10% of text samples from the European Corpus Initiative Multilingual Corpus | Results in chart format
[Bruno Martins 2005] | Danish, Dutch, English, Finnish, French, German, Italian, Japanese, Portuguese, Russian, Spanish, Swedish | Text samples of 23 languages collected from newsgroups and the Web | Web pages of 12 languages collected from newsgroups and the Web | 91.25%
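The character n-gram approach used by several of the studies in Table 2 can be illustrated with a short sketch in the style of the rank-order ("out-of-place") classifier of [William B. Cavnar 1994]. The function names and the tiny training snippets below are invented for this illustration; real systems train on large corpora.

```python
from collections import Counter

def ngram_profile(text, n_max=3, top=300):
    """Rank the most frequent character n-grams (orders 1..n_max)."""
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    ranked = [g for g, _ in counts.most_common(top)]
    return {g: rank for rank, g in enumerate(ranked)}

def out_of_place(doc_profile, lang_profile):
    """Cavnar-style distance: sum of rank differences, with a fixed
    penalty for n-grams missing from the language profile."""
    penalty = len(lang_profile)
    return sum(abs(rank - lang_profile.get(g, penalty))
               for g, rank in doc_profile.items())

# Tiny invented training snippets; real systems use far larger text.
training = {
    "english": "the quick brown fox jumps over the lazy dog and then the",
    "spanish": "el rapido zorro marron salta sobre el perro perezoso y luego",
}
models = {lang: ngram_profile(t) for lang, t in training.items()}

doc = ngram_profile("the dog jumps over the fox")
best = min(models, key=lambda lang: out_of_place(doc, models[lang]))
print(best)  # prints "english"
```

The classifier picks the language whose ranked n-gram profile is closest to the document's profile; this is the character-sequence variant, whereas [Dunning 1994] operated on byte sequences instead.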
needed in order to remove the HTML tags and to extract the text content for language identification.

An HTML parser usually processes text based on character sequences. The HTML parser reads the content of a web page into character sequences, and then marks the blocks of HTML tags and the blocks of text content. At this stage, the HTML parser uses a character encoding scheme to encode the text. An HTML parser usually depends on a few methods (described in the subsection Character and Byte-sequence based HTML Parser) to determine the correct character encoding scheme to be used. If no valid character encoding is detected, the parser will apply a predefined default encoding.

Today, a common approach is to use UTF-8 (a variable-length character encoding for Unicode) as the default encoding, as the first 128 characters of Unicode map directly to their ASCII correspondents. However, using UTF-8 encoding on non-Latin-script based web pages might cause the application to apply a wrong character encoding scheme and thus return an encoded text that is different from its web origin.

Using the validation corpus of this study as an example, we found that 191 web pages came with doubtful character encoding information. Table 3 shows an example of text rendered with a wrong character encoding. The authors only show one example, as the reason for the wrong character encoding is identical in all cases.

Table 3 Text rendered and language identification results on a selected web page with misleading charset information.

Web Page | Detected Charset (parser) | Text Rendered (parser) | Identified As (parser) | Text Rendered (web origin) | Identified As (web origin)
chinese-05-newscn.htm | No match, default UTF-8 used | ???? | English, Latin, Latin1 | (Chinese text) | Chinese, Simplified Chinese, GB2312

1.4 Unicode and HTML Character Entities

Unicode is a computing industry standard that allows computers to represent and manipulate text expressed in most of the world's writing systems. The Unicode Consortium has the ambitious goal of eventually replacing existing character encoding schemes with Unicode, as many of the existing schemes are limited in size and scope. Unicode characters can be directly input into a web page if the user's system supports them. If not, HTML character entities provide an alternate way of entering Unicode characters into a web page.

There are two types of HTML character entities. The first type is called character entity references, which take the form &EntityName;. An example is &copy; for the copyright symbol (©). The second type is referred to as numeric character references, which take the form &#N;, where N is either a decimal number (base 10) or a hexadecimal number for the Unicode code point. When N represents a hexadecimal number, it must be prefixed by x. An
example of these entities is &#21644; (base 10) or &#x548C; (base 16) for the Chinese, and also Japanese, character "和".

Using HTML character entities, any system is able to input Unicode characters into a web page. However, this causes a problem for language identification, as the language property is now represented by label and numeric references. In order to identify the language of an HTML character-entity-encoded web page, we propose an HTML character entity converter to translate such entities into the byte sequences of their corresponding Unicode code points.

1.5 Organization of this paper

The remainder of this paper is ordered in the following structure. The authors review related work in the next section. In the Methodology section, the authors describe the language identification process and the new heuristics. In the Data and Experiments section, the authors explain the nature and preparation of the training and validation corpora, followed by a description of how the experiments are set up and their purposes. In the Results and Discussion section, the authors present the results from the experiments. In the last section, the authors draw conclusions and propose a few areas for future work.

2. Related Work

2.1 Martins Algorithm

In [Bruno Martins 2005], the authors discussed the problem of automatically identifying the language of a given web page. They claimed that web pages generally contain more spelling errors, multilingual content and short texts; therefore, language identification on web pages is harder. They adapted the well-known n-gram based algorithm from [William B. Cavnar 1994] and complemented it with a more efficient similarity measure [Lin 1998] and heuristics to better handle web pages. The heuristics included the following six steps:

i. Extract the text, the markup information, and meta-data.
ii. Use meta-data information, if available and valid.
iii. Filter common or automatically generated strings. For example, "This page uses frames".
iv. Weight n-grams according to HTML markup. For example, n-grams in the title section have more weight than n-grams in the meta-data section.
v. Handle situations when there is insufficient data. When a web page has less than 40 characters, the system reports "unknown language".
vi. Handle multilingualism and the "hard to decide" cases. When a document cannot be clearly classified into one language, the system will re-apply the algorithm, weighting the largest text block as three times more important than the rest.

In the experiment, they constructed 23 different language models from textual information extracted from newsgroups and the Web. They tested the algorithm using testing data in 12 different languages, namely Danish, Dutch, English, Finnish, French, German, Italian, Japanese, Portuguese, Russian, Spanish and Swedish. The total number of documents for testing is 6,000, with 500 documents for each language. The testing data were crawled from on-line newspapers and Web portals. Overall, the best identification result returned an accuracy of 91.25%, which was lower than other research on text documents. The authors believe that this is due to the much noisier nature of the text in web pages.

2.2 Suzuki Algorithm

The method in [Izumi Suzuki 2002] differs from conventional n-gram based methods in that its threshold for any category is uniquely predetermined. For every identification task on a target text, the method must respond with either a correct answer or "unable to detect". The authors used two predetermined values to decide which answer should be given for a language identification task. The two predetermined values are UB (closer to the value 1) and LB (not close to the value 1), with standard values of 0.95 and 0.92, respectively. The basic unit used in this algorithm is the trigram. However, the authors refer to it as a 3-byte shift-codon.

In order to detect the correct language of a target text, the algorithm generates a list of shift-codons from the target text. The target's shift-codons are then compared to the list of shift-codons in the training texts. If one of the matching rates is greater than UB, while the rest are less than LB, the algorithm reports that a correct answer has been found. The language of the training text with a matching rate greater than UB is assumed to be the language of the target text. By this method, the algorithm correctly identified all test data of the English, German, Portuguese and Romanian languages. However, it failed to correctly identify the Spanish test data.

3. Methodology
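Before turning to the methodology, the shift-codon matching and the UB/LB decision rule described above can be sketched in a few lines. The UB and LB values are the standard values quoted from [Izumi Suzuki 2002]; the function names and the toy byte corpora are invented for the illustration (the original method works on full training texts, not snippets).

```python
UB, LB = 0.95, 0.92  # standard values reported in [Izumi Suzuki 2002]

def shift_codons(data: bytes) -> set:
    """All overlapping 3-byte sequences ('shift-codons') in the data."""
    return {data[i:i + 3] for i in range(len(data) - 2)}

def matching_rate(target: bytes, training: bytes) -> float:
    """Fraction of the target's shift-codons found in the training text."""
    codons = shift_codons(target)
    return len(codons & shift_codons(training)) / len(codons) if codons else 0.0

def identify(target: bytes, corpus: dict) -> str:
    """Answer only when exactly one model exceeds UB and all others
    stay below LB; otherwise report 'unable to detect'."""
    rates = {lang: matching_rate(target, text) for lang, text in corpus.items()}
    above = [lang for lang, rate in rates.items() if rate > UB]
    rest_low = all(rate < LB for lang, rate in rates.items() if lang not in above)
    return above[0] if len(above) == 1 and rest_low else "unable to detect"

# Toy corpora (invented): lang_a shares the target's codons, lang_b does not.
corpus = {
    "lang_a": b"the cat sat on the mat and then the cat ran away",
    "lang_b": b"zxqv wkjh pqrs tuvw xyzb cdfg hjkl mnpq rstv wxyz",
}
print(identify(b"the cat sat on the mat", corpus))  # prints "lang_a"
```

When no model clears UB, or when several do, the sketch returns "unable to detect", mirroring the predetermined-threshold behaviour the section describes.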
The general paradigm of language identification can be divided into two stages. First, a set of language models is generated from a training corpus during the training phase. Second, the system constructs a language model from the target document and compares it to all trained language models, in order to identify the language of the target document during the identification phase. The algorithm used in this study adopted this general paradigm; however, it contains two new heuristics to properly handle web pages. The first heuristic is to remove HTML tags in the byte-sequence stream. The second heuristic is to translate HTML character entities into byte sequences of their Unicode code points. The algorithm only takes text and HTML documents as valid input. The overall system flow of the language identification process is shown in Figure 2.

An n-gram is a subsequence of N items from a longer sequence. An n-gram of order 1 (i.e., N=1) is referred to as a monogram, an n-gram of order 2 as a bigram, and an n-gram of order 3 as a trigram. Any other order is generally referred to as "N-gram". This paper adapted the n-gram based algorithm proposed by [Izumi Suzuki 2002]. The algorithm generates the language model from a text document as trigrams of byte sequences. For example, the trigrams for the Japanese word "こんにちは" (82 B1 82 F1 82 C9 82 BF 82 CD in the Shift-JIS character encoding scheme) are highlighted as follows:

[82 B1 82] F1 82 C9 82 BF 82 CD
82 [B1 82 F1] 82 C9 82 BF 82 CD
82 B1 [82 F1 82] C9 82 BF 82 CD
82 B1 82 [F1 82 C9] 82 BF 82 CD
82 B1 82 F1 [82 C9 82] BF 82 CD
82 B1 82 F1 82 [C9 82 BF] 82 CD
82 B1 82 F1 82 C9 [82 BF 82] CD
82 B1 82 F1 82 C9 82 [BF 82 CD]

The language classification method is based on trigram frequency. The trigram distribution vector of a training document has no frequency information; only the target document has a frequency-weighted vector. In order to detect the correct language of a target document, the algorithm generates a list of byte-sequence based trigrams from the target document, together with the frequency information of each trigram. The target document's trigrams are then compared to the list of byte-sequence based trigrams in every training language model. If a target's trigram matches a trigram in the training language model, its frequency value is added to the matching counter. After all trigrams from the target document have been compared to the trigrams in the training language model, the matching rate is calculated by dividing the final matching counter by the total number of the target's trigrams.

3.2 Character and Byte-sequence based HTML Parser

In order to correctly process a web page, an HTML parser must ascertain what character encoding scheme is used to encode the content. This section describes how to detect
the character encoding in the Hypertext Transfer Protocol (HTTP) header, in XML, or in HTML.

When a web page is transmitted via HTTP, the Web server will send the character encoding in the content-type field of the HTTP header, such as content-type: text/html; charset=UTF-8. The character encoding can also be declared within the web page itself. For XML, the declaration is at the beginning of the markup, for instance <?xml version="1.0" encoding="utf-8"?>; for HTML, the declaration is within the <meta> element, such as <meta http-equiv="content-type" content="text/html; charset=UTF-8">. If no valid character encoding information is detected, a predefined character encoding scheme will be invoked. The default character encoding scheme varies depending on the localization of the application. In the case of conflict between multiple encoding declarations, precedence rules apply to determine which declaration shall be used. The precedence is as follows, with the HTTP content-type having the highest priority:

i. HTTP content-type
ii. XML declaration
iii. HTML meta charset element

Since information in the HTTP header overrides information in the web page, it is therefore important to ensure that the character encoding sent by the Web server is correct. However, in order to serve a file or files using a different encoding than the Web server's default encoding, most Web servers allow the user to override the default encoding defined in the HTTP content-type. Table 4 illustrates all possible scenarios of character encoding scheme determination.

Table 4 shows that misleading and missing character encoding information would probably lead to a wrong result. Therefore, it is quite possible that a character-sequence based HTML parser might apply an incorrect character encoding scheme to web pages without valid character encoding information, especially on non-Latin-script web pages.

The HTML parser implemented in this paper is unique in that it processes the content of a web page based on byte sequences, thus avoiding the above-mentioned problem. By using byte sequences, it eliminates the need to detect and apply a character encoding scheme on the content extracted from the web page. The HTML parser parses the web page in a linear fashion. It searches for HTML tags from the beginning to the end of the page. It looks for valid HTML start and end tags and marks all blocks of HTML tags. The parser removes all detected HTML blocks and returns the remaining content in byte sequences for language identification. The parser searches in sequences of bytes instead of characters. For example, in order to determine the locations of the <body> and </body> tags in a web page, the parser searches for 3C 62 6F 64 79 3E and 3C 2F 62 6F 64 79 3E, respectively. The parser keeps a list of byte-sequence based HTML tags and uses them to remove HTML tag blocks from the target web page.

3.3 HTML Character Entity Converter

The HTML character entity converter is designed to translate HTML entities into the corresponding byte sequences of their Unicode code points. The converter is able to handle both character entity references and numeric character references. There are 252 character entity references defined in HTML version 4, which act as mnemonic aliases for certain characters. Our converter maintains a mapping table between the 252 character entity references and the byte sequences they represent, in hexadecimal. When a character entity reference is detected by the converter, it replaces the entity with its associated byte sequences.

For numeric character references, the converter performs a real-time decoding process. The converter will convert the character reference from a decimal (base 10)
number to byte sequences if it detects the following pattern: Observatory Project (LOP). UDHR was selected as it is
character ampersand (&), followed by character number the most translated document in the world, according to
sign (#), followed by one or more decimal digits (zero the Guinness Book of Records.
through nine), and lastly followed by character semicolon
(;). For example, A (representing the Latin capital The OHCHR web site contained 394 translations in
letter A). various languages. However, 80 of them are in Portable
Document Format (PDF). As a result, only 314 languages
Similarly, the converter will convert the character were collected from OHCHR. The LOP contributed 18
reference from hexadecimal (base 16) number to byte new languages. The total size of the first set of training
sequences if it detects the following pattern: character data is 15,241,782 bytes. Individual file size ranged from
ampersand (&), followed by character number sign (#), 4,012 to 55,059 bytes. From here onward this set of
followed by character (x), followed by one or more training data will be referred to as training corpus A.
hexadecimal digits (which are zero through nine, Latin
capital letter A through F, and Latin small letter a through The second set of training data, training corpus B,
f), and lastly followed by character semicolon (;). For increases the number of languages by 33. It contains 65
example, A (again representing the Latin capital (some are same language but in different encoding
letter A). schemes) Biblical texts collected from the United Bible
Societies (UBS). All files have similar content, but written
Table 5 shows the byte sequences output by the HTML in different languages, scripts and encodings. The total
character entities converter, using an ampersand sign (&), size of the second set of training data is 1,232,322 bytes.
a Greek small letter beta () and a Chinese character "" Individual file size ranged from 613 to 54,896 bytes.
as examples. These examples are carefully selected to
show the different ways of conversion based on different Most languages have more than one training file in the
number of byte order in UTF-8. training corpora. This is because the same language can be
written in different scripts and encodings. For example, the
Chinese language has five training files in training corpus
4. Data and Experiments A. The five training files by language_script_encoding are:
Chinese_Simplified_EUC-CN, Chinese_Simplified_HZ,
There are two sets of data used in this study. The first set Chinese_Simplified_UTF8, Chinese_Traditional_BIG5
is the training corpus, which contains training data used to and Chinese_Traditional_UTF8. Likewise, a language
train the language models. The second set is the validation might be covered by texts in training corpus A and B.
corpus, which is a collection of web pages used as target
documents in the experiments. Table 6 shows the number of languages, scripts, encodings
and user-defined fonts of the training corpora, sorted
4.1 Training Corpus according to geographical regions. The column header (A
B) represents the distinct number of languages, scripts,
In this paper, the authors prepared two sets of training data. encodings and fonts in the corpora.
The first set of training data is constructed from 565
Universal Declaration of Human Rights (UDHR) texts From Table 6, we can observe that the Asian region is
collected from the Office of the High Commissioner for more diversity in its written languages. Asia has the
Human Rights (OHCHR) web site and Language highest number of scripts (writing systems), character
Table 5: Example output of the HTML character entities converter, based on three different types of HTML entities, each requiring a different number of UTF-8 bytes.

Character | Character Entity | Numeric Character Reference | Unicode Code Point | UTF-8 Byte Pattern (Byte-1 Byte-2 Byte-3) | Output in Byte Sequences
& | &amp; | &#38; | U+0026 | 0xxxxxxx | U+0026 -> 00100110 -> 0x26
β | &beta; | &#946; | U+03B2 | 110yyyxx 10xxxxxx | U+03B2 -> 11001110 10110010 -> 0xCEB2
平 | (none) | &#24179; | U+5E73 | 1110yyyy 10yyyyxx 10xxxxxx | U+5E73 -> 11100101 10111001 10110011 -> 0xE5B9B3
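The conversions in Table 5 can be reproduced with a short script. This is an illustrative sketch, not part of the original system; it uses Python's standard html module to resolve entities and str.encode to produce the one-, two- and three-byte UTF-8 sequences shown in the table.

```python
import html

# Resolve HTML character entities to Unicode characters.
assert html.unescape("&amp;") == "&"          # named entity
assert html.unescape("&#946;") == "\u03b2"    # numeric reference for beta
assert html.unescape("&#24179;") == "\u5e73"  # numeric reference for U+5E73

# Encode to UTF-8: 1, 2 and 3 bytes respectively, matching Table 5.
assert "&".encode("utf-8") == b"\x26"
assert "\u03b2".encode("utf-8") == b"\xce\xb2"
assert "\u5e73".encode("utf-8") == b"\xe5\xb9\xb3"
```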
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No.1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 54
Table 6: Number of languages, scripts, encodings and user-defined fonts in training corpora A and B, sorted according to geographical region.

Training Corpus | Language (A / B / A∪B) | Script (A / B / A∪B) | Encoding (A / B / A∪B) | Font (A / B / A∪B)
Africa | 90 / 10 / 97 | 4 / 2 / 4 | 2 / 1 / 2 | 1 / 0 / 1
Asia | 79 / 27 / 92 | 28 / 17 / 32 | 13 / 4 / 14 | 17 / 6 / 23
Caribbean | 5 / 1 / 6 | 4 / 1 / 4 | 2 / 1 / 2 | 3 / 0 / 3
Central America | 7 / 0 / 7 | 1 / 0 / 1 | 2 / 0 / 2 | 0 / 0 / 0
Europe | 64 / 16 / 72 | 4 / 3 / 5 | 6 / 3 / 6 | 1 / 0 / 1
Int. Aux. Language (IAL) | 3 / 1 / 4 | 1 / 1 / 1 | 3 / 1 / 3 | 0 / 0 / 0
Middle East | 1 / 0 / 1 | 1 / 0 / 1 | 2 / 0 / 2 | 0 / 0 / 0
North America | 20 / 1 / 21 | 2 / 1 / 2 | 2 / 1 / 2 | 1 / 0 / 1
Pacific Ocean | 16 / 3 / 18 | 1 / 1 / 1 | 2 / 1 / 2 | 0 / 0 / 0
South America | 47 / 0 / 47 | 1 / 0 / 1 | 2 / 0 / 2 | 0 / 0 / 3
Unique count | 365 | 40 | 19 | 29
encoding schemes and user-defined fonts. Each of these factors makes language identification difficult. In the case of user-defined fonts, many of them do not comply with international standards, hence making language identification an even more challenging task.

4.2 Validation Corpus

The validation corpus is comprised of texts from web pages. The authors predefined three primary sources to search for web pages in different languages. These sources are Wikipedia, the iLoveLanguages gateway and online news/media portals. The source referred to here is not necessarily a single web site. For example, a web portal might contain, or link to, many web sites. Table 7 shows more detailed information on each source.

The rule for selection is to collect one web page per web site. The authors believe that in general a web site will apply the same character encoding scheme to the web pages it hosts. Thus, it would be redundant to collect more than one page from the same web site. For each language, we collected a maximum of 20 web pages. Popular languages like Arabic (ar), Chinese (zh), and English (en) are easy to find, while less popular languages, like Fula (ff), Limburgish (li), or Sanskrit (sa), are very difficult to find.

The authors' initial target was to cover all of the 185 languages listed in ISO 639-1. However, three languages, namely Kanuri (kr), Luba-Katanga (lu) and South Ndebele (nr), could not be found from the sources, nor by using search engines on the Web. As a result, the final validation corpus used in the experiments contained 182 languages. There are 1,660 web pages in the validation corpus, occupying 76,149,358 bytes of storage. The authors did not normalize the size of the collected web pages, as the wide variation reflects the real situation on the Web.

Each web page in the validation corpus has its filename in
Table 7: Information on the predefined Web sources used to collect web pages for the validation corpus.

Web Site | No. of Pages | Total Size (bytes) | Min. (bytes) | Max. (bytes)
Wikipedia | 171 | 7,511,972 | 601 | 146,131
iLoveLanguages | 103 | 790,934 | 3,634 | 18,445
BBC | 34 | 396,292 | 2,990 | 61,190
China Radio | 13 | 1,891,896 | 9,419 | 222,526
Deutsche Welle | 14 | 1,164,620 | 5,957 | 87,907
The Voice of Russia | 26 | 1,832,797 | 39,198 | 103,251
Voice of America | 22 | 1,791,145 | 9,674 | 87,574
Kidon Media-Link & ABYZ News Links | 1,277 | 60,769,702 | 135 | 1,048,314
Total | 1,660 | 76,149,358 | |
Table 9: Files in the validation corpus that were correctly identified in Experiment one but wrongly identified in Experiment two.

File in VC | Experiment one (Language / Script / Encoding) | Experiment two (Language / Script / Encoding)
bosnian-15-svevijesti.ba | Bosnian / Latin / Latin2 | Punjabi / Gurmukhi / UTF8
indonesian-11-watchtower | Indonesian / Latin / UTF8 | Aceh / Latin / Latin1
were 14 web pages that had been correctly identified before, but were wrongly identified in Experiment two due to the problem of over-training. The over-training problem occurs when a language model is over-trained by a larger training data size, and/or a newly trained language model affects the accuracy of other language models. Table 9 shows the list of files affected by this problem.

5.2 Evaluation of the Character- and Byte-Sequence Based HTML Parser

During the HTML parsing stage of Experiment two, the language identification process detected 1,466 web pages with valid charset information and 191 web pages with doubtful charset information. Of these 191 web pages, fourteen had a "user-defined" charset and 177 were missing charset information.

The character-sequence based HTML parser used in Experiments one and two was defined to use UTF-8 encoding for web pages without valid charset information. When the web pages without valid charset information were investigated, it was found that the default UTF-8 character encoding scheme worked well on Latin-script based languages, but did not work well for 11 non-Latin-script based languages: Amharic, Arabic, Armenian, Belarusian, Bulgarian, Chinese, Greek, Hebrew, Macedonian, Russian and Ukrainian. Fifty wrong classifications occurred after applying UTF-8 to the text extracted from web pages belonging to those languages. Of those 50 pages, 7 were from Africa, 30 from Asia, 10 from Europe, 2 from International Auxiliary Languages and 1 from the Middle East. As a result, the byte-sequence based HTML parser was introduced in Experiment three.

By eliminating the steps of guessing a charset and applying it to the text (using the charset returned by the charset detector), the byte-sequence based parser was able to improve the accuracy of language identification in Experiment three to 90.00%. All of the previously mentioned 50 web pages were identified correctly in Experiment three.

5.3 Evaluation of the HTML Character Entities Converter

Experiment three misclassified 166 web pages. Among those, 76 web pages were affected by the HTML character entities problem. As a result, the HTML character entities converter was introduced in Experiment four.

The accuracy of language identification in Experiment four is 94.04%. The HTML character entities converter improved the algorithm by correctly identifying 67 of the 76 (88.16%) HTML-entities-encoded web pages. Nine HTML-entities-encoded web pages were still not correctly identified: 3 of them were due to untrained legacy fonts, and the remaining 6 were misidentified as another closely related language, e.g. Amharic identified as Tigrinya, Assamese identified as Bengali, Persian identified as Pashto, etc.
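The effect described above, where a UTF-8 default is harmless for Latin-script pages but breaks legacy-encoded non-Latin pages, can be illustrated with a small sketch. The sample strings are hypothetical, not taken from the paper's corpus:

```python
# A Latin-script page in Latin-1 happens to be pure ASCII here, so the
# UTF-8 default decodes it unchanged.
latin_bytes = "Bonjour tout le monde".encode("latin-1")
assert latin_bytes.decode("utf-8") == "Bonjour tout le monde"

# A Cyrillic page in the legacy KOI8-R encoding is not valid UTF-8,
# so decoding with the UTF-8 default fails outright.
cyrillic_bytes = "Привет мир".encode("koi8-r")
try:
    cyrillic_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
assert decoded_ok is False
```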
A cube is a vector a1a2...an, where ai ∈ {0, 1, X} and X is a variable over the set {0, 1} (i = 1, 2, ..., n). Hence, a cube is a set of vectors from {0, 1}^n. The elements a1, a2, ..., an are the coordinates of the cube. A cube has rank r if it contains r coordinates equal to X. A cube of rank r is called an r-cube.

A set of cubes is called a 0-cover (1-cover) of line i if it contains all input vectors generating the signal value 0 (value 1) on this line.

Definition 1. The intersection of cubes A = a1a2...an and B = b1b2...bn is the cube C = c1c2...cn, where ci = ai ∩ bi, i = 1, 2, ..., n. The intersection operation is defined on the set {0, 1, X} by Table 1. In Table 1 the symbol ∅ denotes that the operation is not defined. The intersection of cubes A and B is defined if for any ai and bi the intersection operation is defined, i.e. ai ∩ bi ≠ ∅.

Table 1: Operation ∩

∩ | 0 | 1 | X
0 | 0 | ∅ | 0
1 | ∅ | 1 | 1
X | 0 | 1 | X

Definition 2. The cut of cube sets Q1 and Q2 is denoted by Q1 ∩ Q2 and is the set of all intersections of a cube from Q1 with a cube from Q2.

Definition 3. The union of cube sets Q1 and Q2 is denoted by Q1 ∪ Q2. It contains all cubes from both Q1 and Q2.

Definition 4. A cube B = b1b2...bn is said to be a part of the cube A = a1a2...an if all vectors of B also belong to A. Obviously, B is a part of A only if for any ai ≠ X we have ai = bi.

Definition 5. If a cube B is a part of the cube A and if both cubes belong to the same set of cubes, then B can be deleted from the considered set of cubes. This modification is called cube absorption. In particular, we say that A absorbs B. As noted, this is possible if for any ai ≠ X we have ai = bi.

Let si, sj ∈ {0, 1} be the signals at lines i and j, respectively. Then the following lemmas hold:

Lemma 1. If cubes A and B satisfy the relations i = si and j = sj, respectively, then the cube C = A ∩ B satisfies the relation (i = si) ∧ (j = sj).

Proof. The proof follows from the fact that the cut of cubes is equivalent to the cut of the sets of vectors represented by these cubes.

Lemma 2. Let Si, Sj be sets of cubes satisfying i = si and j = sj, respectively. Then all cubes of the set Si ∩ Sj satisfy the relation (i = si) ∧ (j = sj), while all cubes of the set Si ∪ Sj satisfy the relation (i = si) ∨ (j = sj).

Proof. The proof immediately follows from the definitions of the union and the cut of cube sets.

Let Su_i(0) and Su_i(1) be the cube sets generating the signal values 0 and 1 on the input lines u_i, i = 1, 2, ..., n, of a logical element, considered either separately or within a combinational circuit. Based on Lemma 2 and the properties of logical elements, one can formulate the following corollaries.

Corollary 1. In the case of elements OR and NOR, the cut Su_1(0) ∩ Su_2(0) ∩ ... ∩ Su_n(0) represents the set of cubes generating on the output line v the signal value 0 for element OR, and the signal value 1 for element NOR. The union Su_1(1) ∪ Su_2(1) ∪ ... ∪ Su_n(1) represents the set of cubes generating on the output line v the signal value 1 for element OR, and the signal value 0 for element NOR.

Corollary 2. In the case of elements AND and NAND, the cut Su_1(1) ∩ Su_2(1) ∩ ... ∩ Su_n(1) represents the set of cubes generating on the output line v the signal value 1 for element AND, and the signal value 0 for element NAND. The union Su_1(0) ∪ Su_2(0) ∪ ... ∪ Su_n(0) represents the set of cubes generating on the output line v the signal value 0 for element AND, and the signal value 1 for element NAND.
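Table 1 and Definition 1 translate directly into code. The sketch below is illustrative (the names are not from the paper); it represents a cube as a string over {'0', '1', 'X'} and returns None where the intersection is undefined (∅):

```python
def coord_intersect(a, b):
    """Intersection of two coordinates per Table 1; None stands for the
    undefined result (the empty symbol)."""
    if a == b:
        return a
    if a == "X":
        return b
    if b == "X":
        return a
    return None  # 0 ∩ 1 is undefined


def cube_intersect(A, B):
    """Definition 1: coordinate-wise intersection; None if any
    coordinate intersection is undefined."""
    C = [coord_intersect(a, b) for a, b in zip(A, B)]
    return None if None in C else "".join(C)


assert cube_intersect("0X1", "01X") == "011"
assert cube_intersect("0XX", "1XX") is None  # 0 ∩ 1 in the first coordinate
```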
The 0- or 1-cover of an arbitrary line of the combinational circuit that is an output line of an element of the second or higher level is determined by the following two steps:

1. For an arbitrary line i, for which we want to determine a cover, we write a logical relation defining the conditions for generating the signal of the given value. This logical relation is written on the basis of the basic laws for the considered element. The left-hand side of the relation determines the signal values on all input lines of the element whose output line is line i. For each line on the left-hand side of the logical relation, we write a new logical relation defining the conditions for generating the signal of the expected value. We keep writing logical relations until we come to relations on whose left-hand sides only input lines of the network or output lines of the first level appear.

2. For each line on the left-hand sides of the relations determined in step 1, we determine the distance in the following way. Input lines of the combinational circuit have the greatest distance r. For all lines at distance r−1 we determine the cube sets generating the expected signal values on these lines. Next, for all lines at distance r−2 we determine the cube sets generating the expected signal values on these lines, using the cube sets obtained for the lines at distance r−1. We continue in this way until we get the cover for line i.

Using the cut of cubes we have:

Signals f=1 and g=1 are defined on the left-hand side of the relation. The following logical relations define the conditions for generating the signals f=1 and g=1:

(a=1) ∧ (b=1) ∧ (c=1) → (f=1)   (2)
(d=1) ∧ (e=1) → (g=1)   (3)

Since the output lines a, b, c, e of the first level and the input line d have appeared on the left-hand sides of the above relations, we proceed to step 2. We determine the distances for all lines appearing on the left-hand sides of the logical relations obtained in step 1. Input lines 1–9 of the circuit have the greatest distance. Lines a, b, c, d and e are at distance 2. Lines f and g are at distance 1.

We construct Table 2.

Table 2: Line at distance 1

Vectors 000, 001 and 010 represent a 1-cover of line 1(2).
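Corollary 1 can be exercised on a small example. The sketch below uses a hypothetical 3-input OR gate, not the circuit from the paper; it builds the 0-cover of the output as the cut of the input 0-covers, reusing the coordinate-wise intersection of Table 1:

```python
def coord_intersect(a, b):
    # Table 1: equal symbols intersect to themselves, X absorbs, 0 ∩ 1 undefined.
    if a == b:
        return a
    if a == "X":
        return b
    if b == "X":
        return a
    return None


def cube_intersect(A, B):
    C = [coord_intersect(a, b) for a, b in zip(A, B)]
    return None if None in C else "".join(C)


def cut(S1, S2):
    """Definition 2: the set of all defined intersections of a cube
    from S1 with a cube from S2."""
    return {c for a in S1 for b in S2
            if (c := cube_intersect(a, b)) is not None}


# 3-input OR gate: input i carries 0 exactly on the cubes whose
# coordinate i is 0.
Su = [{"0XX"}, {"X0X"}, {"XX0"}]
cover0 = cut(cut(Su[0], Su[1]), Su[2])
assert cover0 == {"000"}  # OR output is 0 only when all inputs are 0
```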
implement a complex knowledge-intensive system. For such a complicated system, program coding should be done declaratively at a high abstraction level to alleviate the burden of programmers and to ease reasoning about program semantics.

The rest of this paper is organized as follows. Section 2 provides some preliminaries on two major knowledge-mining tasks, i.e. classification and association mining. Section 3 proposes the medical expert system design framework with the knowledge-mining component. Running examples on a medical data set and an illustration of knowledge deployment are presented in Section 4. Section 5 discusses related work, and conclusions are drawn in Section 6. The implementation of the knowledge-mining component is presented in the Appendix.

2. Preliminaries on Tree-based Classification and Association Mining

Decision tree induction [21] is a popular method for mining knowledge from medical data and representing the result as a classifier tree. Its popularity is due to the fact that a mining result in the form of a decision tree is interpretable, which is of more concern among medical practitioners than a sophisticated method that lacks understandability. A decision tree is a hierarchical structure in which each node contains a decision attribute, with node branches corresponding to the different attribute values of the decision node. The goal of building a decision tree is to partition data with mixed classes down the tree until each leaf node contains data of a pure class.

In order to build a decision tree, we need to choose the best attribute, the one that contributes the most towards partitioning the data into purity groups. The metric to measure an attribute's ability to partition data into pure classes is Info, which is the number of bits required to encode a data mixture. The metric Info of a mixture of positive (p) and negative (n) data can be calculated as:

Info(P(p), P(n)) = −P(p)·log2 P(p) − P(n)·log2 P(n).

The symbols P(p) and P(n) are the probabilities of positive and negative data instances, respectively. The symbol p represents the number of positive data instances, and n the number of negative cases. To choose the best attribute we have to calculate the information gain, which is the yield we obtain from choosing that attribute. The information gain calculation for data with two classes (positive and negative) is given as:

Gain(Attribute) = Info{p/(p+n), n/(p+n)} − Σ_{i=1..v} ((pi+ni)/(p+n)) · Info{pi/(pi+ni), ni/(pi+ni)}.

The information gain is the yield computed as the Info of the data set before splitting minus the Info after choosing an attribute with v splits. The gain value of each candidate attribute is calculated, and then the attribute with the maximum gain is chosen to be the decision node. The process of data partitioning continues until each data subset has the same class label.

A classification task based on decision-tree induction predicts the value of a target attribute or class, whereas the association-mining task is a generalization of classification in that any attribute in the data set can be a target attribute. Association mining is the discovery of frequently occurring relationships or correlations between attributes (or items) in a database. The association mining problem can be decomposed as: (1) find all sets of items that are frequent patterns; (2) use the frequent patterns to generate rules. Let I = {i1, i2, i3, ..., im} be a set of m items and DB = {C1, C2, C3, ..., Cn} be a database of n cases, where each case contains items in I.

A pattern is a set of items that occur in a case. The number of items in a pattern is called the length of the pattern. Searching for all valid patterns of length 1 up to m in a large database is computationally expensive. For a set I of m different items, the search space of all distinct patterns can be as huge as 2^m − 1. To reduce the size of the search space, the support measurement has been introduced [1]. The function support(P) of a pattern P is defined as the number of cases in DB containing P. Thus,

support(P) = |{T | T ∈ DB, P ⊆ T}|.

A pattern P is called a frequent pattern if the support value of P is not less than a predefined minimum support threshold minS. It is the minS constraint that helps reduce the computational complexity of frequent pattern generation. The minS metric has an anti-monotone property: if a pattern contains an item that is not frequent, then none of the pattern's supersets are frequent. This property helps reduce the search space of mining frequent patterns in the Apriori algorithm [1]. In this paper we adopt this algorithm as a basis for our implementation of the association mining engine.

3. Medical Expert System Framework and the Knowledge Mining Engines

3.1 System Architecture

Health information is normally distributed and heterogeneous. Hence, we design the medical expert system (Figure 1) to include a data integration component at the top level to collect data from distributed databases and also from documents in text format.
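The Info and Gain formulas of Section 2 can be checked numerically. The following sketch is illustrative only; the split numbers are those of the fever attribute in the allergy example given later in the paper (3 positive and 7 negative cases overall):

```python
from math import log2

def info(p, n):
    """Info of a mixture of p positive and n negative instances."""
    total = p + n
    if total == 0 or p == 0 or n == 0:
        return 0.0  # a pure (or empty) subset needs no bits
    pp, pn = p / total, n / total
    return -pp * log2(pp) - pn * log2(pn)

def gain(p, n, splits):
    """Gain = Info before the split minus the weighted Info of the
    v subsets; splits is a list of (p_i, n_i) pairs."""
    total = p + n
    return info(p, n) - sum((pi + ni) / total * info(pi, ni)
                            for pi, ni in splits)

# Splitting on fever gives (0 pos, 5 neg) for fever=yes and
# (3 pos, 2 neg) for fever=no.
g = gain(3, 7, [(0, 5), (3, 2)])
assert abs(info(5, 5) - 1.0) < 1e-9  # an even mixture costs one bit
assert g > 0                         # the split reduces the encoding cost
```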
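The support measure and minS filter defined in Section 2 can also be sketched in a few lines. The transaction data here is hypothetical; the paper's own engine is the Prolog code in the Appendix:

```python
from itertools import combinations

def support(pattern, db):
    """Number of cases in db that contain every item of the pattern."""
    return sum(1 for case in db if set(pattern) <= set(case))

def frequent_patterns(db, items, length, min_support):
    """All patterns of the given length whose support reaches min_support."""
    return {p for p in combinations(sorted(items), length)
            if support(p, db) >= min_support}

db = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]
items = {"a", "b", "c"}
assert support(("a", "b"), db) == 2
assert frequent_patterns(db, items, 2, 2) == {("a", "b"), ("a", "c"), ("b", "c")}
```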
The extensive use of these predicates contributes significantly to program conciseness and the ease of program verification. The program produces frequent patterns as a set of co-occurring items. To generate a nicer representation of an association rule, such as X => Y, the list L in the predicate association_mining has to be further processed.

association_mining :-
    min_support(V),      % set minimum support
    makeC1(C),           % create candidate 1-itemset
    makeL(C,L),          % compute large itemset
    apriori_loop(L,1).   % recursively run apriori

makeC1(Ans) :-
    input(D),            % input data as a list
    allComb(1, ItemList, Ans2),
                         % make combination of itemset
    maplist(countSS(D), Ans2, Ans).
                         % scan database and pass countSS
                         % to maplist

makeC(N, ItemSet, Ans) :-
    input(D), allComb(2, ItemSet, Ans1),
    maplist(flatten, Ans1, Ans2),
    maplist(list_to_ord_set, Ans2, Ans3),
    list_to_set(Ans3, Ans4),
    include(len(N), Ans4, Ans5),  % include is
                         % also a higher-order predicate
    maplist(countSS(D), Ans5, Ans).
                         % scan database to find: List+N

4. Running Examples and Knowledge Deployment

To show running examples of our program coding, we use the following simple medical data set represented as a Prolog file.

%% Data set: Allergy diagnosis
% Symptoms of disease and their possible values
attribute( soreThroat, [yes, no]).
attribute( fever, [yes, no]).
attribute( swollenGlands, [yes, no]).
attribute( congestion, [yes, no]).
attribute( headache, [yes, no]).
attribute( class, [yes, no]).
% Data instances
instance(1, class=no, [soreThroat=yes, fever=yes,
    swollenGlands=yes, congestion=yes,
    headache=yes]).
instance(2, class=yes, [soreThroat=no, fever=no,
    swollenGlands=no, congestion=yes,
    headache=yes]).
instance(3, class=no, [soreThroat=yes, fever=yes,
    swollenGlands=no, congestion=yes,
    headache=no]).

The data shown are patient records for allergy diagnosis (class=yes). There are ten patient records in this simple data set: patient IDs 2, 6, and 8 are those who are suffering from allergy, whereas patient IDs 1, 3, 4, 5, 7, 9, 10 are suffering from other diseases but have shown some basic symptoms similar to allergy patients. To induce a classification model for allergy patients from this data, we have to save this data set as a Prolog file (data.pl) and include this file name in the header declaration of the main program. By calling the predicate main, the system should respond with true. At this moment we can view the tree model by calling listing(node), then listing(edge), and get the following results.

1 ?- main.
true.
2 ?- listing(node).
:- dynamic user:node/2.
user:node(1, [2, 6, 8]-[1, 3, 4, 5, 7, 9, 10]).
user:node(2, []-[1, 3, 5, 9, 10]).
user:node(3, [2, 6, 8]-[4, 7]).
user:node(4, []-[4, 7]).
user:node(5, [2, 6, 8]-[]).
true.
3 ?- listing(edge).
:- dynamic user:edge/3.
user:edge(0, root-nil, 1).
user:edge(1, fever-yes, 2).
user:edge(1, fever-no, 3).
user:edge(3, swollenGlands-yes, 4).
user:edge(3, swollenGlands-no, 5).
true.

The node and edge structures have the following formats:

node(nodeID, [Positive_Cases]-[Negative_Cases])
edge(ParentNode, EdgeLabel, ChildNode)

The node structure is a tuple of a node ID and a mixture of positive and negative cases represented as a list pattern: [Positive_Cases]-[Negative_Cases]. Node 0 is a special node, representing the root node of the tree. Node 1 contains a mixture of ten patients, whereas node 5 is a pure group of allergy patients. The edges leading from node 1 to node 5 capture the model of allergy patients. Therefore, the classification result represents the following data model:

class(allergy) :- fever=no, swollenGlands=no.

This model is represented as a Horn clause; thus, it provides the flexibility of including this clause as a rule to select data in another group of patients, those suffering from throat infection. This kind of infection shows the same basic symptoms as allergy; therefore, screening data with the above rule can help to focus only on throat infection cases.
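The induced Horn clause class(allergy) :- fever=no, swollenGlands=no can equally be applied outside Prolog. A minimal sketch (the record layout here is hypothetical) re-expresses it as a screening predicate:

```python
def allergy_model(record):
    """Python rendering of: class(allergy) :- fever=no, swollenGlands=no."""
    return record["fever"] == "no" and record["swollenGlands"] == "no"

patients = [
    {"id": 1, "fever": "yes", "swollenGlands": "yes"},
    {"id": 2, "fever": "no",  "swollenGlands": "no"},
    {"id": 4, "fever": "no",  "swollenGlands": "yes"},
]
screened = [p["id"] for p in patients if allergy_model(p)]
assert screened == [2]
```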
Fig. 2 The framework of knowledge deployment as triggers in a medical database. (Diagram labels: Mining Component, Trigger Generation Component, induced rules, aggregated data, trigger rules, knowledge, data.)

Fig. 3 The content of the automatically induced knowledge base.

Fig. 5 A snapshot of the medical expert system inductively created from the allergy data set.

Knowledge Deployment: Example 2.
The induced knowledge, once confirmed by the domain expert, can be added to the knowledge base of the expert system shell. We illustrate the knowledge base that is automatically created from the induced tree in Figure 3. This expert system shell has a simple structure, as diagrammatically shown in Figure 4. The user can interact with the system through a command line as shown in Figure 5, in which the user can ask for further explanation by typing the why command.

5. Related Work

In recent years we have witnessed an increasing number of applications devising database technology and machine learning techniques to mine knowledge from biomedicine, clinical and health data. Roddick et al. [22] discussed the two categories of mining techniques applied over medical
data: explanatory and exploratory. Explanatory mining refers to techniques that are used for the purpose of confirmation or making decisions. Exploratory mining is data investigation normally done at an early stage of data analysis, in which an exact mining objective has not yet been set.

Our work is also in the mainstream of medical decision support system development, but our methodology is different from those that have appeared in the literature. The system proposed in this paper is based on a logic-programming paradigm. The justification for our logic-based system is that the closed form of Horn clauses, which treats a program in the same way as data, facilitates the fusion of knowledge learned from different sources, which is a normal setting in the medical domain. Knowledge reuse can easily be achieved within this framework.

The declarative style of our implementation also eases the future extension of the proposed medical support system to cover the concepts of higher-order mining [23], i.e. mining from the discovered knowledge, and constraint mining [7], i.e. mining with some specified constraints to obtain relevant knowledge.

The plausible extensions of our current work are to add constraints into the knowledge mining method in order to limit the search space and therefore yield the most relevant and timely knowledge; moreover, due to the uniform representation of Prolog's statements in clausal form, mining from the previously mined knowledge should be implementable naturally. We also plan to extend our system to work with stream data, which normally occur in the modern practice of medical organizations.
init(AllAttr, [root-nil/PB-NB]) :-
    retractall(node(_, _)),
    retractall(current_node(_)),
    retractall(edge(_, _, _)),
    assert(current_node(0)),
    findall(X, attribute(X, _), AllAttr1),
    delete(AllAttr1, class, AllAttr),
    findall(X2, instance(X2, class=yes, _), PB),
    findall(X3, instance(X3, class=no, _), NB).

getNode(X) :-
    current_node(X), X1 is X+1,
    retractall(current_node(_)),
    assert(current_node(X1)).

create_edge(_, _, []) :- !.
create_edge(_, [], _) :- !.
create_edge(N, AllAttr, EdgeList) :-
    create_nodes(N, AllAttr, EdgeList).

create_nodes(_, _, []) :- !.
create_nodes(_, [], _) :- !.
create_nodes(N, AllAttr, [H1-H2/PB-NB|T]) :-
    getNode(N1),                 % get node sequence number N1
    assert(edge(N, H1-H2, N1)),  % H1-H2 is a pattern
    assert(node(N1, PB-NB)),     % PB-NB is a pattern
    append(PB, NB, AllInst),
    ((PB \== [], NB \== []) ->   % if-condition
        % then clauses
        (cand_node(AllAttr, AllInst, AllSplit),
         best_attribute(AllSplit, [V, MinAttr, Split]),
         delete(AllAttr, MinAttr, Attr2),
         create_edge(N1, Attr2, Split))
    ;   % else clause
        true ),
    create_nodes(N, AllAttr, T).

%
% compute Info of each candidate node
%
info(A, CurInstL, R, Split) :-
    attribute(A, L),
    maplist(concat3(A, =), L, L1),
    suminfo(L1, CurInstL, R, Split).

concat3(A, B, C, R) :-
    atom_concat(A, B, R1),
    atom_concat(R1, C, R).

suminfo([], _, 0, []).
suminfo([H|T], CurInstL, R, [Split|ST]) :-
    AllBag = CurInstL, term_to_atom(H1, H),
    findall(X1, (instance(X1, _, L1),
                 member(X1, CurInstL),
                 member(H1, L1)), BagGro),
    findall(X2, (instance(X2, class=yes, L2),
                 member(X2, CurInstL),
                 member(H1, L2)), BagPos),
    findall(X3, (instance(X3, class=no, L3),
                 member(X3, CurInstL),
                 member(H1, L3)), BagNeg),
    (H11 = H22) = H1,
    length(AllBag, Nall),
    length(BagGro, NGro),
    length(BagPos, NPos),
    length(BagNeg, NNeg),
    Split = H11-H22/BagPos-BagNeg,
    suminfo(T, CurInstL, R1, ST),
    ( NPos is 0 *-> L1 = 0
    ; L1 is (log(NPos/NGro)/log(2)) ),
    ( 0 is NNeg *-> L2 = 0
    ; L2 is (log(NNeg/NGro)/log(2)) ),
    ( NGro is 0 -> R = 999
    ; R is (NGro/Nall)*
           (-(NPos/NGro)*L1 - (NNeg/NGro)*L2) + R1 ).

/* ========================= */
/* Association mining engine */
makeC1(Ans) :-
    input(D),    % input data as a list,
                 % e.g. [[a], [a,b]]
    % then make combination of itemset
    allComb(1, ItemList, Ans2),
    % scan database and pass countSS to maplist
    maplist(countSS(D), Ans2, Ans).

makeC(N, ItemSet, Ans) :- input(D),
    allComb(2, ItemSet, Ans1),
    maplist(flatten, Ans1, Ans2),
    maplist(list_to_ord_set, Ans2, Ans3),
    list_to_set(Ans3, Ans4),
    include(len(N), Ans4, Ans5),
    % include is also a higher-order predicate
    maplist(countSS(D), Ans5, Ans).
    % scan database to find: List+N

makeL(C, Res) :-  % for all large itemset creation
    % call higher-order predicates include and maplist
    include(filter, C, Ans),
    maplist(head, Ans, Res).

% filter and head are for pattern matching of data format
filter(_+N) :-
    input(D),
    length(D, I),
    min_support(V),
    N >= (V/100)*I.

head(H+_, H).

% an arbitrary subset of the set containing
% a given number of elements
comb(0, _, []).
comb(N, [X|T], [X|Comb]) :-
    N > 0, N1 is N-1,
    comb(N1, T, Comb).
comb(N, [_|T], Comb) :-
    N > 0,
    comb(N, T, Comb).

allComb(N, I, Ans) :-
    setof(L, comb(N, I, L), Ans).

Acknowledgments

This work has been fully supported by a research fund from Suranaree University of Technology granted to the Data Engineering and Knowledge Discovery (DEKD) research unit. This research is also supported by grants from the National Research Council of Thailand (NRCT) and the Thailand Research Fund (TRF).

References
[1] R. Agrawal, and R. Srikant, "Fast algorithms for mining association rules", in: Proc. VLDB, 1994, pp. 487-499.
[2] M. Alavi, and D.E. Leidner, "Review: Knowledge management and knowledge management systems: Conceptual foundations and research issues", MIS Quarterly, Vol. 25, No. 1, 2001, pp. 107-136.
[3] Y. Bedard et al., "Integrating GIS components with knowledge discovery technology for environmental health decision support", Int. J. Medical Informatics, Vol. 70, 2003, pp. 79-94.
[4] C.C. Bojarczuk et al., "A constrained-syntax genetic programming system for discovering classification rules: Application to medical data sets", Artificial Intelligence in Medicine, Vol. 30, 2004, pp. 27-48.
[5] C. Bratsas et al., "KnowBaSICS-M: An ontology-based system for semantic management of medical problems and computerised algorithmic solutions", Computer Methods and Programs in Biomedicine, Vol. 83, 2007, pp. 39-51.
[6] R. Correia et al., "Borboleta: A mobile telehealth system for primary homecare", in: Proc. ACM Symposium on Applied Computing, 2008, pp. 1343-1347.
[7] L. De Raedt et al., "Constraint programming for itemset mining", in: Proc. KDD, 2008, pp. 204-212.
[8] E. German et al., "An architecture for linking medical decision-support applications to clinical databases and its evaluation", J. Biomedical Informatics, Vol. 42, 2009, pp. 203-218.
[9] S. Ghazavi, and T.W. Liao, "Medical data mining by fuzzy modeling with selected features", Artificial Intelligence in Medicine, Vol. 43, No. 3, 2008, pp. 195-206.
[10] D. Hristovski et al., "Using literature-based discovery to identify disease candidate genes", Int. J. Medical Informatics, Vol. 74, 2005, pp. 289-298.
[11] M.J. Huang et al., Integrating data mining with case-based [30] X. Zhou et al., Text mining for clinical Chinese herbal
Nittaya Kerdprasop is an associate professor at the school of computer engineering, Suranaree University of Technology, Thailand. She received her B.S. in radiation techniques from Mahidol University, Thailand, in 1985, M.S. in computer science from the Prince of Songkla University, Thailand, in 1991, and Ph.D. in computer science from Nova Southeastern University, USA, in 1999. She is a member of IAENG, ACM, and the IEEE Computer Society. Her research interests include knowledge discovery in databases, data mining, artificial intelligence, logic and constraint programming, and deductive and active databases.

Kittisak Kerdprasop is an associate professor and the director of the DEKD (Data Engineering and Knowledge Discovery) research unit at the school of computer engineering, Suranaree University of Technology, Thailand. He received his bachelor degree in mathematics from Srinakarinwirot University, Thailand, in 1986, master degree in computer science from the Prince of Songkla University, Thailand, in 1991, and doctoral degree in computer science from Nova Southeastern University, USA, in 1999. His current research includes data mining, machine learning, artificial intelligence, logic and functional programming, probabilistic databases, and knowledge bases.
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 73
Abstract

A proxy blind signature scheme is a special form of blind signature which allows a designated person, called the proxy signer, to sign on behalf of two or more original signers without knowing the content of the message or document. It combines the advantages of the proxy signature, blind signature and multi-signature schemes, and satisfies the security properties of both proxy and blind signature schemes. Most of the existing proxy blind signature schemes were developed based on the mathematical hard problems of integer factorization (IFP) and the discrete logarithm problem (DLP), which take sub-exponential time to solve. This paper describes a secure, simple proxy blind signature scheme based on the Elliptic Curve Discrete Logarithm Problem (ECDLP), which takes fully exponential time to solve. The scheme can be implemented on low-power, small-processor mobile devices such as smart cards, PDAs, etc. We also describe implementation issues of various scalar multiplication methods for the ECDLP.

Keywords: ECDLP, IFP, blind signature, proxy signature.

1. Introduction

The blind signature scheme was first introduced by Chaum [2]. It is a protocol for obtaining a signature from a signer, but the signer can neither learn the messages nor the signatures; the recipients obtain them afterwards. In 1996, Mambo et al. proposed the concept of the proxy signature [1]. In a proxy signature scheme, the original signer delegates his signing capacity to a proxy signer, who can sign a submitted message on behalf of the original signer. A verifier can validate its correctness and can distinguish between a normal signature and a proxy signature. A proxy blind signature scheme is a digital signature scheme that ensures the properties of both proxy signatures and blind signatures. In a proxy blind signature, an original signer delegates his signing capacity to a proxy signer.

2. Preliminaries

2.1 Notations

Common notations used in this paper are as follows:

p: the order of the underlying finite field.
Fp: the underlying finite field of order p.
E: an elliptic curve defined over the finite field Fp, with large order.
G: the group of elliptic curve points on E.
P: a point in E(Fp) with order n, where n is a large prime number.
H(.): a secure one-way hash function.
d: the secret key of the original signer S, chosen randomly from [1, n-1].
Q: the public key of the original signer S, where Q = d.P.
||: concatenation operation between two bit strings.

3. Backgrounds

In this section we give a brief overview of the prime field, elliptic curves over that field, and the Elliptic Curve Discrete Logarithm Problem.

3.1 The finite field Fp

Let p be a prime number. The finite field Fp is comprised of the set of integers 0, 1, 2, ..., p-1 with the following arithmetic operations [4], [5], [6]:
Addition: If a, b ∈ Fp, then a + b = r, where r is the remainder when a + b is divided by p and 0 ≤ r ≤ p-1. This is known as addition modulo p.

Multiplication: If a, b ∈ Fp, then a.b = s, where s is the remainder when a.b is divided by p and 0 ≤ s ≤ p-1. This is known as multiplication modulo p.

Inversion: If a is a non-zero element in Fp, the inverse of a modulo p, denoted a^(-1), is the unique integer c ∈ Fp for which a.c = 1.

3.2 Elliptic Curves over Fp

Let p > 3 be a prime number. Let a, b ∈ Fp be such that 4a^3 + 27b^2 ≠ 0 in Fp. An elliptic curve E over Fp defined by the parameters a and b is the set of all solutions (x, y), x, y ∈ Fp, to the equation y^2 = x^3 + ax + b, together with an extra point O, the point at infinity. The set of points E(Fp) forms an abelian group with the following addition rules [8]:

1. Identity: P + O = O + P = P, for all P ∈ E(Fp).
2. Negative: If P = (x, y) ∈ E(Fp), then (x, y) + (x, -y) = O. The point (x, -y) is denoted -P and called the negative of P.
3. Point addition: Let P = (x1, y1), Q = (x2, y2) ∈ E(Fp) with P ≠ ±Q; then P + Q = R ∈ E(Fp), where the coordinates (x3, y3) of R are given by x3 = λ^2 - x1 - x2 and y3 = λ(x1 - x3) - y1, with λ = (y2 - y1)/(x2 - x1).
4. Point doubling: Let P = (x1, y1) ∈ E(Fp) with P ≠ -P; then 2P = (x3, y3), where x3 = ((3x1^2 + a)/(2y1))^2 - 2x1 and y3 = ((3x1^2 + a)/(2y1))(x1 - x3) - y1.

3.3 Elliptic Curve Discrete Logarithm Problem (ECDLP)

Given an elliptic curve E defined over a finite field Fp, a point P ∈ E(Fp) of order n, and a point Q ∈ <P>, find the integer l ∈ [0, n-1] such that Q = lP. The integer l is called the discrete logarithm of Q to the base P, denoted l = log_P Q [8].

4. Proxy Signatures and Proxy Blind Signature

A proxy blind signature is a digital signature scheme that ensures the properties of both proxy signature and blind signature schemes. A proxy blind multi-signature scheme is an extension of the proxy blind signature which allows a single designated proxy signer to generate a blind signature on behalf of a group of original signers. A proxy blind signature scheme consists of the following three phases [9]:

- Proxy key generation
- Proxy blind multi-signature generation
- Signature verification

5. Security Properties

The security properties for a secure proxy blind multi-signature scheme are as follows [9]:

- Distinguishability: The proxy blind multi-signature must be distinguishable from an ordinary signature.
- Strong unforgeability: Only the designated proxy signer can create the proxy blind signature for the original signer.
- Non-repudiation: The proxy signer cannot claim that the proxy signature is disputed or was illegally signed by the original signer.
- Verifiability: The proxy blind multi-signature can be verified by everyone. After verification, the verifier can be convinced of the original signer's agreement on the signed message.
- Strong undeniability: Since the delegation information is signed by the original signer and the proxy signature is generated with the proxy signer's secret key, neither signer can deny their behavior.
- Unlinkability: When the signature is revealed, the proxy signer cannot identify the association between the message and the blind signature he generated.
- Secret key dependence: The proxy key, or delegation pair, can be computed only with the original signer's secret key.
- Prevention of misuse: The proxy signer cannot use the proxy secret key for purposes other than generating a valid proxy blind signature.
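To make the definitions in Section 3 concrete, here is a toy sketch of the prime-field arithmetic, the affine group law, and a brute-force solution of the ECDLP. The curve y^2 = x^3 + 2x + 2 over F_17 and the point P = (5, 1) are illustrative textbook parameters, not values from the scheme itself; in practice p and n must be large enough that the brute-force search below is infeasible.

```python
# Toy prime-field and affine elliptic-curve arithmetic (illustrative only).
# Curve: y^2 = x^3 + 2x + 2 over F_17, a small textbook example.
p, a, b = 17, 2, 2
O = None  # the point at infinity (group identity)

def inv(x):
    """Inversion in F_p via Fermat's little theorem: x^(p-2) mod p."""
    return pow(x, p - 2, p)

def add(P1, P2):
    """Chord-and-tangent addition rules from Section 3.2."""
    if P1 is O:
        return P2
    if P2 is O:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:   # P + (-P) = O
        return O
    if P1 == P2:                          # doubling: lam = (3*x1^2 + a)/(2*y1)
        lam = (3 * x1 * x1 + a) * inv(2 * y1) % p
    else:                                 # addition: lam = (y2 - y1)/(x2 - x1)
        lam = (y2 - y1) * inv(x2 - x1) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def mul(k, P1):
    """Naive scalar multiplication by repeated addition (fine for a toy group)."""
    R = O
    for _ in range(k):
        R = add(R, P1)
    return R

# ECDLP (Section 3.3): given P and Q = l*P, recover l. Feasible here only
# because the group is tiny; the hardness of this search is the whole point.
P = (5, 1)                 # a point on the curve
Q = mul(7, P)              # Q = 7P
l = next(k for k in range(1, 20) if mul(k, P) == Q)
assert l == 7
```

Note that `mul` here is deliberately naive; the DBL-AND-ADD algorithms discussed later reduce the k-1 additions to O(log k) group operations.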
Proxy verification: After receiving the secret key pair (s, r), the proxy signer Ps checks the validity of the secret key pair (s, r) with the following equation:

Qp = s.P = Q + r.R   (1)

6.2 Signing Phase

- The proxy signer Ps chooses a random integer t ∈ [1, n-1], computes U = t.P, and sends it to the verifier V.
- After receiving U, the verifier randomly chooses α, β ∈ [1, n-1] and computes the following:

R~ = U + α.P + β.Qp   (2)
e~ = H(R~ || M)   (3)
e = (e~ - β) mod n   (4)

The verifier V then sends e to the proxy signer Ps.
- After receiving e, Ps computes

s~ = (t - s.e) mod n   (5)

and sends it to V.
- Now V computes s_p = (s~ + α) mod n, and the proxy blind signature on M is (M, s_p, e~).

Theorem 2: A proxy signature is distinguishable from the original signer's normal signature.

Proof: Since the proxy key is different from the original signer's private key, and proxy keys created by different proxy signers are different from each other, any proxy signature is distinguishable from the original signer's normal signature, and different proxy signers' signatures are distinguishable from one another.

Theorem 3: The scheme satisfies the unlinkability security requirement.

Proof: In the verification stage, the signer checks only whether e~ = H((s_p.P + e~.Qp) || M) holds. He knows neither the original signer's private key nor the proxy signer's private key. Thus the signer knows neither the message nor the signature associated with the signature scheme.

8. Correctness

Theorem 4: The proxy blind signature (M, s_p, e~) is universally verifiable using the system's public parameters.

Proof: The correctness of the signature is verified as follows. We have to prove that
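Because every point in the signing equations (2)-(5) is a known multiple of the base point P (U = t.P, Qp = s.P), the consistency of the blinding and unblinding steps can be checked on the scalars alone, working modulo n. The sketch below assumes, as the correctness argument requires, that the verifier unblinds with s_p = (s~ + α) mod n; the group order n shown is that of the NIST P-256 base point, used purely as a sample large prime.

```python
# Scalar-level sanity check of the blind-signing equations (2)-(5).
# Every point is a multiple of P (U = t*P, Qp = s*P), so the point identity
# s_p*P + e~*Qp = R~ reduces to an identity on exponents modulo n.
import secrets

# Sample large prime group order (order of the NIST P-256 base point).
n = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551

s     = secrets.randbelow(n - 1) + 1   # proxy secret key, so Qp = s*P
t     = secrets.randbelow(n - 1) + 1   # proxy signer's nonce, U = t*P
alpha = secrets.randbelow(n - 1) + 1   # verifier's blinding factors
beta  = secrets.randbelow(n - 1) + 1

r_tilde = (t + alpha + beta * s) % n   # eq. (2): R~ = U + alpha*P + beta*Qp
e_tilde = secrets.randbelow(n)         # eq. (3): e~ = H(R~ || M); any value here
e       = (e_tilde - beta) % n         # eq. (4)
s_tilde = (t - s * e) % n              # eq. (5)
s_p     = (s_tilde + alpha) % n        # unblinding step (assumed: s_p = s~ + alpha)

# Verification identity s_p*P + e~*Qp = R~, expressed on the exponents:
assert (s_p + e_tilde * s) % n == r_tilde
```

This checks only the blinding algebra; it says nothing about unforgeability, which rests on the hardness of the ECDLP.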
H((s_p.P + e~.Qp) || M) = H(R~ || M), i.e., to show that s_p.P + e~.Qp = R~:

s_p.P + e~.Qp
= (s~ + α).P + e~.Qp
= s~.P + α.P + e~.Qp
= (t - s.e).P + α.P + e~.Qp
= t.P - (e~ - β).Qp + α.P + e~.Qp
= t.P + β.Qp + α.P
= U + β.Qp + α.P
= R~

9. Implementation Issues

In this section we discuss implementation issues, i.e., efficiency and the size of the hardware. The basic operation for cryptographic protocols based on the ECDLP is scalar multiplication; it is performed via repeated group operations. One can visualize these operations in a hierarchical structure. Point multiplication is at the top level. At the next lower level are the point operations, which are closely related to the coordinates used to represent the points. The lowest level consists of finite field operations such as addition, subtraction, multiplication and inversion.

9.1 Group Order

The order of the elliptic curve group over the underlying field is an important security parameter. There are attacks (for example the Pohlig-Hellman attack) which can be launched on ECC if the group order is not divisible by a very large prime. In fact, the Pohlig-Hellman attack dictates that the group order for ECC should be the product of a large prime and a small positive integer less than 4. This small number is called the cofactor of the curve. Various algorithms have been proposed in the literature (for example Kedlaya's algorithm and Schoof's algorithm) for efficiently counting the group order. The group order of an elliptic curve is bounded by Hasse's theorem.

Theorem 5: Let E be an elliptic curve over a finite field F_q of order q. Then the order #E(F_q) of the elliptic curve group satisfies #E(F_q) = q + 1 - t, where |t| ≤ 2√q.

The parameter t is called the trace of E over F_q. An interesting fact is that, given any integer t with |t| ≤ 2√q, there exists an elliptic curve E over F_q such that #E(F_q) = q + 1 - t.

10. Point Representation and Cost of Group Operations

Point addition and point doubling are two important operations in ECC. Inversion in a finite field is an expensive operation. To avoid these inversions, several point representations have been proposed in the literature. The cost of point addition and doubling varies depending upon the representation of the group elements. In the current section we briefly deal with some commonly used point representations. Let [i], [m], [s] and [a] stand for the cost of a field element inversion, a multiplication, a squaring and an addition, respectively. Field element addition is considered a very cheap operation. In binary fields, squaring is also much cheaper than multiplication; if the underlying field is represented in a normal basis, squaring is almost free. Inversion is considered to be 8 to 10 times costlier than multiplication in binary fields. In prime fields the I/M ratio is even higher; it is reported to be between 30 and 40.

10.1 Elliptic Curves

Point representation in ECC is a well-studied area. In the following two sections we describe some of the point representations popularly used in implementations.

Fields of characteristic > 3: Elliptic curves over fields of characteristic > 3 have equations of the form y^2 = x^3 + ax + b. For such curves the following point representation methods are mostly used.

Table 1. Cost of group operations in ECC for various point representations (characteristic > 3)

Addition       Cost                  Doubling    Cost
A + A = A      1[i] + 2[m] + 1[s]    2A = A      1[i] + 2[m] + 2[s]
P + P = P      12[m] + 2[s]          2P = P      7[m] + 3[s]
J + J = J      12[m] + 4[s]          2J = J      6[m] + 4[s]
C + C = C      11[m] + 3[s]          2C = C      5[m] + 4[s]
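Hasse's bound from Theorem 5 can be observed directly on a small curve by counting points by brute force. The curve y^2 = x^3 + 2x + 3 over F_97 below is an arbitrary illustrative choice, not a parameter from the paper:

```python
# Brute-force point count on a toy curve, checked against Hasse's bound
# |#E(F_q) - (q + 1)| <= 2*sqrt(q) from Theorem 5.
import math

q, a, b = 97, 2, 3                      # illustrative curve y^2 = x^3 + 2x + 3 over F_97
assert (4 * a**3 + 27 * b**2) % q != 0  # non-singularity condition from Section 3.2

# For each x, the number of affine points is the number of y with y^2 = x^3 + ax + b,
# so tabulate how many y map to each square value.
sq_count = {}
for y in range(q):
    sq_count[y * y % q] = sq_count.get(y * y % q, 0) + 1

order = 1                               # start at 1 for the point at infinity O
for x in range(q):
    order += sq_count.get((x**3 + a * x + b) % q, 0)

t = q + 1 - order                       # the trace of E over F_q
assert abs(t) <= 2 * math.sqrt(q)       # Hasse's bound holds
```

Brute force is only viable for toy fields; Schoof's algorithm, mentioned above, computes the same order in polynomial time.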
multiplication methods (i.e., methods to compute lP + mQ). Also, due to the vastness of the subject and space constraints, we will elaborate only on those methods which are discussed here in depth. The basic algorithms to compute the scalar multiplication are the age-old binary algorithms, believed to have been known to the Egyptians two thousand years ago. The two versions of the DBL-AND-ADD algorithm are defined below. These algorithms invoke two functions, ADD and DBL. ADD takes as input two points X1 and X2 and returns their sum X1 + X2; DBL takes as input one point X and computes its double 2X.

_______________________________________________
Algorithm DBL-AND-ADD (left-to-right binary method)
_______________________________________________
Input: X, m = (m_{k-1} ... m_1 m_0) in binary
Output: mX
1. E = m_{k-1}.X
2. for i = k-2 down to 0
3.    E = DBL(E)
4.    if m_i = 1
5.       E = ADD(E, X)
6. return E
_______________________________________________

_______________________________________________
Algorithm DBL-AND-ADD (right-to-left binary method)
_______________________________________________
Input: X, m = (m_{k-1} ... m_1 m_0) in binary
Output: mX
1. E0 = X, E1 = O
2. for i = 0 to k-1
3.    if m_i = 1
4.       E1 = ADD(E0, E1)
5.    E0 = DBL(E0)
6. return E1
_______________________________________________

Both algorithms first convert the scalar multiplier m into binary. Suppose m has an n-bit representation with Hamming weight h. Then mX can be computed by n-1 invocations of DBL and h-1 invocations of ADD. Hence the cost of the scalar multiplication is (n-1).cost(DBL) + (h-1).cost(ADD). As the average value of h is n/2, on average these algorithms require n-1 doublings and n/2 additions. As doublings are required more often than additions, attempts are made to reduce the complexity of the doubling operation.

The scalar multiplication is the dominant operation in ECC. Extensive research has been carried out to compute it efficiently, and many results have been reported in the literature. There are three main approaches to computing the scalar multiplication efficiently. As seen in the basic binary algorithms, the efficiency is intimately connected to the efficiency of the ADD and DBL algorithms, so the first approach is to compute the group operations efficiently. The second approach is to use a representation of the scalar such that the number of invocations of the group operations is reduced. The third approach is to use more hardware support (like memory for pre-computation). In some proposals these approaches have been successfully combined to yield very efficient algorithms. As noted above, the cost of ADD and DBL depends to a large extent on the choice of the underlying field and the point representation; hence the cost of scalar multiplication also depends upon these choices. Based on the underlying field, more efficient operations have been proposed: over binary fields, using a point halving algorithm instead of DBL has proved very efficient, and over fields of characteristic 3, point tripling has been more efficient. There are also proposals for using fancier algorithms, such as ones that efficiently compute 2P + Q, 3P + Q, etc., instead of ADD and DBL.

12. Conclusions

The security of the scheme rests on the hardness of solving the ECDLP. The primary reason for the attractiveness of ECC over systems such as RSA and DSA is that the best algorithm known for solving the underlying mathematical problem, namely the ECDLP, takes fully exponential time. In contrast, sub-exponential time algorithms are known for the underlying mathematical problems on which RSA and DSA are based, namely the integer factorization problem (IFP) and the discrete logarithm problem (DLP). This means that the algorithms for solving the ECDLP become infeasible much more rapidly, as the problem size increases, than the algorithms for the IFP and DLP. For this reason, ECC offers security equivalent to RSA and DSA while using far smaller key sizes. The benefits of this higher strength per bit include higher speeds, lower power consumption, bandwidth savings, storage efficiencies, and smaller certificates. ECC can thus be implemented on low-power, small-processor mobile devices such as smart cards, PDAs, etc. In the proposed scheme it is infeasible for an adversary to derive the signer's private key from the available public information. The protocol also achieves security requirements such as distinguishability, strong unforgeability, non-repudiation, and unlinkability.
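The two DBL-AND-ADD variants above are generic: they touch the group only through ADD and DBL. The sketch below exploits this by instantiating them with ordinary integer addition as a stand-in group, where mX must equal the plain product m*X; the same functions work unchanged with elliptic-curve ADD and DBL.

```python
# Generic DBL-AND-ADD, following the two binary methods above.
def dbl_and_add_ltr(X, m, ADD, DBL):
    """Left-to-right binary method; assumes m >= 1 so the leading bit is 1."""
    bits = bin(m)[2:]          # m = (m_{k-1} ... m_1 m_0) in binary
    E = X                      # step 1: E = m_{k-1} * X
    for bit in bits[1:]:       # i = k-2 down to 0
        E = DBL(E)
        if bit == '1':
            E = ADD(E, X)
    return E

def dbl_and_add_rtl(X, m, ADD, DBL, ZERO):
    """Right-to-left binary method; ZERO plays the role of the identity O."""
    E0, E1 = X, ZERO           # step 1: E0 = X, E1 = O
    while m:                   # scan the bits m_0, m_1, ...
        if m & 1:
            E1 = ADD(E0, E1)
        E0 = DBL(E0)
        m >>= 1
    return E1

# Stand-in group: integers under addition, so mX is simply m * X.
ADD = lambda u, v: u + v
DBL = lambda u: u + u
assert dbl_and_add_ltr(7, 105, ADD, DBL) == 7 * 105
assert dbl_and_add_rtl(7, 105, ADD, DBL, 0) == 7 * 105
```

With an n-bit m of Hamming weight h, the left-to-right version performs n-1 DBLs and h-1 ADDs, matching the cost analysis given for the binary methods.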
2 Laboratoire SIC et Equipe IRMA, University of Poitiers, Poitiers, France
Abstract

The use of the web in language learning has developed at very high speed in recent years. Thus, we are witnessing many research and development projects set up in universities and distance learning programs. However, interest in research related to writing competence remains relatively low. Our proposed research examines the use of the web for studying English as a second foreign language at an Algerian university. One focus is on pedagogy: therefore, a major part of our research is on developing, evaluating, and analyzing writing comprehension activities, and then composing the activities into a curriculum. The article starts with a presentation of language skills and reading comprehension. It then presents our approach to the use of the web for learning English as a second language. Finally, a learner evaluation methodology is presented. The article ends with the conclusion and future trends.

Keywords: Reading comprehension, E-learning, Assessment, Online Platform, Paper Submission.

1. Introduction

This article describes a web-based approach, where the web is used for educational activities. The main focus of this article is on reading comprehension of a foreign language. A new approach to the use of web technology, and how it was used in language learning, especially writing, is presented.

One of the main goals of our research work is to explore what the best web learning practices and activities are in terms of assisting and supporting learning to become a more meaningful process. Another goal is to explore, from a pedagogical perspective, innovative future learning practices related to the new forms of studying.

2. Language Skills and Writing

In order to understand the problem considered in this article, it is of primary importance to know which capacities are involved during the learning of a foreign language. We point out that the capacities in learning a language represent the various mental operations that have to be performed by a listener, a reader, or a writer in an unconscious way, for example: to locate, discriminate or process the data. One distinguishes, in the analytical diagram, basic capacities, which correspond to linguistic activities, and competence in communication, which involves more complex capacities.

2.1 Basic Language Skills

The use of a language is based on four skills. Two of these skills are from the comprehension domain: oral and written comprehension. The other two concern oral and written expression (see Table 1). A methodology can give priority to one or two of these competences, or it can aim at the teaching/learning of all four competences together or according to a given planned program.

On the one hand, written expression is, paradoxically, the component in which the learner is evaluated most often. It is concerned with the most demanding phase of learning, requiring in-depth knowledge of different capacities (spelling, grammatical, graphic, etc.). On the other hand, listening comprehension corresponds to the most frequently used competence and can be summarized by the formula "to hear and deduce a meaning". Chronologically, it is always the one that is confronted first, except in exceptional situations (people only or initially confronted with writing, people with defective hearing, study of a dead language (a language that is no longer in use), or study of a language on the basis of self-taught writing).
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 81
3. Related Work

There has been much work on online reading focusing on new interaction techniques [1], [2] to support practices observed in fluent readers [3], [4], such as annotation, clipping, skimming, fast navigation, and obtaining overviews. Some work has studied the effect of presentation changes, like hypertext appearance [5], on reading speed and comprehension.

The findings support what other studies have found in terms of the positive influence of online environments on students' performances [6], [7], [8], [9], but they cannot be a substitute for them. The characteristics of an online environment can increase students' motivation, create highly interactive learning environments, and provide a variety

Fig. 1 Basic architecture of the environment
By being integrated with the environment in which much of the learners' activity takes place, physical time and effort barriers can also be reduced, made even lower by automated logging of basic documents and events (emails, documents, diary entries, etc.) [14]. Finally, a statistical analysis of the logbooks of a group of learners that have done the same activity would give a synthetic vision of the group's learning, and would be useful to all people involved in the learning.

3.5 Communication and Collaboration Using Mobile Devices

In the environment, learners have to find the same classical environment as they have in real life. In this environment learners can ask any question whenever they need to, and they discuss many subjects together, interesting or not.

The environment also supports group communication by offering discussions, forums and shared workspaces where learners can exchange documents using a podcasting tool. We distinguish between asynchronous and synchronous communication facilities. Social contacts are a crucial point in learning situations. Learners should therefore be able to present themselves on a personal homepage with a photograph, a list of hobbies and other personal aspects. Such personal presentations are not toys; they can help the learners get into contact even more easily than in live classroom situations. There is great potential in using mobile terminals for communication services.

The communication and collaboration system launches a variety of communication options, including text, audio, video, and whiteboard, and provides for 1-to-many, many-to-1, and 1-to-1 communication. It provides a powerful architecture for the development of new educational tools to enhance different modes of teaching and learning. It is ideally suited to mobile learning and able to integrate tools developed explicitly for mobile contexts. The opportunity is to leverage the platform to develop innovative tools that are applicable to (1) synchronous formal learning (e.g., classrooms) and (2) asynchronous informal learning (e.g., discussion in the cafeteria).

There are a number of learning activities in formal educational environments (such as teacher-led classroom scenarios) which are ideally suited to mobile learning tools. Synchronous learning activities such as polling/voting and question and answer (where the system immediately collates all responses and presents an aggregate view of votes or answers to all learners) are ideal for pedagogically rich learning.

Features which are unique to the system and which would enhance the learning include:

- The ability to easily sequence activities into re-usable lesson plans (using a simple visual drag-and-drop lesson planner).
- Recording of learner responses for later review by learners/teachers, and the option for teachers to create question-and-answer activities with either anonymous or identified answers from learners (which provides a basis for more honest answers due to the lack of peer pressure).

Informal learning scenarios (such as student discussion in a cafeteria) provide environments where mobile devices can support flexible, on-the-fly learning opportunities. Valuable learning activities in these contexts could be supported by a content sharing tool, discussion forums, and live chat/instant messaging for questions and responses to other learners or the teacher.

Again, the environment provides unique features to support these activities by providing an environment to manage and deliver these tools in the context of asynchronous (and synchronous) informal learning, including recording of activities for later learner/teacher review and creation of re-usable lesson plans (based around informal student learning using flexible toolsets).

4. Experimentation

In Algeria, we evaluate the reading ability of university students by giving them reading comprehension tests. These tests typically consist of a short text followed by questions. Presumably, the tests are designed so that the reader must understand important aspects of the text to answer the questions correctly. For this reason, we believe that reading comprehension tests can be a valuable tool to assess the state of the art in natural language understanding.

The main hypothesis of the present research study is as follows: the ongoing integration and utilization of the computer within English language reading comprehension will, firstly, enhance the learners' affect, exemplified by high motivation and self-confidence; consequently, when learners are motivated, they learn better and acquire more knowledge. Secondly, it will empower the English teachers' roles and responsibilities as active agents of pedagogical and technological change.

The main objectives of the current work are to investigate, firstly, the validity of computer-assisted comprehension
reading and, secondly, to attract both teachers' and learners' attention to the crucial relevance of the urgent integration of the computer into English language learning at the Algerian university. This study was conducted in an intranet-based English language classroom with fifth-year students preparing the engineering degree of the Computer Science Department in the Faculty of Engineering of the University of Batna, Algeria. Therefore, any obtained conclusions or results will apply to them.

There is a myriad of appropriate methodologies for the study of different learning problems. The selection of one and the avoidance of another is not a simple task at all. The nature and purpose of the investigation and the population involved will help the researcher decide which method to use. In our present research work, which investigates the possibility of adopting and adapting the computer in English language teaching as an instructional means and the way it can positively affect the learners, we found it more convenient to opt for experimental research methods.

Reading comprehension has come to be recognized as an active rather than a passive skill, and its importance is acknowledged in the acquisition of language. With the emergence of multimedia as teaching tools, it is being given renewed attention.

To verify whether comprehension is reached, the learners are invited to answer short instructions written in English, without being required to write the answers in sentence form. The comprehension tasks credited on the marks scale, which appears on the specific grid, are provided for each support and are distributed to the learners [15].

4.1 Material

The text has been used in an exploratory study with similar students, the findings of which showed the texts to be suitable in terms of content and level. The text is general enough to be understood by the students and does not require deep specialist knowledge of the topic discussed.

A set of multiple-choice comprehension questions was prepared for the text. All the questions were conceptual, focusing on the main ideas, the purpose of the writer, and the organization of the text. The multiple-choice format was chosen, despite much of the criticism of this method of testing, after a preliminary study with the open-ended question format yielded too many divergent responses.

4.2 Procedure

Every student participated in two sessions separated by two weeks. In the first session, the "Computer" text was used; in the second session, the "Network" text was used. In each session, the students of one group received the computer condition and the others the paper condition. Thus, every student was exposed to the two contents (Computer and Network), each content in one of the two processing conditions (paper and laptop). In the first session, one group of students used paper sheets as work support (reading and answering) for the Computer text; the other used laptops as work support for the Network text. In the second session, those students who had received the laptop condition in the first session received the paper condition for the Computer text, and those who had received the paper condition in the first session received the laptop condition for the Network text. The information concerning each session is summarized in Table 2.

The test condition involved the following instructions: "Read the following text. You have fifty minutes for this task." The conditions were explained to students who asked for clarification. The set of multiple-choice questions was distributed to the subjects with the text on their desks. After 50 min, all the materials were collected.

The students had a good or very good knowledge of computing and knew little or nothing about the principle of the application (two persons out of five knew a little about its working principle). The deliberate choice of this kind of participant was conclusive because, contrary to beginners, they proved to be cooperative and sought to test the system, which helped us to identify the limits and weaknesses of this first version of the application.

In our project, we proceeded to the experimentation of the understanding of the English language by using our developed system. In other words, we submitted a text in English to read, followed by Multiple Choice Question (MCQ) and True/False exercises, on sheets of paper (classical method) for one group of users and on a laptop for another group of users. The text to read and the exercises were elaborated by a specialist teacher at the department of English language of Batna University. The set of proposed exercises is marked out of 20 points.

Our population is constituted of 20 students of 4th year computing engineering, distributed in two groups:
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 85
In our project, we proceeded to an experiment on the comprehension of English using our developed system. In other words, we submitted a text in English to read, followed by Multiple Choice Question (MCQ) and True/False exercises, on sheets of paper (the classical method) for one group of users and on a laptop for another group. The text to read and the exercises were elaborated by a specialist teacher at the department of English language of Batna University. The set of proposed exercises is marked out of 20 points.

Our population consists of 20 fourth-year computer engineering students distributed in two groups:

10 students participate in the experiment on sheets of paper.
10 students participate in the experiment on a laptop.

The interest of this experiment is to answer the following question: is the use of sheets of paper in written comprehension more efficient than the use of the laptop (hypothesis H0)? To answer this question, the Fisher statistical method is adopted [16].

Among the 10 students of each group, 5 students work individually and two subgroups (formed of 2 and 3 students) work together, i.e. they collaborate to read and understand the text and to solve the proposed exercises. The same organization is used for the laptop condition; in that case, the students find the text and the exercises on a laptop and are marked automatically.

Table 2: The groups of work
Group 1: 5 students working separately on laptop
Group 2: 5 students working separately on sheets of paper
Group 1': 2 subgroups of students (2 or 3) working in collaboration on laptop
Group 2': 2 subgroups of students working in collaboration on sheets of paper

Every student participated in two sessions separated by 2 weeks. In the first session the "Computer" text was used; in the second session the "Network" text was used. In each session, the students of one group received the computer condition and the others the paper condition. Thus, every student was exposed to the two contents (Computer and Network), each content in one of the two processing conditions (paper and laptop). In the first session, one group of students used the paper sheets as work support (reading and answering) for the Computer text; the other used the laptops as work support for the Network text. In the second session, those students who had received the laptop condition in the first session received the paper condition for the Computer text, and those who had received the paper condition in the first session received the laptop condition for the Network text. The information concerning every session is summarized in Table 2.

The test condition involved the following instructions: "Read the following text. You have fifty minutes for this task." The conditions were explained to the students, who asked for clarification. The set of multiple-choice questions was distributed to the subjects with the text on their desks. After 50 min, all the materials were collected.

The students had a good or very good knowledge of computing and knew little or nothing about the application (two persons out of five knew a little about its working principle). This deliberate choice of subjects was conclusive because, contrary to beginners, they proved to be cooperative and sought to test the system, which helped us to identify the limits and weaknesses of this first version of the application.

4.3 Statistic study

Our main objective is to try to answer the following question: is the traditional use of paper sheets as work support in reading comprehension more effective than the use of the laptop for this population (hypothesis H0)?
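The Fisher test used here is in effect a one-way analysis of variance over the four groups of Table 2 (k = 4 groups of 5 marks, N = 20 students, hence 3 and 16 degrees of freedom). A minimal sketch of the computation, with hypothetical marks out of 20 (the paper's raw scores are not reproduced here):

```python
# One-way ANOVA (Fisher test) for 4 groups of 5 marks each:
# df_method = k - 1 = 3, df_residual = N - k = 16, matching F(3,16).
# The marks below are illustrative, not the paper's data.
groups = {
    "laptop_individual":    [12.0, 13.5, 11.0, 14.0, 12.5],
    "paper_individual":     [10.0, 11.5,  9.5, 10.5, 11.0],
    "laptop_collaborative": [15.0, 16.0, 14.5, 15.5, 16.5],
    "paper_collaborative":  [12.0, 13.0, 11.5, 12.5, 13.5],
}

def one_way_anova(groups):
    all_marks = [m for g in groups.values() for m in g]
    grand_mean = sum(all_marks) / len(all_marks)
    # SSmeth: sum of squares explained by the method (between groups)
    ss_meth = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                  for g in groups.values())
    # SSres: residual sum of squares (within groups)
    ss_res = sum((m - sum(g) / len(g)) ** 2
                 for g in groups.values() for m in g)
    df_meth = len(groups) - 1
    df_res = len(all_marks) - len(groups)
    f = (ss_meth / df_meth) / (ss_res / df_res)
    return f, df_meth, df_res

f, df1, df2 = one_way_anova(groups)
print(f"F({df1},{df2}) = {f:.2f}")  # → F(3,16) = 25.67
```

With these illustrative marks, F(3,16) is about 25.7, well above the 3.23 critical point at the 0.05 level, so H0 would be rejected.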
By applying the Fisher method, one calculates the sum of squares due to the method, SSmeth, and the residual sum of squares, SSres, to reach the Fisher factor F. The results are given in Fig. 4, with: degrees of freedom = 3 and the Fisher critical point F3,16(0.05) = 3.23.

From the obtained results we have F > F3,16(0.05); therefore, H0 is rejected, i.e. the use of the laptop is more effective than the use of traditional paper sheets.

Fig. 4 Fisher results

One can note from Fisher's result that the use of the microcomputer helped our learners obtain a higher performance than working on the traditional paper sheet, and that collaborative learning with the help of a laptop provided the best performances.

4.4 Limitations

The present study deals with the poor reading performances of fourth-year learners at the department of Computer Science at Batna University. Any conclusion drawn from the experiment will be limited to the targeted population only.

5. Future Trends

We have started experimenting with the use of the environment in a real teaching/learning situation. This experimentation allows us to collect information on the effective activities of the users. We can thus validate or question certain technical choices and determine more precisely the adaptations that have to be made to the integrated tools. Feedback from a panel was very positive, and the mobile aspect of the environment was seen as a novel and interesting approach as a research tool. A detailed evaluation of the effectiveness of the learning environment has yet to be completed. In prospect, the approach aims at developing other language skills in the learners, so that they can express themselves in the foreign language.

6. Conclusion

We presented in this paper an original approach to the reading comprehension of English as a second foreign language using a web-based application. According to the study of the experimental results, we can conclude that computer-based learning does not stop evolving, and that the learner finds in it a simple method of education.

The obtained results supported our hypothesis that the use of a web-based application can contribute to improving the students' reading comprehension. Henceforth, we recommend the generalization of this new technology in our schools and universities to allow students to take maximum advantage of it.

References
[1] J. Graham, "The reader's helper: a personalized document reading environment", in Proceedings of CHI 99, 1999, pp. 481-488.
[2] B. N. Schilit, G. Golovchinsky and M. N. Price, "Beyond paper: supporting active reading with free form digital ink annotations", in Proceedings of CHI '98, 1998, pp. 249-256.
[3] G. B. Duggan and S. J. Payne, "How much do we understand when skim reading?", in Proceedings of CHI 06, 2006, pp. 730-735.
[4] K. O'Hara and A. Sellen, "A comparison of reading paper and on-line documents", in Proceedings of CHI 97, 1997, pp. 335-342.
[5] D. Cook, "A new kind of reading and writing space: the online course site", The Reading Matrix, Vol. 2, No. 3, September 2002.
[6] V. Fernandez, P. Simoa and J. Sallana, "Podcasting: A new technological tool to facilitate good practice in higher education", Computers & Education, Vol. 53, Issue 2, September 2009, pp. 385-392.
[7] W. Tsou, W. Wang and H. Li, "How computers facilitate English foreign language learners acquire English abstract words", Computers & Education, 2002, pp. 415-428.
[8] Y. L. Chen, "A mixed-method study of EFL teachers' Internet use in language instruction", Teaching and Teacher Education, 2008, pp. 1015-1028.
[9] M. Rahimi and S. Yadollahia, "Foreign language learning attitude as a predictor of attitudes towards computer-assisted language learning", Procedia Computer Science, Vol. 3, World Conference on Information Technology, 2011, pp. 167-174.
[10] Turan, "Student Readiness for Technology Enhanced History Education in Turkish High Schools", Cypriot Journal of Educational Sciences, 5(2), 2010. Retrieved from http://www.worldeducationcenter.org/index.php/cjes/article/view/75.
[11] S. Zidat, S. Tahi and M. Djoudi, "Système de compréhension à distance du français écrit pour un public arabophone" [A distance comprehension system for written French for an Arabic-speaking audience], Colloque Euro Méditerranéen et Africain d'Approfondissement sur la FORmation A Distance, CEMAFORAD 4, 9-11 April, Strasbourg, France, 2008.
[12] S. Zidat and M. Djoudi, "Online evaluation of Ibn Sina elearning environment", Information Technology Journal (ITJ), ISSN 1812-5638, Vol. 5, No. 3, 2006, pp. 409-415.
[13] S. Zidat and M. Djoudi, "Task collaborative resolution tool for elearning environment", Journal of Computer Science, ISSN 1549-3636, Vol. 2, No. 7, pp. 558-564.
[14] O. Kiddie, T. Marianczak, N. Sandle, L. Bridgefoot, C. Mistry, D. Williams, D. Corlett, M. Sharples and S. Bull, "Logbook: The Development of an Application to Enhance and Facilitate Collaborative Working within Groups in Higher Education", in Proceedings of MLEARN 2004: Learning Anytime, Everywhere, Rome, 5-6 July 2004.
[15] J.-F. Rouet, A. Goumi, A. Maniez and A. Raud, "Liralec: A Web-based resource for the assessment and training of reading-comprehension skills", in C. P. Constantinou, D. Demetriou, A. Evagorou, M. Evagourou, A. Kofteros, M. Michael, Chr. Nicolaou, D. Papademetriou and N. Papadouris (Eds.), Multiple Perspectives on Effective Learning Environments, 2005, p. 113.

Dr. Mahieddine Djoudi is a Professor at the University of Poitiers, France. He is a member of the SIC (Signal, Images and Communications) research laboratory and of the IRMA e-learning research group. His PhD thesis research was in continuous speech recognition. His current research interests are in e-learning, mobile learning, computer-supported cooperative work and information literacy. His teaching interests include programming, databases, artificial intelligence and information & communication technology. He started and is involved in many research projects which include many researchers from different Algerian universities.

Dr. Mahieddine Djoudi
XLIM-SIC Lab. & IRMA Research Group, University of Poitiers
Bât. SP2MI, Boulevard Marie et Pierre Curie, BP 30179, 86962 Futuroscope Cedex, France
Phone: +33 5 49 45 39 89  Fax: +33 5 49 45 38 16
URL: http://mahieddine.djoudi.online.fr
conceive an augmented system in which the sensor fault affecting the initial system appears as an actuator fault. The actuator fault is considered as an unknown input. Once the fault is estimated, the FTC controller is implemented as a state feedback controller. In this work the observer design and the control implementation can be made simultaneously.

The paper is organized as follows. Section 2 recalls an elementary background about Takagi-Sugeno fuzzy models (also named multiple models). In section 3 the proposed method of fault tolerant control design is presented. The application of the proposed control to the three tanks system is the subject of section 4.

2. On the Takagi-Sugeno fuzzy systems

Takagi-Sugeno fuzzy models are non linear systems described by a set of if-then rules which give local linear representations of an underlying system [1], [12], [14] and [39]. Such models can approximate a wide class of non linear systems [39]; they can even describe exactly some non linear systems [38] and [39]. Each non linear dynamic system can be simply described by a Takagi-Sugeno fuzzy model [35] and [34]. A Takagi-Sugeno fuzzy model is the fuzzy fusion of many linear models [1-3], [12] and [30], each of which represents the local system behavior around an operating point. A Takagi-Sugeno model is described by fuzzy IF-THEN rules which represent local linear input/output relations of the non linear system [38]. It has a rule base of M rules, each having p antecedents, where the i-th rule is expressed as:

  Rule i: IF \xi_1(t) is F_1^i and ... and \xi_p(t) is F_p^i
  THEN \dot{x}(t) = A_i x(t) + B_i u(t),  y(t) = C_i x(t)    (1)

where the F_j^i are fuzzy sets, so that the global model is the weighted combination of the local models:

  \dot{x}(t) = \sum_{i=1}^{M} \mu_i(\xi(t)) (A_i x(t) + B_i u(t))
  y(t) = \sum_{i=1}^{M} \mu_i(\xi(t)) C_i x(t)    (2)

The weighting functions \mu_i(\xi(t)) are non linear and depend on the decision variable \xi(t). They are normalized and defined as:

  \mu_i(\xi(t)) = T_{j=1}^{p} \omega_j^i(\xi(t)) / \sum_{k=1}^{M} T_{j=1}^{p} \omega_j^k(\xi(t))    (3)

where \omega_j^i(\xi(t)) is the grade of membership of the premise variable \xi(t) and T denotes a t-norm. The weighting functions satisfy the convex sum property expressed by the following equations:

  0 <= \mu_i(\xi(t)) <= 1  and  \sum_{i=1}^{M} \mu_i(\xi(t)) = 1    (4)

If, in the equation which defines the output, we impose that C_1 = C_2 = ... = C_M = C, the output of the model (2) reduces to y(t) = C x(t) and the Takagi-Sugeno fuzzy model becomes:

  \dot{x}(t) = \sum_{i=1}^{M} \mu_i(\xi(t)) (A_i x(t) + B_i u(t))
  y(t) = C x(t)    (5)

This model, also known as the Takagi-Sugeno multiple model, was initially proposed in a fuzzy modeling framework by Takagi and Sugeno [34], and in a multiple model modeling framework in [13] and [29]. It has been largely considered for analysis [29], [34] and [9], modeling [13] and [41], control [21] and [9] and state estimation [1-3], [12], [22], [23] and [30] of non linear systems.
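As a concrete illustration of Eqs. (3)-(5), the following sketch blends two hypothetical scalar local models through normalized Gaussian membership grades; the models, the premise centres and the single-premise simplification are illustrative assumptions, not values from the paper:

```python
import math

# Takagi-Sugeno blend (Eqs. (3)-(5)) with M = 2 hypothetical scalar
# local models; all numeric values are illustrative.
A = [-1.0, -2.0]      # local "state matrices" (scalars here)
B = [1.0, 0.5]        # local "input matrices"
centres = [0.0, 1.0]  # premise centres of the Gaussian membership functions

def weights(xi):
    # omega_i: Gaussian grade of membership of the premise variable xi
    om = [math.exp(-(xi - c) ** 2) for c in centres]
    # Eq. (3): normalisation, which enforces the convex-sum property (4)
    s = sum(om)
    return [o / s for o in om]

def xdot(x, u, xi):
    # Eq. (5): xdot = sum_i mu_i(xi) * (A_i x + B_i u)
    return sum(m * (a * x + b * u) for m, a, b in zip(weights(xi), A, B))

mu = weights(0.5)  # halfway between the centres: mu = [0.5, 0.5]
assert abs(sum(mu) - 1.0) < 1e-12 and all(0.0 <= m <= 1.0 for m in mu)
```

At xi = 0.5 the two grades are equal, so the blended dynamics is the plain average of the two local models.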
3. Fault tolerant control design

The system affected by actuator and sensor faults is described by:

  \dot{x}_f(t) = \sum_{i=1}^{M} \mu_i(u_f(t)) (A_i x_f(t) + B u_f(t) + E f_a(t))
  y_f(t) = C x_f(t) + F f_s(t) + D w(t)    (7)

where x_f(t) in R^n is the state vector, u_f(t) in R^r is the fault tolerant control which will be conceived, and y_f(t) in R^m is the output vector. f_a(t) and f_s(t) are respectively the actuator and sensor faults, which are assumed to be bounded, and w(t) represents the measurement noise. E, F and D are respectively the fault and noise distribution matrices, which are assumed to be known.

Let us define the following states [15]:

  \dot{z}(t) = \sum_{i=1}^{M} \mu_i(u(t)) (-\bar{A} z(t) + \bar{A} C x(t))
  \dot{z}_f(t) = \sum_{i=1}^{M} \mu_i(u_f(t)) (-\bar{A} z_f(t) + \bar{A} C x_f(t) + \bar{A} F f_s(t) + \bar{A} D w(t))    (8)

where \bar{A} is a stable matrix with appropriate dimension. Defining the two augmented states X(t) = [x(t)^T z(t)^T]^T and X_f(t) = [x_f(t)^T z_f(t)^T]^T, these two augmented state vectors can be written:

  \dot{X}(t) = \sum_{i=1}^{M} \mu_i(u(t)) (A_{ai} X(t) + B_a u(t))
  Y(t) = C_a X(t)    (9)

and:

  \dot{X}_f(t) = \sum_{i=1}^{M} \mu_i(u_f(t)) (A_{ai} X_f(t) + B_a u_f(t) + E_a f(t) + D_a w(t))
  Y_f(t) = C_a X_f(t)    (10)

where A_{ai}, B_a, C_a, E_a and D_a are the corresponding augmented matrices and f(t) gathers the faults.

The proportional integral observer used to estimate the state and the faults is:

  \dot{\hat{X}}_f(t) = \sum_{i=1}^{M} \mu_i(u_f(t)) (A_{ai} \hat{X}_f(t) + B_a u_f(t) + E_a \hat{f}(t) + K_i \tilde{Y}_f(t))
  \dot{\hat{f}}(t) = \sum_{i=1}^{M} \mu_i(u_f(t)) L_i \tilde{Y}_f(t)
  \hat{Y}_f(t) = C_a \hat{X}_f(t)    (12)

where \hat{X}_f(t) is the estimated system state, \hat{f}(t) represents the estimated fault, \hat{Y}_f(t) is the estimated output, K_i are the proportional gains of the local observers, L_i are their integral gains to be computed, and \tilde{Y}_f(t) = Y_f(t) - \hat{Y}_f(t).

The fault tolerant control u_f(t) is conceived on the basis of the strategy described by the following expression [38]:

  u_f(t) = -S \hat{f}(t) + G (X(t) - \hat{X}_f(t)) + u(t)    (13)

where S and G are two constant matrices with appropriate dimensions.

Let us define \tilde{X}(t) the error between the states X(t) and X_f(t), \tilde{X}_f(t) the estimation error of the state X_f(t), and \tilde{f}(t) the fault estimation error:

  \tilde{X}(t) = X(t) - X_f(t)
  \tilde{X}_f(t) = X_f(t) - \hat{X}_f(t)    (14)
  \tilde{f}(t) = f(t) - \hat{f}(t)

Choosing the matrix S verifying E_a = B_a S, the dynamics of \tilde{X}(t) is given by:

  \dot{\tilde{X}}(t) = \dot{X}(t) - \dot{X}_f(t)
                     = \sum_{i=1}^{M} \mu_i(u(t)) ((A_{ai} - B_a G) \tilde{X}(t) - E_a \tilde{f}(t) - B_a G \tilde{X}_f(t)) + \phi_1(t)    (15)

with:

  \phi_1(t) = \sum_{i=1}^{M} (\mu_i(u_f(t)) - \mu_i(u(t))) A_{ai} X_f(t) - D_a w(t)    (16)

The dynamics of \tilde{X}_f(t) can be written:
  \dot{\tilde{X}}_f(t) = \dot{X}_f(t) - \dot{\hat{X}}_f(t)
                       = \sum_{i=1}^{M} \mu_i(u(t)) ((A_{ai} - K_i C_a) \tilde{X}_f(t) + E_a \tilde{f}(t)) + \phi_2(t)    (17)

with:

  \phi_2(t) = \sum_{i=1}^{M} (\mu_i(u_f(t)) - \mu_i(u(t))) (A_{ai} - K_i C_a) \tilde{X}_f(t) + D_a w(t)    (18)

The dynamics of the fault estimation error is:

  \dot{\tilde{f}}(t) = \dot{f}(t) - \dot{\hat{f}}(t)
                     = -\sum_{i=1}^{M} \mu_i(u(t)) L_i C_a \tilde{X}_f(t) + \phi_3(t)    (19)

with:

  \phi_3(t) = -\sum_{i=1}^{M} (\mu_i(u_f(t)) - \mu_i(u(t))) L_i C_a \tilde{X}_f(t) - D_a w(t) + \dot{f}(t)    (20)

The equations (15), (17) and (19) can be rewritten:

  \dot{\varepsilon}(t) = A_m \varepsilon(t) + \phi(t)    (21)

where:

  \varepsilon(t) = [\tilde{X}(t)^T \tilde{X}_f(t)^T \tilde{f}(t)^T]^T,
  \phi(t) = [\phi_1(t)^T \phi_2(t)^T \phi_3(t)^T]^T  and  A_m = \sum_{i=1}^{M} \mu_i(u(t)) A_{mi}    (22)

where:

  A_{mi} = [ [A_{ai} - B_a G,  -B_a G,           -E_a],
             [0,                A_{ai} - K_i C_a,  E_a],
             [0,               -L_i C_a,           0 ] ]    (23)

Considering the Lyapunov function V(t) = \varepsilon(t)^T P \varepsilon(t), the generalized error vector \varepsilon(t) converges to zero if \dot{V}(t) < 0, and \dot{V}(t) < 0 if A_{mi}^T P + P A_{mi} < 0, i = 1...M.

The problem of robust state and fault estimation and of fault tolerant control design is thus reduced to finding the gains K_i and L_i of the observer and the matrix G ensuring an asymptotic convergence of the generalized error vector \varepsilon(t) toward zero if \phi(t) = 0, and a bounded error in the case where \phi(t) != 0, i.e.:

  lim_{t->inf} \varepsilon(t) = 0                                      for \phi(t) = 0
  ||\varepsilon(t)||_{Q_\varepsilon} <= \lambda ||\phi(t)||_{\bar{Q}_\phi}  for \phi(t) != 0    (24)

where \lambda > 0 is the attenuation level. To satisfy the constraints (24), it is sufficient to find a Lyapunov function V(t) such that:

  \dot{V}(t) + \varepsilon(t)^T Q_\varepsilon \varepsilon(t) - \lambda^2 \phi(t)^T \bar{Q}_\phi \phi(t) < 0    (25)

where Q_\varepsilon and \bar{Q}_\phi are two positive definite matrices. The inequality (25) can be written:

  [\varepsilon(t)^T \phi(t)^T] \Omega [\varepsilon(t)^T \phi(t)^T]^T < 0    (26)

where:

  \Omega = [ [A_m^T P + P A_m + Q_\varepsilon,  P              ],
             [P,                               -\lambda^2 \bar{Q}_\phi] ]    (27)

Choosing Q_\varepsilon = \bar{Q}_\phi = I and assuming that the Lyapunov matrix P has the form P = diag(I, P_2, P_3), the matrix \Omega is written:

  \Omega = \sum_{i=1}^{M} \mu_i(u(t)) \Omega_i    (28)

where:

  \Omega_i = [ [\Psi_{11i},    -B_a G,     -E_a,       I,              0,              0             ],
               [-G^T B_a^T,     \Psi_{22i}, \Psi_{23i}, 0,              P_2,            0             ],
               [-E_a^T,         \Psi_{32i}, I_3,        0,              0,              P_3           ],
               [I,              0,          0,         -\lambda_1^2 I,  0,              0             ],
               [0,              P_2,        0,          0,             -\lambda_2^2 I,  0             ],
               [0,              0,          P_3,        0,              0,             -\lambda_3^2 I ] ]    (29)

with:

  \Psi_{11i} = (A_{ai} - B_a G)^T + (A_{ai} - B_a G) + I_1
  \Psi_{22i} = P_2 (A_{ai} - K_i C_a) + (A_{ai} - K_i C_a)^T P_2 + I_2    (30)
  \Psi_{23i} = P_2 E_a - C_a^T L_i^T P_3
  \Psi_{32i} = \Psi_{23i}^T

If \Omega_i < 0, i = 1...M, the inequalities \Omega_i < 0 are bilinear; they can be linearised using the changes of variables U_{2i} = P_2 K_i and U_{3i} = P_3 L_i. The observer gains are then computed using the equations:

  K_i = P_2^{-1} U_{2i}
  L_i = P_3^{-1} U_{3i}    (31)

Summarizing, the following theorem can be proposed:

Theorem: The system (21) describing the evolution of the errors \tilde{X}(t), \tilde{X}_f(t) and \tilde{f}(t) is stable if there exist symmetric positive definite matrices P_2 and P_3 and matrices U_{3i}, U_{2i} and G, i = 1...M, so that the LMIs \Omega_i < 0 are verified for i = 1...M, where:

  \Omega_i = [ [\Psi_{11i},    -B_a G,     -E_a,       I,              0,              0             ],
               [-G^T B_a^T,     \Psi_{22i}, \Psi_{23i}, 0,              P_2,            0             ],
               [-E_a^T,         \Psi_{32i}, I_3,        0,              0,              P_3           ],
               [I,              0,          0,         -\lambda_1^2 I,  0,              0             ],
               [0,              P_2,        0,          0,             -\lambda_2^2 I,  0             ],
               [0,              0,          P_3,        0,              0,             -\lambda_3^2 I ] ]    (32)
and:

  \Psi_{11i} = (A_{ai} - B_a G)^T + (A_{ai} - B_a G) + I_1
  \Psi_{22i} = P_2 A_{ai} - U_{2i} C_a + A_{ai}^T P_2 - C_a^T U_{2i}^T + I_2    (33)
  \Psi_{23i} = P_2 E_a - C_a^T U_{3i}^T
  \Psi_{32i} = \Psi_{23i}^T

The observer gains are obtained by:

  L_i = P_3^{-1} U_{3i}  and  K_i = P_2^{-1} U_{2i}

4. Application to the three tanks system

The main objective of this part is to show the robustness of the proposed method by its application to a hydraulic process made up of three tanks [3] and [34].

Fig. 1 Three tanks system

The considered system is affected simultaneously by sensor and actuator faults. The three tanks T_1, T_2 and T_3, with identical sections S, are connected to each other by cylindrical pipes of identical section S_n. The output valve is located at the output of tank T_2; it serves to empty the tank filled by pumps 1 and 2, with flow rates Q_1 and Q_2 respectively. Combinations of the three water levels are measured. The pipes of communication between the tanks are equipped with manually adjustable ball valves, which allow the corresponding pump to be closed or open. The three levels x_1, x_2 and x_3 are governed by the constraint x_1 > x_3 > x_2; the process model is given by equation (34). Indeed, taking into account the fundamental laws of conservation of the fluid, one can describe the operating mode of each tank; one then obtains a non linear model expressed by the following state equations [3] and [41]:

  S dx_1/dt = -\alpha_1 S_n (2g(x_1(t) - x_3(t)))^{1/2} + Q_1(t) + Q_{f1} f_a(t)
  S dx_2/dt = \alpha_3 S_n (2g(x_3(t) - x_2(t)))^{1/2} - \alpha_2 S_n (2g x_2(t))^{1/2} + Q_2(t) + Q_{f2} f_a(t)    (34)
  S dx_3/dt = \alpha_1 S_n (2g(x_1(t) - x_3(t)))^{1/2} - \alpha_3 S_n (2g(x_3(t) - x_2(t)))^{1/2} + Q_{f3} f_a(t)

where \alpha_1, \alpha_2 and \alpha_3 are constants, f_a(t) is the actuator fault regarded as an unknown input, Q_{fi}, i = 1...3, denote the additional mass flows into the tanks caused by leaks, and g is the gravity constant. The multiple model, with \xi(t) = u(t), which approximates the non linear system (34), is:

  \dot{x}(t) = \sum_{i=1}^{M} \mu_i(\xi(t)) (A_i x(t) + B u(t) + E f_a(t) + d_i)
  y(t) = C x(t) + F f_s(t) + D w(t)    (35)

The matrices A_i, B_i and d_i are calculated by linearizing the initial system (34) around four points chosen in the operating range of the system. Four local models have been selected in a heuristic way; that number guarantees a good approximation of the state of the real system by the multiple model [3] and [41]. The following numerical values were obtained:

  A_1 = [ [-0.0109, 0, 0.0109], [0, -0.0206, 0.0106], [0.0109, 0.0106, -0.0215] ],  d_1 = 10^{-3} [2.86, 0.38, 0.11]^T
  A_2 = [ [-0.0110, 0, 0.0110], [0, -0.0205, 0.0104], [0.0110, 0.0104, -0.0215] ],  d_2 = 10^{-3} [2.86, 0.34, 0.038]^T
  A_3 = [ [-0.0084, 0, 0.0084], [0, -0.0206, 0.0095], [0.0084, 0.0095, -0.0180] ],  d_3 = 10^{-3} [3.7, 0.14, 0.69]^T
  A_4 = [ [-0.0085, 0, 0.0085], [0, -0.0205, 0.0095], [0.0085, 0.0095, -0.0180] ],  d_4 = 10^{-3} [3.67, 0.18, 0.62]^T
  B_i = (1/S) [ [1, 0], [0, 1], [0, 0] ],  C = [ [1, 1, 1], [1, 1, 0], [0, 1, 0] ]

In the following, the functions Q_{f1}, Q_{f2} and Q_{f3} are constant; the numerical application is performed with:
Q_{fi} = 10^{-4}, i = 1...3, S = 0.0154, g = 9.8, \alpha_1 = 0.78, \alpha_2 = 0.78, \alpha_3 = 0.75 and S_n = 5*10^{-5}.

The two actuator fault signals f_a(t) = [f_a1(t) f_a2(t)]^T are defined as:

  f_a1(t) = sin(0.4t) for 15s <= t <= 75s, and 0 elsewhere

  f_a2(t) = 0.3 for 20s <= t <= 70s, 0.5 for t > 70s, and 0 elsewhere

It is supposed that a sensor fault f_s(t) is affecting the system. This fault is defined as f_s(t) = [f_s1(t) f_s2(t)]^T with:

  f_s1(t) = 0 for t < 35s, and 0.6 for t >= 35s

  f_s2(t) = 0 for t < 25s, and sin(0.6t) for t >= 25s

The resolution of the LMIs of the theorem provides the numerical values of the observer gains K_i and L_i, i = 1...4, and of the state feedback matrix G.

The chosen weighting functions depend on the system input u(t). They have been created on the basis of Gaussian membership functions. Figure (2) shows their time evolution, which makes clear that the system is nonlinear, since the \mu_i, i = 1,...,4, are not constant functions.

The obtained results are shown in figures (3) to (7).
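The nonlinear model (34) behind these results can be checked directly in simulation. A minimal Euler sketch using the parameter values above, with a healthy plant (f_a = 0); the pump flows and initial levels are illustrative assumptions:

```python
import math

# Euler simulation sketch of the three-tank model of Eq. (34) with the
# paper's parameters; pump flows Q1, Q2 and initial levels are illustrative.
S, Sn, g = 0.0154, 5e-5, 9.8
a1, a2, a3 = 0.78, 0.78, 0.75
Qf = 1e-4

def flow(a, dh):
    # signed Torricelli flow through a connecting pipe: a*Sn*(2g|dh|)^(1/2)
    return a * Sn * math.copysign(math.sqrt(2 * g * abs(dh)), dh)

x1, x2, x3 = 0.4, 0.1, 0.2   # initial levels satisfying x1 > x3 > x2
dt = 0.1
for _ in range(10000):        # simulate 1000 s
    Q1, Q2, fa = 1e-4, 0.5e-4, 0.0   # healthy case: fa = 0
    dx1 = (-flow(a1, x1 - x3) + Q1 + Qf * fa) / S
    dx2 = (flow(a3, x3 - x2) - flow(a2, x2) + Q2 + Qf * fa) / S
    dx3 = (flow(a1, x1 - x3) - flow(a3, x3 - x2) + Qf * fa) / S
    x1, x2, x3 = x1 + dt * dx1, x2 + dt * dx2, x3 + dt * dx3

assert x1 > x3 > x2 > 0   # the level-ordering constraint is preserved
```

The simulated levels settle toward a steady state in which each inter-tank flow balances the pump inflows, with the ordering x_1 > x_3 > x_2 maintained throughout.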
Fig. 4 Sensor faults and their estimation

Fig. 7 Fault tolerant control u_f
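To make the mechanism of the proportional-integral observer (12) concrete, here is a minimal discretised sketch for a single local model (M = 1) with scalar state; the plant, the gains and the fault value are illustrative assumptions, not the three-tank numbers:

```python
# Euler-discretised proportional-integral (PI) observer, Eq. (12), for
# ONE scalar local model; all numbers are illustrative.
dt = 0.001
A, B, E, C = -1.0, 1.0, 1.0, 1.0   # plant: x' = A x + B u + E fa, y = C x
K, L = 8.0, 50.0                   # proportional and integral observer gains

x, xh, fh = 0.0, 0.5, 0.0          # true state, estimated state, estimated fault
fa = 0.3                           # constant actuator fault to be estimated

for _ in range(20000):             # simulate 20 s
    u = 1.0
    y, yh = C * x, C * xh
    # plant
    x += dt * (A * x + B * u + E * fa)
    # PI observer: state estimate corrected by K*(y - yh),
    # fault estimate integrated from L*(y - yh)
    xh += dt * (A * xh + B * u + E * fh + K * (y - yh))
    fh += dt * (L * (y - yh))

print(round(fh, 2))  # → 0.3, the true fault value
```

The output error drives both the state correction (proportional path) and the fault estimate (integral path); for a constant fault the estimate converges with zero steady-state offset, which is exactly what makes the compensation term -S f̂(t) in the control law (13) possible.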
is designed such that it can stabilize the faulty plant using Lyapunov theory and LMIs.

References
[1] A. Akhenak, M. Chadli, J. Ragot and D. Maquin, "Design of observers for Takagi-Sugeno fuzzy models for fault detection and isolation", 7th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes, SAFEPROCESS'09, Barcelona, Spain, June 30 - July 3, 2009.
[2] A. Akhenak, M. Chadli, J. Ragot and D. Maquin, "Design of sliding mode unknown input observer for uncertain Takagi-Sugeno model", 15th Mediterranean Conference on Control and Automation, MED'07, Athens, Greece, June 27-29, 2007.
[3] A. Akhenak, M. Chadli, D. Maquin and J. Ragot, "State estimation via multiple observers. The three tank system", 5th IFAC Symposium on Fault Detection, Supervision and Safety for Technical Processes, Safeprocess'03, Washington, D.C., USA, June 9-11, 2003.
[4] S. Beale and B. Shafai, "Robust control system design with a proportional integral observer", International Journal of Control, Vol. 50, No. 1, 1989, pp. 97-111.
[5] M. Blanke, M. Kinnaert, J. Lunze and M. Staroswiecki, Diagnosis and Fault-Tolerant Control, Springer-Verlag, Berlin Heidelberg, ISBN 3-540-01056-4, 2003.
[6] J. Chen and R. J. Patton, "Fault-tolerant control systems design using the linear matrix inequality approach", 6th European Control Conference, Porto, Portugal, September 4-7, 2001.
[7] C. Edwards, "A comparison of sliding mode and unknown input observers for fault reconstruction", IEEE Conference on Decision and Control, Vol. 5, 2004, pp. 5279-5284.
[8] C. Edwards and S. K. Spurgeon, "On the development of discontinuous observers", International Journal of Control, Vol. 59, No. 5, 1994, pp. 1211-1229.
[9] D. Filev, "Fuzzy modeling of complex systems", International Journal of Approximate Reasoning, Vol. 5, No. 3, 1991, pp. 281-290.
[10] Y. Guan and M. Saif, "A novel approach to the design of unknown input observers", IEEE Transactions on Automatic Control, Vol. 36, No. 5, 1991, pp. 632-635.
[11] D. Ichalal, B. Marx, J. Ragot and D. Maquin, "New fault tolerant control strategy for nonlinear systems with multiple model approach", Conference on Control and Fault-Tolerant Systems, SysTol'10, October 6-10, 2010.
[12] D. Ichalal, B. Marx, J. Ragot and D. Maquin, "Simultaneous state and unknown inputs estimation with PI and PMI observers for Takagi-Sugeno model with unmeasurable premise variables", 17th Mediterranean Conference on Control and Automation, MED'09, Thessaloniki, Greece, June 24-26, 2009.
[13] T. A. Johansen and A. B. Foss, "Non linear local model representation for adaptive systems", Singapore International Conference on Intelligent Control and Instrumentation, Singapore, February 17-21, 1992.
[14] S. Kawamoto, K. Tada, N. Onoe, A. Ishigame and T. Taniguchi, "Construction of exact fuzzy system for non linear system and its stability analysis", 8th Fuzzy System Symposium, Hiroshima, Japan, 1992, pp. 517-520.
[15] A. Khedher, K. Ben Othman, D. Maquin and M. Benrejeb, "Sensor fault estimation for nonlinear systems described by Takagi-Sugeno models", International Journal Transactions on Systems, Signals & Devices, Issues on Systems, Analysis & Automatic Control, Vol. 6, No. 1, 2011, pp. 1-18.
[16] A. Khedher, K. Ben Othman, D. Maquin and M. Benrejeb, "Design of an adaptive faults tolerant control: case of sensor faults", WSEAS Transactions on Systems, Vol. 9, No. 7, 2010, pp. 794-803.
[17] A. Khedher, K. Ben Othman, M. Benrejeb and D. Maquin, "Adaptive observer for fault estimation in nonlinear systems described by a Takagi-Sugeno model", 18th Mediterranean Conference on Control and Automation, MED'10, Marrakech, Morocco, June 24-26, 2010.
[18] A. Khedher, K. Ben Othman, D. Maquin and M. Benrejeb, "An approach of faults estimation in Takagi-Sugeno fuzzy systems", 8th ACS/IEEE International Conference on Computer Systems and Applications, Hammamet, Tunisia, May 16-19, 2010.
[19] A. Khedher, K. Ben Othman, D. Maquin and M. Benrejeb, "Fault tolerant control for nonlinear systems described by Takagi-Sugeno models", 8th International Conference of Modeling and Simulation, MOSIM'10, Hammamet, Tunisia, May 10-12, 2010.
[20] A. Khedher, K. Ben Othman, D. Maquin and M. Benrejeb, "Active sensor faults tolerant control for Takagi-Sugeno multiple models", 6th WSEAS International Conference on Dynamical Systems & Control (CONTROL'10), Sousse, Tunisia, May 3-6, 2010.
[21] A. Khedher and K. Ben Othman, "Proportional integral observer design for state and faults estimation: application to the three tanks system", International Review of Automatic Control, Vol. 3, No. 2, 2010, pp. 115-124.
[22] A. Khedher, K. Ben Othman, D. Maquin and M. Benrejeb, "State and sensor faults estimation via a proportional integral observer", 6th International Multi-Conference on Systems, Signals & Devices, SSD'09, Djerba, Tunisia, March 23-26, 2009.
[23] A. Khedher, K. Ben Othman, M. Benrejeb and D. Maquin, "State and unknown input estimation via a proportional integral observer with unknown inputs", 9th
key and signature, but the verification of the public key is key. At the same time, signature data is assigned to
accomplished within the signature verification Procedure.
a hidden html element in bill/contract web page
As compared with seal-stamping control designed based whose value is corresponding to a property of page bean.
on the certificate-based public key [5][6], this control is
more efficient for generating and verifying signatures in terms of
computational efforts and communication costs.

After a user or agent staff member logs in to CNBAB, a web page appears
containing menu items according to their individual rights. A session bean
stores user information, including key information, and a database table
records each username and its corresponding public key.

When a user or agent staff member views a bill or contract by clicking a
menu item on the web page, the corresponding functional page is opened.
According to the status of the bill/contract and the privileges of the
user/agent staff, the page's backend business logic decides whether a
seal-stamping button appears on the page. If the web page already contains
seal stamps, the control verifies the validity of every signature and shows
a valid seal image for each signature that passes verification, and an
invalid seal image otherwise.

2.2 Web Page Seal-Stamping

There is a processing step on the server side before the corresponding
functional page is opened: the entire bill/contract page's HTML data is
converted into XML data and stored as a property of the page bean.

If the bill/contract needs to be seal-stamped by the user/agent staff, a
seal-stamping control and a seal-stamping button are inserted in the right
position of the page. When the user/agent staff triggers this button, the
control executes the following steps on the client:

(1) Examines whether there is a valid USB key on the computer's USB
interface. If yes, it requires the user/agent staff to input the USB key
PIN; if not, it prompts the user to insert the USB key.

(2) Examines whether this USB key is owned by the logged-in person,
according to the public key information.

(3) Reads the seal image from the USB key, signs the organized XML data
(mentioned at the beginning of this section) using the private key in the
USB key, and inserts the seal image into the web page, where it floats
above the page automatically. The signature data include the signature
value of the organized XML data, the seal image and the public key. The
maximum size of the signature data is 15 KB, commonly 4 KB.

Thus, seal-stamping is finished. After the bill/contract is saved, the
organized XML data and signature data are saved into the database and the
status of the bill/contract is updated. When agent staff need to audit a
bill, a seal-stamping button appears in the right position of the page only
if all seal images are valid.

2.3 Verify

When a user/agent staff member views a seal-stamped web page, all
seal-stamping controls execute the verify operation. There is a processing
step on the server side before the corresponding functional page is opened:
the data before and after each signature must be retrieved from the
database to verify the validity of the signature, and are stored as two
properties of the page bean. According to the CNBAB SRS [7], there are at
most three seal stamps in a web page, commonly two.

If a signature passes verification, the control in the web page shows the
valid seal image; otherwise it shows the invalid seal image.

2.4 Digital Signature

A digital signature or digital signature scheme is a mathematical scheme
for demonstrating the authenticity of a digital message or document. A
valid digital signature gives a recipient reason to believe that the
message was created by a known sender and that it was not altered in
transit. Digital signatures are commonly used for software distribution,
financial transactions, and other cases where it is important to detect
forgery or tampering.

Digital signatures are often used to implement electronic signatures, a
broader term that refers to any electronic data that carries the intent of
a signature; not all electronic signatures use digital signatures. In some
countries, including the United States, India, and members of the European
Union, electronic signatures have legal significance. However, laws
concerning electronic signatures do not always make clear whether they are
digital cryptographic signatures in the sense used here, leaving the legal
definition, and so their importance, somewhat confused.
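The client-side steps (1)-(3) of the seal-stamping control can be sketched as follows. This is a hypothetical illustration only: the "USB key" is simulated by an in-memory dict, the usernames and helper names are invented, and the textbook RSA numbers are tiny and insecure; the real control is an ActiveX component talking to a hardware token.

```python
import hashlib

# Textbook RSA key standing in for the key pair on the USB token.
P, Q = 61, 53
N = P * Q                              # modulus, 3233
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent

# Simulated USB key and the server-side table of registered public keys.
usb_key = {"pin": "1234", "public_key": (N, E), "private_key": D,
           "seal_image": b"<seal.png bytes>"}
registered_public_keys = {"agent01": (N, E)}   # username -> public key


def digest(data: bytes) -> int:
    """Hash the organized XML data down to an integer below N."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def seal_stamp(username: str, pin: str, xml_data: bytes):
    # Step (1): check that a USB key is present and the PIN is correct.
    if usb_key is None or pin != usb_key["pin"]:
        raise ValueError("insert USB key / wrong PIN")
    # Step (2): check the key belongs to the logged-in person.
    if registered_public_keys.get(username) != usb_key["public_key"]:
        raise ValueError("USB key not owned by the logged-in person")
    # Step (3): sign the organized XML data with the private key and
    # return the seal image together with the signature value.
    signature = pow(digest(xml_data), usb_key["private_key"], N)
    return usb_key["seal_image"], signature


def verify(username: str, xml_data: bytes, signature: int) -> bool:
    """Server-side check used by the verify operation of Section 2.3."""
    n, e = registered_public_keys[username]
    return pow(signature, e, n) == digest(xml_data)
```

Any change to the signature value (or to the XML data) makes `verify` fail, which is exactly how the control decides between the valid and invalid seal image.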
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 108
Digital signatures employ a type of asymmetric cryptography. For messages
sent through a nonsecure channel, a properly implemented digital signature
gives the receiver reason to believe the message was sent by the claimed
sender. Digital signatures are equivalent to traditional handwritten
signatures in many respects; properly implemented digital signatures are
more difficult to forge than the handwritten type. Digital signature
schemes in the sense used here are cryptographically based, and must be
implemented properly to be effective. Digital signatures can also provide
non-repudiation, meaning that the signer cannot successfully claim they did
not sign a message while also claiming their private key remains secret;
further, some non-repudiation schemes offer a time stamp for the digital
signature, so that even if the private key is exposed, the signature
remains valid. Digitally signed messages may be anything representable as a
bit string: examples include electronic mail, contracts, or a message sent
via some other cryptographic protocol.

As organizations move away from paper documents with ink signatures or
authenticity stamps, digital signatures can provide added assurances of the
provenance, identity, and status of an electronic document, as well as
acknowledging informed consent and approval by a signatory. The United
States Government Printing Office (GPO) publishes electronic versions of
the budget, public and private laws, and congressional bills with digital
signatures. Universities including Penn State, University of Chicago, and
Stanford are publishing electronic student transcripts with digital
signatures.

Below are some common reasons for applying a digital signature to
communications:

Authentication: Although messages may often include information about the
entity sending a message, that information may not be accurate. Digital
signatures can be used to authenticate the source of messages. When
ownership of a digital signature secret key is bound to a specific user, a
valid signature shows that the message was sent by that user. The
importance of high confidence in sender authenticity is especially obvious
in a financial context. For example, suppose a bank's branch office sends
instructions to the central office requesting a change in the balance of an
account. If the central office is not convinced that such a message is
truly sent from an authorized source, acting on such a request could be a
grave mistake.

Integrity: In many scenarios, the sender and receiver of a message may need
confidence that the message has not been altered during transmission.
Although encryption hides the contents of a message, it may be possible to
change an encrypted message without understanding it. (Some encryption
algorithms, known as nonmalleable ones, prevent this, but others do not.)
However, if a message is digitally signed, any change in the message after
signing will invalidate the signature. Furthermore, there is no efficient
way to modify a message and its signature to produce a new message with a
valid signature, because this is still considered computationally
infeasible for most cryptographic hash functions.

Non-repudiation: Non-repudiation, or more specifically non-repudiation of
origin, is an important aspect of digital signatures. By this property, an
entity that has signed some information cannot at a later time deny having
signed it. Similarly, access to the public key alone does not enable a
fraudulent party to fake a valid signature.

2.5 Group Signature

Based on a digital signature scheme, we develop an ActiveX control on the
client to accomplish seal-stamping and verification. Compared with a
seal-stamping control designed on a certificate-based public key [8][9],
this control is more efficient for generating and verifying signatures in
terms of computational efforts and communication costs. Further, we propose
an electronic seal-stamping scheme based on group signatures which
overcomes the disadvantages and retains all merits of the original scheme.

Group signatures allow individual members to make signatures on behalf of
the group; however, most previously proposed schemes are neither very
efficient nor sufficiently secure.

A group-oriented signature is a method to distribute the ability to sign
among a set of users in such a way that only certain subsets of a group of
users can collaborate to produce a valid signature on any given message. A
group signature scheme has the following three properties:

(1) Only legal members of the group can sign messages.

(2) The receiver can verify that it is indeed a valid group signature, but
cannot discover which group member made it.
(3) In the case of a later dispute, the signer can be identified by either
the group members together or a group authority.

A group signature scheme with signature claiming and variable linkability
is a digital signature scheme with three types of participants: a group
manager, an open authority, and group members. It consists of the following
procedures:

Join: An interactive protocol between a user and the group manager. The
user obtains a group membership certificate to become a group member. The
public certificate and the user's identity information are stored by the
group manager in a database for future use.

Sign: Using his group membership certificate and his private key, a group
member creates an anonymous group signature for a message.

Verify: A signature is verified to make sure it originates from a
legitimate group member, without knowledge of which particular one.

Open: Given a valid signature, an open authority discloses the underlying
group membership certificate.

Claim (Self-trace): A group member creates a proof that he created a
particular signature.

Unlinkability: It is infeasible to link two different signatures of the
same group member.

Non-framing: No one (including the group manager) can sign a message in
such a way that it appears to come from another user when it is opened.

Non-appropriation: No one (including the group manager) can make a valid
claim for a signature which they did not create.

3.1 System Model

In the system environment there exists a DUC (Digital Authentication
Centre). The responsibilities of the digital authentication centre are to
generate the system parameters and to issue users' public keys. Stages of
the proposed signature scheme include the system setup, the registration,
and the signature generation and verification.

In the system setup stage, the digital authentication centre generates the
system parameters, including its own private and public key pair. In the
registration stage, the centre deals with the registration requests
submitted by registering users and issues self-certified public keys. After
that, it publishes all self-certified public keys and sends each user a
witness. Note that the centre does not need to generate any certificates
for these public keys. With the received witness and the secret shadow,
each user can solely compute his private key.
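The Join / Sign / Verify / Open procedures above can be illustrated with a deliberately simplified mock: all of the cryptography is replaced by HMACs and XOR blinding so that only the control flow and the anonymity/opening interface are visible. This sketch gives none of the security guarantees of a real group signature scheme, and every name in it is invented for the illustration.

```python
import hashlib
import hmac
import secrets

# Mock keys: the group key is distributed to members at Join time, the
# manager key never leaves the group manager.
GROUP_KEY = secrets.token_bytes(16)
MANAGER_KEY = secrets.token_bytes(16)
members = {}   # certificate -> identity (the manager's database)


def join(identity: str) -> bytes:
    """Join: the manager issues a membership certificate and records it."""
    cert = hmac.new(MANAGER_KEY, identity.encode(), hashlib.sha256).digest()
    members[cert] = identity
    return cert


def sign(cert: bytes, message: bytes):
    """Sign: an anonymous signature = (nonce, blinded cert, MAC)."""
    nonce = secrets.token_bytes(16)
    tag = bytes(a ^ b for a, b in zip(cert, 2 * nonce))  # blind the cert
    mac = hmac.new(GROUP_KEY, nonce + tag + message, hashlib.sha256).digest()
    return nonce, tag, mac


def verify(message: bytes, sig) -> bool:
    """Verify: the signature was made with the group key, i.e. by some
    member, without revealing which one."""
    nonce, tag, mac = sig
    expected = hmac.new(GROUP_KEY, nonce + tag + message,
                        hashlib.sha256).digest()
    return hmac.compare_digest(mac, expected)


def open_signature(sig) -> str:
    """Open: the manager unblinds the certificate to identify the signer."""
    nonce, tag, _ = sig
    cert = bytes(a ^ b for a, b in zip(tag, 2 * nonce))
    return members[cert]
```

Because each call to `sign` draws a fresh nonce, two signatures by the same member carry different tags, mirroring the unlinkability property listed above.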
4. Signature Generation & Verification

4.1 Signature Generation

Step 1: Ui chooses a random integer j in Z*p (j in [1, p-2]), with j
co-prime to p-1, and computes

Step 3: The verifier randomly selects an integer k in Z*p and sends it to
Ui.

If it holds, then (R, S) is a valid group signature of M signed by G with
the self-certified public key YG [11], [12], [13].
Keywords: Image Fusion; Pixel-Based Fusion; Brovey Transform; Color
Normalized; High-Pass Filter; Modulation; Wavelet Transform.

1. INTRODUCTION

Satellite remote sensing image fusion has been a hot research topic of
remote sensing image processing [1]. This is obvious from the number of
conferences and workshops focusing on data fusion, as well as the special
issues of scientific journals dedicated to the topic. Previously, data
fusion, and in

The term fusion has several near-synonyms, such as merging, combination,
synergy and integration, that express more or less the same concept and
have appeared in the literature [6]. Different definitions of data fusion
can be found in the literature; each author interprets the term differently
depending on his research interests, e.g. [7-8]. A general definition of
data fusion can be adopted as follows: data fusion is a formal framework
which expresses the means and tools for the alliance of data originating
from different sources. It aims at obtaining information of greater
quality; the exact definition of greater quality will depend upon the
application [11-13]. Image fusion forms a subgroup within this definition
and aims at the generation of a single image from multiple image data for
the extraction of information of higher quality. With that in mind, the
achievement of high spatial resolution while maintaining the provided
spectral resolution falls exactly into this framework [14].

2. Pixel-Based Image Fusion Techniques

Image fusion is a sub-area of the more general topic of data fusion [15].
Generally, image fusion techniques can be classified into three categories
depending on the stage at which fusion takes place; it is often divided
into three levels, namely pixel level, feature level and decision level of
representation [16, 17]. This paper focuses on pixel-level image fusion.
Pixel image fusion techniques can be grouped into several classes depending
on the tools or processing methods used in the fusion procedure: 1)
Arithmetic Combination techniques (AC), 2) Component Substitution fusion
techniques (CS), 3) Frequency Filtering Methods (FFM), and 4) Statistical
Methods (SM). This paper focuses on two types of pixel-based image fusion
techniques: Arithmetic Combination and Frequency Filtering Methods. The
first type includes BT, CN and MLT, and the second type includes HPFA, HFA,
HFM and WT. In this work, the fusion algorithms and the quantitative
estimation of the quality and degree of information improvement of a fused
image were implemented in VB.

To explain the algorithms in this report: pixels from the two different
sources should have the same spatial resolution before being manipulated to
obtain the resultant image. So, before fusing two sources at pixel level,
it is necessary to perform a geometric registration and a radiometric
adjustment of the images to one another. When images are obtained from
sensors of different satellites, as in the fusion of SPOT or IRS with
Landsat, the registration accuracy is very important; registration is not
much of a problem with simultaneously acquired images, as with
Ikonos/Quickbird PAN and MS images. The PAN images have a different spatial
resolution from that of the MS images. Therefore, resampling of the MS
images to the spatial resolution of PAN is an essential step in some fusion
methods to bring the MS images to the same size as PAN. The resampled MS
image of band k is denoted by M_k, representing its set of digital numbers
(DN). The following notation will also be used: P is the DN of the PAN
image, and F_k is the DN in the final fusion result for band k; the local
means and standard deviations of M and P are calculated inside a window of
size (3, 3).

3. The AC Methods

This category includes simple arithmetic techniques. Different arithmetic
combinations have been employed for fusing MS and PAN images. They directly
perform some type of arithmetic operation on the MS and PAN bands, such as
addition, multiplication, normalized division, ratios and subtraction,
which have been combined in different ways to achieve a better fusion
effect. These models assume that there is high correlation between the PAN
band and each of the MS bands [24]. Some of the popular AC methods for
pan-sharpening are the BT, CN and MLM. The algorithms are described in the
following sections.

3.1 Brovey Transform (BT)

The BT, named after its author, uses ratios to sharpen the MS image [18].
It was created to produce RGB images, and therefore only three bands at a
time can be merged [19]. Many researchers have used the BT to fuse an RGB
image with a high-resolution image [20-25]. The basic procedure of the BT
first multiplies each MS band by the high-resolution PAN band, and then
divides each product by the sum of the MS bands. The following equation,
given by [18], gives the mathematical formula for the BT:

    F_k(i,j) = [ M_k(i,j) / sum_r M_r(i,j) ] * P(i,j)    (1)

The BT may cause color distortion if the spectral range of the intensity
image is different from the spectral range covered by the MS bands.

3.2 Color Normalized Transformation (CN)

CN is an extension of the BT [17]. The CN transform is also referred to as
an energy subdivision transform [26]. It separates the spectral space into
hue and brightness components. The transform multiplies each of the MS
bands by the PAN imagery, and these resulting values are each normalized by
dividing by the sum of the MS bands. The CN transform is defined by the
following equation [26, 27]:

    F_k(i,j) = [ (M_k(i,j) + 1.0) * (P(i,j) + 1.0) * 3.0
                 / (sum_r M_r(i,j) + 3.0) ] - 1.0        (2)
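Equations (1) and (2) can be sketched directly in code. Plain Python lists stand in for raster bands (indexed as band[i][j]) to keep the example dependency-free; numpy would be the practical choice for real rasters.

```python
def brovey(ms, pan):
    """Brovey Transform, equation (1): the ratio of each MS band to the
    band sum, scaled by the PAN value.  ms is a list of three bands."""
    rows, cols = len(pan), len(pan[0])
    fused = [[[0.0] * cols for _ in range(rows)] for _ in ms]
    for i in range(rows):
        for j in range(cols):
            total = sum(band[i][j] for band in ms)   # sum over MS bands
            for k, band in enumerate(ms):
                fused[k][i][j] = band[i][j] / total * pan[i][j]
    return fused


def color_normalized(ms, pan):
    """CN transform, equation (2): the +1.0 and +3.0 constants avoid
    division by zero."""
    rows, cols = len(pan), len(pan[0])
    fused = [[[0.0] * cols for _ in range(rows)] for _ in ms]
    for i in range(rows):
        for j in range(cols):
            total = sum(band[i][j] for band in ms)
            for k, band in enumerate(ms):
                fused[k][i][j] = ((band[i][j] + 1.0) * (pan[i][j] + 1.0)
                                  * 3.0 / (total + 3.0)) - 1.0
    return fused
```

On a single pixel with MS values (1, 1, 2) and PAN value 8, the Brovey result for band 1 is 1/4 * 8 = 2, illustrating how the band ratios redistribute the PAN brightness.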
(Note: the small additive constants in the equation are included to avoid
division by zero.)

3.3 Multiplicative Method (MLT)

In its simplest form, the multiplicative model, or product fusion method,
combines the two data sets by multiplying each pixel in each band k of the
MS data by the corresponding pixel of the PAN data. To compensate for the
increased brightness, the square root of the mixed data set is taken,
reducing the data to a combination reflecting the mixed spectral properties
of both sets. The fusion algorithm formula is as follows [1; 19; 20]:

    F_k(i,j) = sqrt( M_k(i,j) * P(i,j) )    (3)

square box HP filters. For example, a 3*3 pixel kernel, given by [36], is
used in this study:

    -1 -1 -1
    -1  8 -1    (4)
    -1 -1 -1

The HP filter matrix is occupied by -1 everywhere but at the center
location; the center value is the number of kernel cells minus one (here
3*3 - 1 = 8) [28]. HP filters compute a local average around each pixel in
the PAN image. The extracted high-frequency components are superimposed on
the MS image [1] by simple addition, and the result is divided by two to
offset the increase in brightness values [33]. This technique can improve
spatial resolution for either colour composites or an individual band [16].
This is given by [33]:
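The kernel (4) and the addition step just described can be sketched as follows, again on plain Python lists. Border pixels are simply left at zero here, which is one of several common border policies and an assumption of this sketch.

```python
# 3x3 high-pass kernel from equation (4): -1 everywhere, 8 in the center.
KERNEL = [[-1, -1, -1],
          [-1,  8, -1],
          [-1, -1, -1]]


def high_pass(pan):
    """Convolve the PAN image with the HP kernel (interior pixels only;
    the border rows and columns stay at 0)."""
    rows, cols = len(pan), len(pan[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = sum(KERNEL[a][b] * pan[i - 1 + a][j - 1 + b]
                            for a in range(3) for b in range(3))
    return out


def hpf_fusion(ms_band, pan):
    """Superimpose the high-frequency part on an MS band by addition and
    divide by two to offset the brightness increase."""
    hp = high_pass(pan)
    return [[(m + h) / 2.0 for m, h in zip(ms_row, hp_row)]
            for ms_row, hp_row in zip(ms_band, hp)]
```

On a flat PAN image the kernel output is zero, so the fusion simply halves the MS band; only where the PAN image has local detail does the result deviate, which is the intended high-frequency injection.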
Step (3): Similarly, by decomposing the panchromatic high-resolution image,
we obtain one set of approximation coefficients and 3N wavelet planes for
the panchromatic (PAN) image.

Step (4): The wavelet coefficient sets from the two images are combined via
substitutive or additive rules. In the case of the substitutive method, the
wavelet coefficient planes (or details) of the R, G, and B decompositions
are replaced by the corresponding detail planes of the panchromatic
decomposition; this is the variant used in this study.

Step (5): Then, to obtain the fused images, the inverse wavelet transform
is implemented on the resultant sets, reversing the process in step (2);
the synthesis equation (12) is given in [54].

5. Experiments

* 105 pixels at 8 bits per pixel, but this is upsampled; nearest neighbor
was used to avoid the spectral contamination caused by interpolation.

To evaluate the ability to enhance spatial details and preserve spectral
information, several indices were used (Table 1), including Standard
Deviation (SD), Entropy (En), Correlation Coefficient (CC), Signal-to-Noise
Ratio (SNR), Normalized Root Mean Square Error (NRMSE) and Deviation Index
(DI); the results are shown in Table 2. In the following sections, F_k and
M_k are the brightness values of homogeneous pixels of the result image and
the original multispectral image of band k, the mean brightness values of
both images are taken over their size n * m, and BV is the brightness value
of image data M and F. To simplify the comparison of the different fusion
methods, the values of the En, CC, SNR, NRMSE and DI indices of the fused
images are provided as charts in Fig. 1.
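Two of the evaluation indices can be sketched as follows for a fused band F and an original MS band M, given as flat pixel lists. The normalization used for NRMSE (the 0-255 radiometric range) is an assumption of this sketch; Table 1 in the paper holds the authors' exact definitions.

```python
import math


def correlation(f, m):
    """Correlation coefficient (CC) between two equal-length pixel lists:
    covariance over the product of standard deviations."""
    n = len(f)
    mean_f, mean_m = sum(f) / n, sum(m) / n
    cov = sum((a - mean_f) * (b - mean_m) for a, b in zip(f, m))
    var_f = sum((a - mean_f) ** 2 for a in f)
    var_m = sum((b - mean_m) ** 2 for b in m)
    return cov / math.sqrt(var_f * var_m)


def nrmse(f, m, dynamic_range=255.0):
    """Normalized root mean square error: RMSE divided by the assumed
    radiometric range."""
    mse = sum((a - b) ** 2 for a, b in zip(f, m)) / len(f)
    return math.sqrt(mse) / dynamic_range
```

A fused band that is a perfect linear rescaling of the original gives CC = 1, while NRMSE is 0 only when the two bands are identical; this is why the text treats low NRMSE together with high CC as evidence of preserved spectral content.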
6. Discussion of Results

Fig. 1 shows these parameters for the fused images using the various
methods. From Fig. 1a it can be seen that the SD of the fused images
remains constant for HFA and HFM. According to the computed En values, an
increased En indicates a change in the quantity of information content for
radiometric resolution through the merging. From Fig. 1b, it is obvious
that the En of the fused images changed when compared to the original
multispectral image, but some methods, such as BT and HPFA, decrease the En
values to below the original. In Fig. 1c, the correlation values also
remain practically constant, very near the maximum possible value, except
for BT and CN. The results of SNR, NRMSE and DI change significantly. It
can be observed from the diagrams of Fig. 1 that the NRMSE and DI results
show that the HFM and HFA methods give the best results with respect to the
other methods, indicating that these methods maintain most of the spectral
information content of the original multispectral data set: they present
the lowest NRMSE and DI values as well as the highest SNR. Hence, the
spectral quality of the images fused by the HFM and HFA methods is much
better than that of the others. In contrast, it can also be noted that the
BT and HPFA images produce high NRMSE and DI values, indicating that these
methods deteriorate the spectral information content relative to the
reference image. In a comparison of spatial effects, it can be seen that
the results of HFM, HFA, WT and CN are better than those of the other
methods. Fig. 3 shows the original images and the fused image results.

Combining these with the visual inspection results, the HFM and HFA results
are the best overall; the next best visual inspection results are obtained
with WT, CN and MUL.

[Fig. 1a: Chart Representation of SD of Fused Images]
Fig. 1b: Chart Representation of En of Fused Images
Fig. 1c: Chart Representation of CC of Fused Images
Fig. 1d: Chart Representation of SNR of Fused Images
[Chart Representation of NRMSE and DI of Fused Images]
[19] Vab A. and Otir K., 2006. High-Resolution Image Fusion: Methods to
Preserve Spectral and Spatial Resolution. Photogrammetric Engineering &
Remote Sensing, Vol. 72, No. 5, May 2006, pp. 565-572.
[20] Parcharidis I. and L. M. K. Tani, 2000. Landsat TM and ERS Data
Fusion: A Statistical Approach Evaluation for Four Different Methods.
0-7803-6359-0/00, 2000 IEEE, pp. 2120-2122.
[21] Ranchin T. and Wald L., 2000. Fusion of High Spatial and Spectral
Resolution Images: The ARSIS Concept and Its Implementation.
Photogrammetric Engineering and Remote Sensing, Vol. 66, No. 1, pp. 49-61.
[22] Prasad N., S. Saran, S. P. S. Kushwaha and P. S. Roy, 2001. Evaluation
of Various Image Fusion Techniques and Imaging Scales for Forest Features
Interpretation. Current Science, Vol. 81, No. 9, p. 1218.
[23] Alparone L., Baronti S., Garzelli A. and Nencini F., 2004. Landsat
ETM+ and SAR Image Fusion Based on Generalized Intensity Modulation. IEEE
Transactions on Geoscience and Remote Sensing, Vol. 42, No. 12,
pp. 2832-2839.
[24] Dong J., Zhuang D., Huang Y. and Jingying Fu, 2009. Advances in
Multi-Sensor Data Fusion: Algorithms and Applications (Review). Sensors
2009, 9, ISSN 1424-8220, pp. 7771-7784.
[25] Amarsaikhan D., H. H. Blotevogel, J. L. van Genderen, M. Ganzorig,
R. Gantuya and B. Nergui, 2010. Fusing High-Resolution SAR and Optical
Imagery for Improved Urban Land Cover Study and Classification.
International Journal of Image and Data Fusion, Vol. 1, No. 1, March 2010,
pp. 83-97.
[26] Vrabel J., 1996. Multispectral Imagery Band Sharpening Study.
Photogrammetric Engineering and Remote Sensing, Vol. 62, No. 9,
pp. 1075-1083.
[27] Vrabel J., 2000. Multispectral Imagery Advanced Band Sharpening
Study. Photogrammetric Engineering and Remote Sensing, Vol. 66, No. 1,
pp. 73-79.
[28] Gangkofner U. G., P. S. Pradhan and D. W. Holcomb, 2008. Optimizing
the High-Pass Filter Addition Technique for Image Fusion. Photogrammetric
Engineering & Remote Sensing, Vol. 74, No. 9, pp. 1107-1118.
[29] Wald L., T. Ranchin and M. Mangolini, 1997. Fusion of Satellite
Images of Different Spatial Resolutions: Assessing the Quality of
Resulting Images. Photogrammetric Engineering and Remote Sensing, Vol. 63,
No. 6, pp. 691-699.
[30] Li J., 2001. Spatial Quality Evaluation of Fusion of Different
Resolution Images. International Archives of Photogrammetry and Remote
Sensing, Vol. XXXIII, Part B2, Amsterdam 2000, pp. 339-346.
[31] Aiazzi B., L. Alparone, S. Baronti, I. Pippi and M. Selva, 2003.
Generalised Laplacian Pyramid-Based Fusion of MS + P Image Data with
Spectral Distortion Minimization. URL:
http://www.isprs.org/commission3/proceedings02/papers/paper083.pdf
(last date accessed: 8 Feb 2010).
[32] Hill J., C. Diemer, O. Stver and Th. Udelhoven, 1999. A Local
Correlation Approach for the Fusion of Remote Sensing Data with Different
Spatial Resolutions in Forestry Applications. International Archives of
Photogrammetry and Remote Sensing, Vol. 32, Part 7-4-3 W6, Valladolid,
Spain, 3-4 June.
[33] Carter D. B., 1998. Analysis of Multiresolution Data Fusion
Techniques. Master Thesis, Virginia Polytechnic Institute and State
University. URL:
http://scholar.lib.vt.edu/theses/available/etd-32198-21323/unrestricted/Etd.pdf
(last date accessed: 10 May 2008).
[34] Aiazzi B., S. Baronti and M. Selva, 2008. Image Fusion Through
Multiresolution Oversampled Decompositions. In: Stathaki T. (ed.), Image
Fusion: Algorithms and Applications. Elsevier Ltd.
[35] Lillesand T. and Kiefer R., 1994. Remote Sensing and Image
Interpretation. 3rd Edition, John Wiley and Sons Inc.
[36] Gonzales R. C. and R. Woods, 1992. Digital Image Processing.
Addison-Wesley Publishing Company.
[37] Umbaugh S. E., 1998. Computer Vision and Image Processing: A
Practical Approach Using CVIPtools. Prentice Hall.
[38] Green W. B., 1989. Digital Image Processing: A Systems Approach. 2nd
Edition, Van Nostrand Reinhold, New York.
[39] Sangwine S. J. and R. E. N. Horne, 1989. The Colour Image Processing
Handbook. Chapman & Hall.
[40] Gross K. and C. Moulds, 1996. Digital Image Processing.
(http://www.net/Digital Image Processing.htm) (last date accessed: 10 Jun
2008).
[41] Jensen J. R., 1986. Introductory Digital Image Processing: A Remote
Sensing Perspective. Englewood Cliffs, New Jersey: Prentice-Hall.
[42] Richards J. A. and Jia X., 1999. Remote Sensing Digital Image
Analysis. 3rd Edition, Springer-Verlag, Berlin Heidelberg New York.
[43] Cao D., Q. Yin and P. Guo, 2006. Mallat Fusion for Multi-Source
Remote Sensing Classification. Proceedings of the Sixth International
Conference on Intelligent Systems Design and Applications (ISDA'06).
[44] Hahn M. and F. Samadzadegan, 1999. Integration of DTMs Using
Wavelets. International Archives of Photogrammetry and Remote Sensing,
Vol. 32, Part 7-4-3 W6, Valladolid, Spain, 3-4 June 1999.
[45] King R. L. and Wang J., 2001. A Wavelet Based Algorithm for Pan
Sharpening Landsat 7 Imagery. 0-7803-7031-7/01, 2001 IEEE, pp. 849-851.
[46] Kumar Y. K. Comparison of Fusion Techniques Applied to Preclinical
Images: Fast Discrete Curvelet Transform Using Wrapping Technique &
Wavelet Transform. Journal of Theoretical and Applied Information
Technology, 2005-2009 JATIT, pp. 668-673.
[47] Malik N. H., S. Asif M. Gilani and Anwaar-ul-Haq, 2008. Wavelet Based
Exposure Fusion. Proceedings of the World Congress on Engineering 2008,
Vol. I, WCE 2008, July 2-4, 2008, London, U.K.
[48] Li S., Kwok J. T. and Wang Y., 2002. Using the Discrete Wavelet Frame
Transform to Merge Landsat TM and SPOT Panchromatic Images. Information
Fusion 3 (2002), pp. 17-23.
[49] Garzelli A. and Nencini F., 2006. Fusion of Panchromatic and
Multispectral Images by Genetic Algorithms. IEEE Transactions on
Geoscience and Remote Sensing, 40, pp. 3810-3813.
[50] Aiazzi B., Baronti S. and Selva M., 2007. Improving Component
Substitution Pan-Sharpening Through Multivariate Regression of MS+Pan
Data. IEEE Transactions on Geoscience and Remote Sensing, Vol. 45, No. 10,
pp. 3230-3239.
[51] Das A. and Revathy K., 2007. A Comparative Analysis of Image Fusion
Techniques for Remote Sensed Images. Proceedings of the World Congress on
Engineering 2007, Vol. I, WCE 2007, July 2-4, London, U.K.
[52] Pradhan P. S. and King R. L., 2006. Estimation of the Number of
Decomposition Levels for a Wavelet-Based Multi-Resolution Multi-Sensor
Image Fusion. IEEE Transactions on Geoscience and Remote Sensing, Vol. 44,
No. 12, pp. 3674-3686.
[53] Hu Deyong H. L., 1998. A Fusion Approach of Multi-Sensor Remote
Sensing Data Based on Wavelet Transform. URL:
http://www.gisdevelopment.net/AARS/ACRS1998/Digital Image Processing
(last date accessed: 15 Feb 2009).
[54] Li S., Li Z. and Gong J., 2010. Multivariate Statistical Analysis of
Measures for Assessing the Quality of Image Fusion. International Journal
of Image and Data Fusion, Vol. 1, No. 1, March 2010, pp. 47-66.
[55] Bhler W. and G. Heinz, 1998. Integration of High Resolution Satellite
Images into Archaeological Documentation. Proceedings, International
Archives of Photogrammetry and Remote Sensing, Commission V, Working Group
V/5, CIPA International Symposium, published by the Swedish Society for
Photogrammetry and Remote Sensing, Goteborg. URL:
http://www.i3mainz.fh-mainz.de/publicat/cipa-98/sat-im.html (last date
accessed: 28 Oct 2000).

AUTHORS

Mrs. Firouz Abdullah Al-Wassai received the B.Sc. degree in Physics from
the University of Sanaa, Yemen, in 1993, and the M.Sc. degree in Physics
from Baghdad University, Iraq, in 2003. She is a Ph.D. research student in
the Department of Computer Science, S.R.T.M.U., Nanded, India.

... has more than 60 scientific papers published in scientific journals
and in several scientific conferences.
2 School of Electrical and Computer Science, Shiraz University, Iran
Fig. 1. Fuzzy Decision-making structure[6] prioritizes the accommodations, using Fuzzy inference
methods.
2-1. Step 1: Fuzzification
The first step in the fuzzy decision-making process is fuzzifying the real (crisp) variables, in which absolute variables are converted to linguistic variables. This step is called fuzzification, since fuzzy sets are used to convert real (crisp) variables to fuzzy variables [6]. Fuzzy membership functions are needed for this purpose.

2-1-1. Chart and fuzzy membership functions

A fuzzy set is described by its membership functions. A more precise way to define a membership function is to express it as a mathematical formula. Several different classes of parametric membership functions have been introduced, and in real-world applications of fuzzy sets, membership function shapes are usually restricted to a definite class of functions that can be specified with a few parameters [7]. The most common shapes are triangular, trapezoidal and Gaussian, as shown in Figure 2.

Fig. 2. Fuzzy decision-making structure [6]

2-2. Step 2: Fuzzy inference

In this step, the behavior of a system is defined using a set of if-then rules. The result of this inference will be a linguistic value for a linguistic variable [6]. In the case study section, you can find more explanation of the fuzzy inference procedure.

2-3. Step 3: De-fuzzification

In the third step (making definite), linguistic values are changed back to definite numbers in order to perform decision-making [6].

3. Case Study

This is a case study on Shiraz e-tourism. The goal is to suggest the best accommodation to tourists, based on their preferences. The tourists enter their preferences for accommodation, including their budget, residence facilities and desired distance from the sights they wish to visit. The system does the fuzzification of the tourists' desired values, including budget, facilities and distance to the spots, and then

3-1. Tourist decision-making criteria (system input)

In decision models, the criteria selection process is accomplished according to the decision objectives [9, 10]. In this case study, the following factors are considered:

- Accommodation cost for one night in Shiraz, converted using fuzzy charts to the linguistic values cheap, moderate and expensive.
- Importance of each accommodation facility. In this case study, we considered eight facilities. This criterion is converted to the linguistic values low, medium and high.
- Distance from historical sights in downtown Shiraz (origin: Arg-e Karimkhan).
- Distance from the business center (origin: Setareh-Fars shopping center).
- Distance from cultural attractions (origin: Hafeziyeh).
- Distance from the pilgrimage center (origin: Shah-Cheragh).
- Distance from the academic center (origin: Shiraz University, Faculty of Engineering, building No. 1).

All distances are converted to the linguistic parameters far, average and near, using the relations of the fuzzy charts.

3-2. Data Collection

The data required for calculation in fuzzy decision-making are the accommodations' price lists, facilities and distances from the desired origins. The membership matrix is derived from these data and is kept for the next calculation. In this case study, all distances are calculated based on the newest map of Shiraz, in kilometers. Figure 3 shows the price of accommodations and the distance from different spots in the website admin panel, where this information can be easily modified using the related forms.
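The fuzzification step described above can be sketched with a triangular membership function; the break-points used here are invented for illustration, while the paper's actual fuzzy charts (configured by the administrator) define the real ones:

```python
def tri(x, a, b, c):
    """Triangular membership: rises from a to a peak at b, falls back to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify_price(price):
    """Map a nightly price to degrees of cheap/moderate/expensive.

    Break-points are hypothetical; a real system would read them from the
    administrator-configured fuzzy charts.
    """
    return {
        "cheap":     tri(price, -1, 0, 50),
        "moderate":  tri(price, 30, 60, 90),
        "expensive": tri(price, 70, 120, 10**6),
    }
```

A price of 45, for example, gets partial membership in both "cheap" and "moderate", which is exactly the behavior the crisp-to-linguistic conversion needs.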
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 125
Fig. 3. Reporting of real hotel costs and their distances from different centers in the admin panel of the website

Membership functions for the linguistic variables are given in Eqs. (1) and (2); for example, the membership function for the linguistic value High is given by Eq. (3). These membership functions are configured for the different parameters of accommodation budget, distances and facilities, with different boundary values. Figures 4, 5 and 6 show these functions, drawn in MATLAB.

3-3-1-2. Shiraz tourism fuzzy charts

Fig. 7. Request form of tourist

To increase the options for distance, users first choose from the categories of visiting spots, including cultural, historical, pilgrimage, commercial or academic spots. Then the users select the distance from these spots. Minimum and maximum distance values and prices are adjusted by the website administrator. Users have to register on the website to be able to access this information and follow their requests.

At first it is needed to define a set of if-then rules, which are defined as follows in this research.

3-3-2-1. Fuzzy Rules

Fuzzy rules are defined and adjusted by the administrator or a skilled expert. The website software has the ability to create, modify or delete the conditions. Figure 8 shows a part of the adjusted fuzzy conditions report in the admin panel of the website.

Accommodations closer to the user's request have higher priority. The advantage of this method is independence from the fuzzy rules as well as simplicity, and the priorities of all hotels are computed. It should be noted that all three parameters (cost, distance and facilities) must be converted to fuzzy values and also normalized. In other words, fuzzification is required, and since the number of decision criteria is more than one, we cannot apply the method to non-fuzzy values.

In general, if A and B are fuzzy sets on a universe of discourse X, we can define the distance between A and B using the Minkowski rule as follows:

D(A, B) = ( Σ_{x∈X} |μ_A(x) − μ_B(x)|^p )^(1/p)    (5)

considering p ≥ 1 [7].
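The Minkowski distance of Eq. (5) is a one-liner; with p = 2 it reduces to the Euclidean distance that the paper's second method uses to rank accommodations against the tourist's fuzzified request (variable names are ours):

```python
def minkowski(mu_a, mu_b, p=2):
    """Minkowski distance between two membership-value vectors (p >= 1).

    With p = 2 this is the Euclidean distance used to rank accommodations
    against the tourist's fuzzified request.
    """
    if p < 1:
        raise ValueError("Minkowski distance requires p >= 1")
    return sum(abs(x - y) ** p for x, y in zip(mu_a, mu_b)) ** (1.0 / p)
```

Ranking all hotels then amounts to sorting them by their distance to the request vector, which is why the distance method can list every accommodation with a priority.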
If the number of factors increases, the number of fuzzy rules will be much larger. However, according to skilled experts' opinion, many of these rules may not be used, but the rules will increase and need to be analyzed.

3. The computational complexity of the distance-calculation method is much less than that of the standard method.

4. In terms of development capability and generalization, if the number of factors increases, the distance-calculation method will be quite responsive and easy, and priorities will be determined just by substituting into a formula. But if we want to generalize the users' input values over an interval, then the Max-Min or Max-Prod method has more capability and flexibility.

Conclusion

In this research, as a new work, we tried to apply fuzzy decision-making in the e-tourism industry, and in order to increase research integrity we focused on the city of Shiraz. Our goal was to find a simple and applicable way, so we used two methods: one is the usual method for fuzzy decision-making, and the other is the Euclidean distance method, which is very simple to calculate. After inspecting both methods, we selected the usual method as the main method of fuzzy inference in our website. The distance method can also present a complete list of all accommodations and their priorities to the users. The most important result of this research is providing an e-tourism system in which tourists can enter their interests and needs without struggling with binary systems, and receive an appropriate suggestion to plan their travel to Shiraz. The results of this research show that using fuzzy decision-making in an e-tourism system is more efficient.

References
[1] M. Moharrer and T. Tahayori, "Drivers of customer convenience in the electronic tourism industry," Canadian Conference on Electrical and Computer Engineering (CCECE), IEEE, p. 836, 2007.
[2] Waralak V. Siricharoen, "E-commerce adaptation using ontologies for e-tourism," IEEE International Symposium on Communications and Information Technologies, 2007.
[3] Laura Sebastia, Inma Garcia, Eva Onaindia and Cesar Guzman, "e-Tourism: a tourist recommendation and planning application," 20th IEEE International Conference on Tools with Artificial Intelligence, 2008.
[4] Mazyar Yari and Hosein Vazifehdust, "Electronic tourism: the interaction between e-commerce and the tourism industry," 4th National Conference of E-commerce, Iran, 2007.
[5] Valente de Oliveira, J., "Semantic constraints for membership function optimization," IEEE Transactions on Fuzzy Systems 19, pp. 128-138, 1999.
[6] Adel Azar and Hojjat Faraji, Fuzzy Management Science, Iran, Ketab Mehraban Nashr Institute, 2008 [In Farsi].
[7] Mehmed Kantardzic, Data Mining, translated by Amir Alikhanzadeh, Iran, Oloom Rayaneh Publication, 2006 [In Farsi].
[8] R.E. Bellman and L.A. Zadeh, "Decision-making in a fuzzy environment," Management Science, Vol. 17, No. 4, pp. 141-164, 1970.
[9] Ahmad Jamali and Mahmud Saremi, "Using a fuzzy multi-attribute decision-making model for selection of foreign investment method in the high managerial deputies of the Oil Industry in Iran," Quarterly Research of Trade, No. 29, pp. 167-188, 2003.

Z. Hamedi was born in Iran. She received the B.Sc. degree in computer engineering from Shahid Bahonar University of Kerman in 2000. She is currently an M.Sc. student in Information Technology at Shiraz University. Her research interests include information technology, telecommunications and computer networks.

S. Jafari received the PhD degree in Computer Systems Engineering from Monash University, Australia, in 2006. He is currently a lecturer in the Electrical and Computer Engineering School, Shiraz University, Shiraz, Iran. His research interests include artificial intelligence, especially expert systems, robotics, hybrid systems (fuzzy logic, certainty factor, neural network combinations: neuro-fuzzy, fuzzy neural networks) and image processing.
…are forwarded to the destination through reliable intermediate nodes [13]. In this paper, we propose a reliable routing algorithm based on fuzzy logic. In this scheme, for each node we determine two parameters, a trust value and an energy value, to calculate the lifetime of routes. During route discovery, every node inserts its trust value and energy value in the RREQ packet. At the destination, a new single parameter called the reliability value decides which route is selected. The route with the higher reliability value is the candidate to route data packets from source to destination.
The rest of the paper is organized as follows: In Section 2, we briefly describe related work. Section 3 describes our proposed routing algorithm, and its performance is evaluated in Section 4. Finally, Section 5 concludes the paper.

2. Related Works

We can classify the work that has been done in reliable routing into three categories: GPS-aided protocols, energy-aware routing, and trust evaluation methods. In this section, we overview some protocols proposed for reliable routing.
A reliable path has more stability than a common path. Some reliable routing protocols propose a GPS-aided process and use the route expiration time to select a reliable path. In [14], Nen-Chung Wang et al. propose a stable weight-based on-demand routing protocol (SWORP) for MANETs. The proposed scheme uses a weight-based route strategy to select a stable route in order to enhance system performance. The weight of a route is decided by three factors: the route expiration time, the error count, and the hop count. Route discovery usually first finds multiple routes from the source node to the destination node; then the path with the largest weight value is selected for routing.
In [15], Nen-Chung Wang and Shou-Wen Chang propose a reliable on-demand routing protocol (RORP) with mobility prediction. In this scheme, the duration of time between two connected mobile nodes is determined by using the global positioning system (GPS), and a request region between the source node and the destination node is discovered to reduce routing overhead. The routing path with the longest duration of time for transmission is selected to increase route reliability. In [16], Neng-Chung Wang et al. propose a reliable multi-path QoS routing (RMQR) protocol for MANETs by constructing multiple QoS paths from a source node to a destination node. The proposed protocol is an on-demand QoS-aware routing scheme. They examine the QoS routing problem associated with searching for a reliable multi-path (or uni-path) QoS route from a source node to a destination node in a MANET. This route must also satisfy certain bandwidth requirements. They determine the route expiration time (RET) between two connected mobile nodes by using the global positioning system (GPS), then use two parameters, the route expiration time and the number of hops, to select a routing path with low latency and high stability.
Some other proposed protocols consider energy and trust evaluation as factors of reliability. In [17], an approach has been proposed in which intermediate nodes calculate a cost based on battery capacity. The intermediate nodes take into consideration whether they can forward the RREQ packet or not. This protocol improves packet delivery ratio and throughput and reduces node energy consumption [13]. In [18], Gupta Nishant and Das Samir proposed a method to make protocols energy-aware, using a new function of the remaining battery level of each node on a route and the number of neighbours of the node. This protocol gives significant benefits at high traffic but only in low-mobility scenarios [13]. In [19], a novel method has been discussed for maximizing the life span of a MANET by integrating load balancing and a transmission power control approach. The simulation results of this mechanism showed that the average required transmission energy per packet was reduced in comparison with standard AODV. In [20], Pushpalatha and Revathy have proposed a trust model in the DSR protocol that categorizes trust values as friend, acquaintance and stranger based on the number of packets transferred successfully by each node [13]. The most trusted path was determined from source to destination. Results indicated that the proposal had a minimum packet loss when compared to conventional DSR. Huafeng Wu and Chaojian Shi [21] proposed a trust management model to obtain trust ratings in peer-to-peer systems, with an aggregation mechanism used to indirectly combine and obtain other nodes' trust ratings [13]. The results show that the trust management model can quickly detect misbehaving nodes and limit their impact in a peer-to-peer file sharing system [13]. All of the above papers used separate parameters, such as battery power, trust of a node, or route expiration time, individually as factors for measuring the reliability of a route. In this paper, we consider both the energy capacity and the trust of nodes for route discovery.

3. Proposed Model

In this section we propose our novel reliable routing algorithm, which is an improved version of [22].
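As a minimal sketch of the idea carried over from [22]: each RREQ accumulates per-hop trust and energy values, and the destination selects the route with the highest reliability. Aggregating per-hop values by taking the weakest hop is our simplifying assumption for illustration; the paper derives a single reliability value per route via the fuzzy controller described below:

```python
def best_route(routes, reliability):
    """Pick the route id whose weakest hop has the highest reliability.

    `routes` maps a route id to the list of (trust, energy) pairs collected
    in the RREQ hop by hop; `reliability` is the fuzzy controller function.
    Scoring a route by its weakest hop (min) is an illustrative assumption.
    """
    def score(hops):
        return min(reliability(t, e) for t, e in hops)
    return max(routes, key=lambda rid: score(routes[rid]))
```

The destination then unicasts the RREP back along the winning route, as described in Section 3.1.1.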
3.1 RRAF Mechanism

Trust value and battery capacity are the two main parameters in this method that make the routing algorithm more reliable. Before explaining the algorithm, the trust estimation and power consumption mechanisms are described below.

Trust Evaluation: The trust value of each node is measured based on various parameters, such as the length of the association, the ratio of the number of packets forwarded successfully by a neighbor to the total number of packets sent to that neighbor, and the average time taken to respond to a route request [13, 20]. Based on the above parameters, the trust level of a node i toward its neighbor node j can be any of the following types:
a) Node i is a stranger to neighbor node j: Node i has never sent/received messages to/from node j. Their mutual trust levels will be low. Every new node entering an ad hoc network is a stranger to all its neighbors.
b) Node i is an acquaintance of neighbor node j: Node i has sent/received a few messages to/from node j. Their trust levels are neither too low nor high enough to be reliable.
c) Node i is a friend of neighbor node j: Node i has sent/received many messages to/from node j. The trust levels between them are reasonably high.
The above relationships are represented in Fig. 1 as a membership function.

Fig. 1 Membership function for trust value.

Energy Evaluation: We define that every node starts at the high level, which means it has full capacity (100%). The node will not be a good router for forwarding packets if its energy falls below 50%.

Fig. 2 Membership function for energy value.

Fuzzy Logic Controller: Fuzzy logic is a useful tool for solving hard optimization problems with potentially conflicting objectives. In fuzzy logic, values of different criteria are mapped into linguistic values that characterize the level of satisfaction with the numerical value of the objectives. The numerical values are typically chosen to operate in the interval [0, 1] according to the membership function of each objective. Fig. 1 represents the trust value membership function. According to the three types of trust value (friend, acquaintance and stranger), we define three fuzzy sets: high, medium and low, respectively. We also determined three fuzzy sets for a node's energy. For energy capacity between 50% and 100% of total capacity we define the high set, for 0% to 100% we define the medium set, and for 0% to 50% we define the low set. The above relationships are represented in Fig. 2 as the energy value membership function, and Fig. 3 shows the membership function of the reliability value.

Fig. 3 Membership function for reliability value.

Reliability Evaluation: The reliability factor takes different values based on six rules that depend upon the input metric values, i.e., the energy and trust values. A fuzzy system decides, for each pair of input values, which value appears in the output. The fuzzy system with product inference engine, singleton fuzzifier and center-average defuzzifier is of the following form:

f(x) = [ Σ_{l=1}^{6} ȳ^l ∏_{i=1}^{2} μ_{A_i^l}(x_i) ] / [ Σ_{l=1}^{6} ∏_{i=1}^{2} μ_{A_i^l}(x_i) ]    (1)
In Eq. (1), x_i represents the ith crisp input (the energy or trust value), μ_{A_i^l}(x_i) represents the fuzzy membership function for the ith input, and ȳ^l is the center average of the lth output fuzzy set.
The rules are as follows:
Rule 1: if trust value is high and energy value is high, then reliability value is very very high.
Rule 2: if trust value is medium and energy value is high, then reliability value is very high.
Rule 3: if trust value is high and energy value is medium, then reliability value is high.
Rule 4: if trust value is medium and energy value is medium, then reliability value is medium.
Rule 5: if trust value is low and energy value is medium, then reliability value is low.
Rule 6: if trust value is anything and energy value is low, then reliability value is very low.

3.1.1. Route discovery procedure

Finally, the destination node sends a route reply (RREP) packet along the path which has the maximum reliability value.

4. Simulation and results

The simulation environment is constructed of a 1500 m × 300 m rectangular simulation area with 50 nodes distributed over the area. The initial energy of each node's battery is 4 Watts, which is mapped to 100%. Simulation results have been compared with AODV. The simulation study has been performed for packet delivery ratio, throughput and end-to-end delay evaluations.
Packet delivery ratio: The fraction of successfully received packets which survive while finding their destination. This performance measure also determines the completeness and correctness of the routing protocols [23].
End-to-end delay: The average end-to-end delay is the delay experienced by the successfully delivered packets in reaching their destinations. This is a good metric for comparing protocols and denotes how efficient the underlying routing algorithm is, because delay primarily depends on the optimality of the chosen path [23].
Throughput: This is defined as the rate of successfully transmitted data per second in the network during the simulation. Throughput is calculated as the sum of the successfully delivered payload sizes of data packets within the period which starts when a source opens a communication port to a remote destination port and ends when the simulation stops. Average throughput can be calculated by dividing the total number of bytes received by the total end-to-end delay [23].

Fig. 5 Throughput at different speeds.
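The six-rule base and the center-average defuzzifier of Eq. (1) can be sketched as follows. The triangular membership break-points and the output centres ȳ^l are assumed values for illustration, not the paper's tuned ones:

```python
def tri(x, a, b, c):
    """Triangular membership: rises from a to a peak at b, falls back to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Illustrative fuzzy sets on [0, 1] for both trust and energy (assumed).
SETS = {
    "low":    (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high":   (0.5, 1.0, 1.5),
}

# (trust set, energy set, rule centre y_bar): the paper's six rules, with
# assumed output centres for very-low .. very-very-high. None = "anything".
RULES = [
    ("high",   "high",   1.00),  # very very high
    ("medium", "high",   0.85),  # very high
    ("high",   "medium", 0.70),  # high
    ("medium", "medium", 0.50),  # medium
    ("low",    "medium", 0.30),  # low
    (None,     "low",    0.10),  # very low (any trust)
]

def reliability(trust, energy):
    """Eq. (1): product inference, singleton fuzzifier, centre-average defuzzifier."""
    num = den = 0.0
    for t_set, e_set, y_bar in RULES:
        w_t = 1.0 if t_set is None else tri(trust, *SETS[t_set])
        w = w_t * tri(energy, *SETS[e_set])  # product inference engine
        num += y_bar * w
        den += w
    return num / den if den else 0.0
```

With these assumed sets, a node with full trust and energy scores 1.0 and any node below the energy threshold collapses toward the "very low" centre, matching Rule 6.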
Fig. 6 End-to-end delay at different speeds.

Fig. 7 Packet delivery ratio at different pause times.

From these figures we can see that RRAF transmits and receives more data packets than AODV. This is because RRAF always chooses the most stable route for transmitting packets instead of choosing the shortest path.
References
[1] D. Remondo, "Tutorial on wireless ad hoc networks," Second International Conference on Performance Modeling and Evaluation of Heterogeneous Networks, July 2004.
[2] G. Aggelou, R. Tafazolli, "RDMAR: a bandwidth-efficient routing protocol for mobile ad hoc networks," Proceedings of the Second ACM International Workshop on Wireless Mobile Multimedia (WoWMoM), August 1999, pp. 26-33.
[3] R. Dube, C.D. Rais, K.Y. Wang, S.K. Tripathi, "Signal stability-based adaptive routing (SSA) for ad hoc mobile networks," IEEE Personal Communications 4 (1997), pp. 36-45.
[4] D.B. Johnson, D.A. Maltz, Dynamic Source Routing in Ad Hoc Wireless Networks, Kluwer, 1996.
[5] V. Park, M.S. Corson, "A highly adaptive distributed routing algorithm for mobile wireless networks," Proceedings of the 1997 IEEE INFOCOM, Kobe, Japan, April 1997, pp. 1405-1413.
[6] C.E. Perkins, E. Royer, "Ad-hoc on-demand distance vector routing," Proceedings of the Second IEEE Workshop on Mobile Computing Systems and Applications, New Orleans, LA, USA, February 1999, pp. 90-100.
[7] C.K. Toh, "A novel distributed routing protocol to support ad-hoc mobile computing," Proceedings of the Fifteenth IEEE Annual International Phoenix Conference on Computers and Communications, March 1996, pp. 480-486.
[8] P. Jacquet, P. Muhlethaler, T. Clausen, A. Laouiti, A. Qayyum, L. Viennot, "Optimized link state routing protocol for ad hoc networks," Proceedings of the 2001 IEEE INMIC, December 2001, pp. 62-68.
[9] S. Murthy, J.J. Garcia-Luna-Aceves, "A routing protocol for packet radio networks," Proceedings of the ACM First International Conference on Mobile Computing and Networking, Berkeley, CA, USA, November 1995, pp. 86-95.
[10] S. Murthy, J.J. Garcia-Luna-Aceves, "An efficient routing protocol for wireless networks," ACM Mobile Networks and Applications, Special Issue on Routing in Mobile Communication Networks 1 (2) (1996), pp. 183-197.
[11] G. Pei, M. Gerla, T.W. Chen, "Fisheye state routing: a routing scheme for ad hoc wireless networks," Proceedings of the 2000 IEEE International Conference on Communications (ICC), New Orleans, LA, June 2000, pp. 70-74.
[12] C.E. Perkins, P. Bhagwat, "Highly dynamic destination-sequenced distance-vector routing (DSDV) for mobile computers," Proceedings of the ACM Special Interest Group on Data Communication, London, UK, September 1994, pp. 234-244.
[13] M. Pushpalatha, R. Venkataraman, and T. Ramarao, "Trust based energy aware reliable reactive protocol in mobile ad hoc networks," World Academy of Science, Engineering and Technology 56, 2009.
[14] N.-C. Wang, Y.-F. Huang, J.-C. Chen, "A stable weight-based on-demand routing protocol for mobile ad hoc networks," Information Sciences, 2007, pp. 5522-5537.
[15] N.-C. Wang, S.-W. Chang, "A reliable on-demand routing protocol," Computer Communications, 2005, pp. 123-135.
[16] N.-C. Wang, C.-Y. Lee, "A reliable QoS aware routing protocol with slot assignment for mobile ad hoc networks," Journal of Network and Computer Applications, Vol. 32, Issue 6, November 2009, pp. 1153-1166.
[17] R. Patil and A. Damodaram, "Cost based power aware cross layer routing protocol for MANET," IJCSNS International Journal of Computer Science and Network Security, Vol. 8, No. 12, December 2008.
[18] G. Nishant and D. Samir, "Energy-aware on-demand routing for mobile ad hoc networks," Lecture Notes in Computer Science, ISSN 0302-743, Springer, International Workshop in Distributed Computing, 2002.
[19] M. Tamilarasi, T.G. Palani Velu, "Integrated energy-aware mechanism for MANETs using on-demand routing," International Journal of Computer, Information, and Systems Science, and Engineering 2:3, www.waset.org, Summer 2008.
[20] M. Pushpalatha, Revathi Venkatraman, "Security in ad hoc networks: an extension of dynamic source routing," 10th IEEE Singapore International Conference on Communication Systems, October 2006, ISBN 1-4244-0411-8, pp. 1-5.
[21] H. Wu and C. Shi, "A trust management model for P2P file sharing system," International Conference on Multimedia and Ubiquitous Engineering, IEEE, 2008.
[22] G. Ghalavand, A. Dana, A. Ghalavand, and M. Rezahoseini, "Reliable routing algorithm based on fuzzy logic for mobile ad hoc networks," International Conference on Advanced Computer Theory and Engineering (ICACTE), 2010.
[23] V. Rishiwal, A. Kush and S. Verma, "Backbone nodes based stable routing for mobile ad hoc networks," UBICC Journal, Vol. 2, No. 3, 2007, pp. 34-39.
2 Department of Computational Engineering and Networking, Amrita School of Engineering, Coimbatore, Tamilnadu, India
3 Department of Computer Science, Amrita School of Engineering, Bangalore, Karnataka, India
Abstract
Computational visual systems face complex processing problems, as there is a large amount of information to be processed and it is difficult to achieve efficiency on par with the human system. In order to reduce the complexity involved in determining the saliency region, the image is decomposed into several parts based on a specified location, and the decomposed part is passed on for the higher-level computations that determine the saliency region, with priority assigned to a specific color in the RGB model depending on the application. These properties are interpreted from the user using Natural Language Processing and then interfaced with vision using a Language Perceptional Translator (LPT). The model is designed for a robot to search for a specific object in a real-time environment without compromising the computational speed in determining the Most Salient Region.
Keywords: Visual Attention, Saliency, Language Perceptional Translator, Vision.

1. Introduction

Visual attention is a mechanism in human perception which selects relevant regions from a scene and provides these regions for higher-level processing such as object recognition. This enables humans to act effectively in their environment despite the complexity of perceivable sensor data. Computational vision systems face the same problem as humans, as there is a large amount of information to be processed. To achieve computational efficiency, perhaps even in real-time robotic applications, the order in which a scene is investigated must be determined in an intelligent way. The term attention is common in everyday language and familiar to everyone. Visual attention is an important biological mechanism which can rapidly help humans to capture the region of interest within the eye's view and filter out the minor parts of the image. By means of visual attention, checking every detail in an image is unnecessary, due to the property of selective processing. Computational Visual Attention (CVA) is an artificial intelligence approach to simulating this biological mechanism. With this mechanism, the feature differences between a region's centre and its surround are emphasized and integrated in a conspicuity map.
Given the complexity of natural language processing and computer vision, few researchers have attempted to integrate them under one approach. Natural language can be used as a source of disambiguation in images, since natural language concepts guide the interpretation of what humans see. The interface between natural language and vision is through a noun phrase recognition system: a system that, given a noun phrase and an image, is able to find the area in the image where what the noun phrase refers to is located. One of the main challenges in developing a noun phrase recognition system is to transform noun phrases (low-level natural language descriptions) into conceptual units of a higher level of abstraction that are suitable for image search. The goal is to understand how linguistic information can be used to reduce the complexity of the task of object recognition. Moreover, integrating natural language processing and vision can be useful for solving individual tasks, such as resolving ambiguous sentences through the use of visual information.
The various related works in the field of computational visual attention models are discussed in Section 2. Section 3 explains the system architecture and the language processing model. Section 4 gives the implementation details with an analysis of the model, followed by the conclusion in Section 5.
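A toy version of the noun-phrase-to-property mapping described above can be sketched as follows; the lexicon entries and the property names are illustrative assumptions, not the paper's actual grounded lexicon:

```python
# Toy grounded lexicon mapping words to (property, value) pairs. The entries
# are invented for illustration; the paper's GLSM is a richer data structure.
LEXICON = {
    "red": ("color", "red"), "green": ("color", "green"),
    "small": ("size", "small"), "large": ("size", "large"),
    "left": ("location", "left"), "top": ("location", "top"),
    "round": ("shape", "round"), "square": ("shape", "square"),
}

def parse_noun_phrase(phrase):
    """Translate a noun phrase into (head noun, perceptual constraints)."""
    words = [w.strip(".,") for w in phrase.lower().split()]
    props = {}
    for w in words:
        if w in LEXICON:
            prop, value = LEXICON[w]
            props[prop] = value
    head = words[-1] if words else ""
    return head, props
```

For "the small red ball", this yields the head noun "ball" plus size and color constraints, which a visual attention stage could then use to bias its search.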
2. Related Work

The various models which identify the salient region are analyzed in this section. Frintrop proposed a Visual Attention System for Object Detection and Goal-Directed Search (VOCUS) [1]. Laurent Itti, Christof Koch and Ernst Niebur [5] proposed an algorithm to identify the saliency region in an image using linear filtering. The authors describe in detail how the feature maps for intensity, orientation and colour are computed. All computations are performed on image pyramids, which enable the detection of features at different scales. Additionally, they propose a weighting function for the weighted combination of the different feature maps, promoting feature maps with few peaks and suppressing those with many. Simone Frintrop, Maria Klodt and Erich Rome [6] proposed a bottom-up algorithm for detection of regions of interest (ROI) in a hierarchical way. The method involves smart feature computation techniques based on integral images without compromising computational speed. Simone Frintrop, Gerriet Bracker and Erich Rome [2] proposed an algorithm in which both top-down and bottom-up approaches are combined in detecting the ROI by enabling the weighting of features. The weights are derived from both target and background properties. The task is to build a map of the environment and to simultaneously stay localized within the map, which serves as visual landmarks for the robot. Simone Frintrop and Markus Kessel proposed a model for Most Salient Region tracking [10], and Ariadna Quattoni [3] proposed a model for object detection using natural language processing, which is used in the system discussed here.
In psychophysics, top-down influences are often investigated by so-called cuing experiments. In these experiments, a cue directs the attention to the target. Cues may have different characteristics: they may indicate where the target will be, or what the target will be. A cue speeds up the search if it matches the target exactly and slows down the search if it is invalid. Deviations from the exact match slow down search speed, although they still lead to faster search compared with a neutral or semantic cue. This is the main motivation behind integrating verbal cues into the attention model to enhance the search speed, which is experimentally verified.

3. System Architecture

The block diagram in Fig. 1 describes the flow of the system. The system architecture comprises two major modules: 1) the Language Perceptional Translator (LPT) [3], and 2) the Visual Attention Model (VAM) [1, 4, 7, 8, 9].

Fig. 1 Visual Attention Model with NLP.

1) LPT: One of the main challenges in developing a noun phrase recognition system is to transform noun phrases (low-level natural language descriptions) into conceptual units of a higher level of abstraction that are suitable for image search. That is, the challenge is to come up with a representation that mediates between noun phrases and low-level image input. The parser processes the sentence and outputs the corresponding properties, such as Location, Color, Size and Shape, for the Thing (object). We must construct a grounded lexical semantic memory that includes perceptual knowledge about how to recognize the things that words refer to in the environment. A grounded lexical semantic memory would therefore connect concepts to the physical world, enabling machines to use that knowledge for object recognition. A GLSM (Grounded Lexical Semantic Memory) is a data structure that stores knowledge about words and their relationships. The goal of the LPT is to transform a noun phrase into perceptual constraints that can be applied to visual stimuli to locate objects in an image. The outputs of the GLSM are given to the VAM at different processing levels: the Location property at the decomposition level, the Color property at Gaussian pyramid construction, and the Size and Shape properties after detection of the salient region, to identify the required object in the image.
2) The Visual Attention Model (VAM) identifies the most attended region in the image. The following sections present the algorithm in detail.

3.1 Visual Attention Model

The first level of bottom-up visual attention shown in Fig. 1 is decomposition of an image based on the location property. We divided the image based on an index method, as shown in Fig. 2, as Top, Left, Right, etc.
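The location-based decomposition step can be sketched as cropping one cell of a grid; the 3x3 grid and its label set are assumptions for illustration, since the paper only states that the image is indexed as Top, Left, Right, etc.:

```python
def crop_region(image, loc):
    """Crop one cell of a 3x3 location grid from an image (a list of rows).

    The 3x3 grid and the label vocabulary are assumed for illustration.
    """
    rows = {"top": 0, "center": 1, "bottom": 2}
    cols = {"left": 0, "middle": 1, "right": 2}
    v, h = loc  # e.g. ("top", "left")
    H, W = len(image), len(image[0])
    r0, c0 = rows[v] * H // 3, cols[h] * W // 3
    return [row[c0:c0 + W // 3] for row in image[r0:r0 + H // 3]]
```

Only the cropped cell is then passed to the more expensive pyramid and center-surround computations, which is the complexity reduction the paper aims for.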
R = r - (g + b)/2        (2)

G = g - (r + b)/2        (3)

B = b - (r + g)/2        (4)

Y = r + g - 2(|r - g| + b)        (5)

Depending on the color property cue from the GLSM, the priority of which color is high and which is low is set on the different color channels red (R), green (G), blue (B) and yellow (Y). The color opponent process is a color theory which states that the human visual system interprets information about color by processing signals from cones and rods in an antagonistic manner. Opponency is thought to reduce redundant information by de-correlating the photoreceptor signals. The theory suggests that there are three opponent channels: red vs. green, blue vs. yellow, and dark vs. white. The responses to the two colors of an opponent channel are antagonistic: one color produces an excitatory effect and the other an inhibitory effect, and the opponent colors are never perceived at the same time (the visual system cannot be simultaneously excited and inhibited). The decision on which color channel to use is based on the color cue.

The outputs of the feature maps are then fed to the center-surround stage. These five channels are fed to the center-surround differences after resizing all the surround images to the center image. Center-surround operations are implemented in the model as the difference between a fine and a coarse scale for a given feature. The center of the receptive field corresponds to a pixel at level c ∈ {2, 3} in the pyramid, and the surround corresponds to the pixel at level s = c + 1; hence we compute three feature maps in the general case. One feature type encodes on/off image intensity contrast, and two encode the red/green and blue/yellow double-opponent channels. The intensity feature type encodes the modulus of the image luminance contrast, that is, the absolute value of the difference between the center scale c and the surround scale s:

I(c, s) = |I(c) ⊖ I(s)|        (6)

RG(c, s) = |(R(c) - G(c)) ⊖ (R(s) - G(s))|        (7)

BY(c, s) = |(B(c) - Y(c)) ⊖ (B(s) - Y(s))|        (8)

The feature maps are then combined into two conspicuity maps, intensity I (9) and color C (10), at the saliency map's scale (σ = 4). These maps are computed through across-scale addition (⊕), where each map is reduced to scale σ and added point-by-point:

I = ⊕_{c=2}^{4} ⊕_{s=c+3}^{c+4} N(I(c, s))        (9)

C = ⊕_{c=2}^{4} ⊕_{s=c+3}^{c+4} [N(RG(c, s)) + N(BY(c, s))]        (10)

The two conspicuity maps are then normalized and summed into the input S to the saliency map (11):

S = N(I) + N(C)        (11)

Here N(.) is the non-linear normalization operator. From the saliency map, the most attended region is identified by finding the maximum pixel value in the salient region. The identification of the segmented region can then be made based on the size and shape properties.

4. Results and Analysis

The developed system is tested on a dataset in which the attention object is a signboard. The signs in the dataset are bike, crossing and pedestrian symbols. The numbers of test samples used for analysis are shown in Table 1. The cues used for the dataset are the location cue, the color cue, and the size and shape cues pertaining to the signboard object. Table 2 shows the verbal cues that best suit the chosen dataset.
0.3.
4. Double the Red component and decrement Green, Blue and Yellow by a factor of 0.3.

For identifying the blue signboards, replace the red color with blue and the blue color with red and repeat the above four steps, as shown in Table 4.

Table 1: Testing samples for signboard detection
Type of Image | Total No. of Images
Bike          | 16
Crossing      | 16
Pedestrian    | 16

Table 2: Cues for the data set
Location         | Color | Size        | Shape     | Thing
Right top corner | Red   | Large/Small | Triangle  | Sign board
Right top corner | Blue  | Large/Small | Rectangle | Sign board
Right top corner | Blue  | Large/Small | Circle    | Sign board

The analysis is done with and without cues. The visual attention model without cues performs NxN (i.e., N^2) computations at each level, whereas with cues, depending on the location property, the number of computations is reduced to N^2/4 or N^2/2 at each level to obtain the region of interest. The priority for a color is chosen by trial and error with different combinations of inhibited and excited channels. The developed system is tested under various scenarios:
a) No verbal cues are given to the system.
b) Only the color property is available.
c) Only the location (region) information is available.
d) Both color and location information are available.

The VAM is tested and compared with the different combinations of cues (only color, only location, both color and location, and no cues), as shown in Table 3.

Table 3: VAM with different combinations of cues (number of images detected)
Images     | Total No. of Images | No Cues | Only Color | Only Location | Both Color and Location
Bike       | 16                  | 3       | 10         | 4             | 15
Crossing   | 16                  | 8       | 7          | 9             | 12
Pedestrian | 16                  | 3       | 15         | 5             | 15

To test the system, the signboard image shown in Fig. 4(a) is given as input to the VAM, and the input to the LPT is the noun phrase "Find the Red color Sign board on Right_top_corner". So here the desired color cue is Red, the location cue is Right_top_corner, and the object is the signboard. The result of the VAM is shown in Fig. 4(b), and the result when the color cue is Blue is shown in Fig. 4(c). The performance with the different priority levels shown in Table 4, for the same color cue, is shown in Fig. 5.

Table 4: Testing the signboard data set with different priority levels (number of correctly detected images with color priority)
Crossing (priority on RED):    R_i & G_d by 50%: 6  | R_i by 70% & G_d by 30%: 4  | R_i by 70% & (G,B,Y)_d by 30%: 10 | Double R & (G,B,Y)_d by 30%: 12
Bike (priority on BLUE):       B_i & Y_d by 50%: 10 | B_i by 70% & Y_d by 30%: 13 | B_i by 70% & (R,G,Y)_d by 30%: 12 | Double B & (R,G,Y)_d by 30%: 15
Pedestrian (priority on BLUE): 10, 12, 13 and 14 for the corresponding four priority schemes.

The symbols R/G/B/Y_i indicate that the Red/Green/Blue/Yellow color priority is increased, and R/G/B/Y_d that it is decreased.
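The computation-count reduction claimed for the location cue can be illustrated with simple arithmetic; this is a toy illustration of the N^2, N^2/2 and N^2/4 figures, not the paper's implementation:

```python
def computations_per_level(n, location_cue=None):
    """Per-level computation count for an n x n image.

    Without a cue the whole image is processed (n*n). A half-image cue
    (e.g. "top") leaves n*n/2, and a quarter-image cue (e.g.
    "right_top_corner") leaves n*n/4, as claimed in the analysis.
    """
    full = n * n
    if location_cue is None:
        return full
    halves = {"top", "bottom", "left", "right"}
    return full // 2 if location_cue in halves else full // 4

n = 512
print(computations_per_level(n))                      # 262144
print(computations_per_level(n, "top"))               # 131072
print(computations_per_level(n, "right_top_corner"))  # 65536
```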
Fig. 6: Performance with different signboard images and with different types of priority color cues.
The combination of the verbal cues that will result in a flexible architecture for visual attention has to be studied extensively with a language interface.

and Electronics Engineering from Bharathiyar University, Coimbatore, India in 1998, and an M.E. in Computer Science and Engineering from Anna University, Chennai, India in 2003. Her research interests include image processing, computer vision and soft computing.
6. References

[1] S. Frintrop, "VOCUS: A Visual Attention System for Object Detection and Goal-directed Search," PhD thesis, Rheinische Friedrich-Wilhelms-Universität Bonn, Germany, 2005. Published 2006 in Lecture Notes in Artificial Intelligence (LNAI), Vol. 3899, Springer Verlag, Berlin/Heidelberg.
[2] S. Frintrop, G. Backer, and E. Rome, "Goal-directed Search with a Top-down Modulated Computational Attention System," in Proc. of the Annual Meeting of the German Association for Pattern Recognition (DAGM 2005), Lecture Notes in Computer Science (LNCS), Springer, 2005, pp. 117-124.
[3] A. Quattoni, "Using Natural Language Descriptions to Aid Object Recognition," PhD thesis, University of Massachusetts, Amherst, Massachusetts, 2003.
[4] S. Frintrop, P. Jensfelt, and H. Christensen, "Attentional Landmark Selection for Visual SLAM," in Proc. of the International Conference on Intelligent Robots and Systems (IROS '06), 2006.
[5] L. Itti, C. Koch, and E. Niebur, "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 11, 1998, pp. 1254-1259.
[6] S. Frintrop, M. Klodt, and E. Rome, "A Real-time Visual Attention System Using Integral Images," in Proc. of the 5th International Conference on Computer Vision Systems (ICVS 2007), Bielefeld, Germany, March 2007.
[7] W.-S. Lin and Y.-W. Huang, "Intention-oriented Computational Visual Attention Model for Learning and Seeking Image Content," Department of Electrical Engineering, National Taiwan University, IEEE, 2009.
[8] S. Frintrop, P. Jensfelt, and H. Christensen, "Attentional Robot Localization and Mapping," in ICVS Workshop on Computational Attention and Applications (WCAA), Bielefeld, Germany, March 2007.
[9] C. Zhao, C. Liu, Z. Lai, Y. Sui, and Z. Li, "Sparse Embedding Visual Attention Model," IEEE, 2009.
[10] S. Frintrop and M. Kessel, "Most Salient Region Tracking," in Proc. of the IEEE International Conference on Robotics and Automation (ICRA '09), Kobe, Japan, May 2009.

Dr. K. P. Soman is the head of CEN, Amrita Vishwa Vidyapeetham, Ettimadai, Coimbatore-641105. His qualifications include a B.Sc. Engg. in Electrical Engineering from REC, Calicut, a P.M. Diploma in SQC and OR from ISI, Calcutta, an M.Tech. in Reliability Engineering from IIT Kharagpur, and a PhD in Reliability Engineering from IIT Kharagpur. Dr. Soman held the first rank and the institute silver medal for the M.Tech. at IIT Kharagpur. His areas of research include optimization, data mining, signal and image processing, neural networks, support vector machines, cryptography and bio-informatics. He has over 55 papers in national and international journals and proceedings. He has conducted various technical workshops in India and abroad.

Padmakar Reddy S. is a postgraduate student at the Amrita School of Engineering, Bangalore, Karnataka. His qualifications include a B.Tech. in Electronics and Communication Engineering from the Madanapalli Institute of Technology & Sciences, Madanapalli, Andhra Pradesh, India. His research interests include image processing and embedded systems.
2 Computer Applications Department, Madhav Institute of Technology and Science, Gwalior, M.P., India
3 Computer Applications Department, Samrat Ashok Technological Institute, Vidisha, M.P., India
step. This is done because vertical mining offers natural pruning of unrelated transactions as a result of an intersection. Another characteristic of vertical mining is the utilization of the autonomy of classes, where each frequent item is a class that contains a set of frequent k-itemsets (where k > 1) [6]. The vertical arrangement appears to be a natural choice for achieving association rule mining's purpose of discovering associated items. Computing the supports of itemsets is simpler and quicker with the vertical arrangement, since it involves only the intersections of tid-lists or tid-vectors, operations that are well supported by current database systems. In contrast, complex hash-tree data structures and functions are required to perform the same function for flat layouts. There is an automatic reduction of the database before each scan, in that only those itemsets that are significant to the following scan of the mining process are accessed from disk. In the horizontal arrangement, however, irrelevant information that happens to be part of a row in which useful information is present is also transferred from disk to memory. This is because database reductions are comparatively hard to implement in the horizontal arrangement. Further, even if reductions were possible, the irrelevant information can be removed only in the scan following the one in which its irrelevance is exposed. Therefore, there is always a reduction delay of at least one scan in the horizontal layout.

A simple observation is that if we generate pairs of transactions instead of item ids, then when the number of attributes is much larger than the number of transactions the result is obtained very fast. Recently, several works have proposed new ways to mine patterns in transposed databases, where a database has thousands of attributes but only tens of objects [15]. In this case, mining over transaction pairs runs through a smaller search space. No algorithm filters or reduces the database in each pass of the Apriori algorithm to count the support of pruned candidate patterns from the database. Most of the preceding work on vertical mining concentrates on the intersection of transactions [12]. This is based on the intersection of vertical tid-vectors, where the database is a set of columns, each column storing an item id and a bit-vector of 1s and 0s to represent the presence or absence, respectively, of the item in the set of customer transactions. A list-based layout takes much less space than the bit-vector approach (which has the overhead of explicitly representing absence) in sparse databases; we make that case in this paper and use a list-based layout [16]. To find intersections we use a dynamic-programming technique instead of the traditional approach: we suggest a novel dynamic algorithm for frequent pattern mining in which we generate transaction pairs, and frequent patterns are found via the longest common subsequence computed with a dynamic function.

The rest of this paper is structured as follows. Section II introduces the problem and reviews some efficient related works. The proposed method is described in Section III. Section IV explains in detail the proposed FPMDF algorithm. A justification with an example is given in Section V. The experimental results and assessment are shown in Section VI. Finally, Section VII contains the conclusions and future work.

2. Frequent Pattern Mining

Frequent itemset mining came from efforts to discover valuable patterns in customers' transaction databases. A customer transaction database is a series of transactions (T = t1, ..., tn), where each transaction is an itemset (ti ⊆ I). An itemset with k elements is known as a k-itemset. In the rest of the paper we make the (practical) assumption that the items are from an ordered set, and transactions are stored as sorted itemsets. The support of an itemset X in T, denoted suppT(X), is the number of transactions that contain X, i.e. suppT(X) = |{tj : X ⊆ tj}|. An itemset is frequent if its support is larger than a support threshold, denoted min_supp. The frequent itemset mining problem is to discover all frequent itemsets in a given transaction database.

The first algorithm proposed for finding frequent itemsets is the Apriori algorithm [1]. This algorithm was later enhanced to obtain the frequent patterns more quickly [2]. The Apriori algorithm employs the downward closure property: if an itemset is not frequent, no superset of it can be frequent either. Apriori performs a breadth-first search in the search space by generating candidate (k+1)-itemsets from frequent k-itemsets. The support of an itemset is computed by counting its occurrences in each transaction. Numerous variants of the Apriori algorithm have been developed, such as AprioriTid, AprioriHybrid, direct hashing and pruning (DHP), the Partition algorithm, and dynamic itemset counting (DIC) [3]. FP-growth [4] is a well-known algorithm that uses the FP-tree data structure to obtain a condensed representation of the database transactions and employs a divide-and-conquer approach to decompose the mining problem into a set of smaller problems. In essence, it mines all the frequent itemsets by recursively determining all frequent 1-itemsets in the conditional pattern base that is efficiently constructed with the help of a node-link structure. In FP-growth-based algorithms, the recursive construction of the FP-tree affects the algorithm's complexity.

Most of the preceding work on association mining has utilized the conventional horizontal transactional database arrangement. However, a number of vertical mining algorithms have been proposed recently for association mining [5, 6, 9, 11, 12]. In a vertical database each item is associated with its corresponding tidset, the set of all transactions (or tids) in which it appears. Mining algorithms using the vertical format have been shown to be very effective and usually do better than horizontal approaches. This advantage stems from the fact that frequent patterns can be counted via tidset intersections, instead of using complex internal data
structures (candidate generation and counting happen in a single step). The horizontal approach, on the other hand, needs complex search/hash trees. Tidsets offer natural pruning of extraneous transactions as a result of an intersection (tids that are not relevant drop out). Furthermore, for databases with lengthy transactions it has been shown using a simple cost model that the vertical approach reduces the number of I/O operations [7]. In a recent study on the integration of database and mining, the Vertical algorithm [8] was shown to be the best approach (better than horizontal) when tightly integrating association mining with database systems. Eclat [9] is the primary algorithm to find frequent patterns by a depth-first search, and it has been shown to perform well. It uses a vertical database representation and counts the support of an itemset by using the intersection of tids. However, the pruning used in the Apriori algorithm is not applicable during candidate itemset generation because of the depth-first search. VIPER [5] uses the vertical database layout and intersections to achieve excellent performance. The only difference is that it uses compressed bitmaps to represent the transaction list of each itemset; however, its compression method has limitations, especially when tids are uniformly distributed. Zaki and Gouda [10] developed a new approach called dEclat using the vertical database representation. They store the difference of tids, called the diffset, between a candidate k-itemset and its prefix (k-1) frequent itemsets, instead of the tids intersection set, denoted here as the tidset. They calculate the support by subtracting the cardinality of the diffset from the support of the prefix (k-1) frequent itemset. This algorithm has been shown to gain significant performance improvements over Eclat. However, diffsets lose their advantage over tidsets when the database is sparse.

Most of the preceding work on mining frequent patterns is based on the horizontal representation. However, recently a number of vertical mining algorithms have been proposed for mining frequent itemsets. Mining algorithms using the vertical representation have been shown to be effective and usually do better than horizontal approaches [11]. This advantage stems from the fact that frequent patterns can be counted via tidset intersections, instead of using complex internal data structures like the hash/search trees that the horizontal algorithms require [10]. The candidate generation and counting phases are done in a single step in vertical mining, because vertical mining offers natural pruning of irrelevant transactions as a result of an intersection. Another characteristic of vertical mining is the utilization of the autonomy of classes, where each frequent item is a class that contains a set of frequent k-itemsets (where k > 1) [6]. The vertical arrangement appears to be a natural choice for achieving association rule mining's objective of discovering correlated items. Computing the supports of itemsets is simpler and faster with the vertical arrangement, since it involves only the intersections of tid-lists or tid-vectors, operations that are well supported by existing database systems. In contrast, complex hash-tree data structures and functions are required to perform the same function for horizontal layouts. There is an automatic reduction of the database before each scan, in that only those itemsets that are significant to the following scan of the mining process are accessed from disk. In the horizontal layout, however, irrelevant information that happens to be part of a row in which useful information is present is also transferred from disk to memory. This is because database reductions are comparatively hard to implement in the horizontal arrangement. Further, even if reductions were possible, the irrelevant information can be removed only in the scan following the one in which its irrelevance is discovered. Therefore, there is always a reduction delay of at least one scan in the horizontal layout. Most of the preceding work on vertical mining concentrates only on the intersection of transactions [12], based on the intersection of vertical tid-vectors, where the database is a set of columns, each column storing an item id and a bit-vector of 1s and 0s to represent the presence or absence, respectively, of the item in the set of customer transactions. A list-based layout takes much less space than the bit-vector approach (which has the overhead of explicitly representing absence) in sparse databases; we make that case in this paper and use a list-based layout [16]. To find intersections we use a dynamic-programming technique instead of the traditional approach: we suggest a novel dynamic algorithm for frequent pattern mining in which we generate transaction pairs, and frequent patterns are found via the longest common subsequence computed with a dynamic function.

3. Dynamic Function

The longest common subsequence problem is one of the classic problems that can be solved efficiently using dynamic programming. In the longest common subsequence problem we are given two sequences X = <x1, x2, ..., xn> and Y = <y1, y2, ..., ym> and wish to find a maximum-length common subsequence of X and Y. For example, if X = <A, B, C, B, D, A, B> and Y = <B, D, C, A, B, A>, then the sequence <B, C, B, A> is a longest common subsequence. Let us define CC[i, j] to be the length of an LCS of the prefixes x1...xi and y1...yj. If either i = 0 or j = 0, one of the sequences has length 0, so the LCS has length 0. The optimal substructure of the LCS problem gives the recursive formula in Fig. 1.
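The CC[i, j] recursion can be sketched directly in a few lines of dynamic programming. This is the textbook LCS algorithm the section describes, not the paper's FPMDF implementation:

```python
def lcs(x, y):
    """Length of and one longest common subsequence of x and y.

    CC[i][j] = 0                           if i == 0 or j == 0
             = CC[i-1][j-1] + 1            if x[i-1] == y[j-1]
             = max(CC[i-1][j], CC[i][j-1]) otherwise
    """
    n, m = len(x), len(y)
    cc = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                cc[i][j] = cc[i - 1][j - 1] + 1
            else:
                cc[i][j] = max(cc[i - 1][j], cc[i][j - 1])
    # backtrack to recover one LCS (ties may pick a different, equally long one)
    out, i, j = [], n, m
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif cc[i - 1][j] >= cc[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return cc[n][m], list(reversed(out))

length, seq = lcs("ABCBDAB", "BDCABA")
print(length, seq)  # length 4 and one maximum-length subsequence, e.g. B, C, B, A
```

The section's example (X = ABCBDAB, Y = BDCABA) indeed yields length 4; which length-4 subsequence is returned depends on how ties are broken during backtracking.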
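Support counting in the vertical layout discussed in Section 2 can be sketched with plain tid-list intersections. This is a minimal toy example (the transactions and min_supp value are illustrative), not the paper's FPMDF algorithm:

```python
from itertools import combinations

# horizontal transactions -> vertical tid-lists
transactions = {
    1: {"a", "b", "c"},
    2: {"a", "c"},
    3: {"a", "d"},
    4: {"b", "c", "e"},
}

tidlists = {}
for tid, items in transactions.items():
    for item in items:
        tidlists.setdefault(item, set()).add(tid)

def support(itemset):
    """supp(X) = |{t : X subset of t}| via intersection of tid-lists."""
    tids = None
    for item in itemset:
        tids = tidlists[item] if tids is None else tids & tidlists[item]
    return len(tids)

min_supp = 2
frequent_pairs = [
    set(p) for p in combinations(sorted(tidlists), 2) if support(p) >= min_supp
]
print(frequent_pairs)  # [{'a', 'c'}, {'b', 'c'}]
```

Note how the support of a 2-itemset falls out of a single set intersection, with no per-transaction scan; this is the advantage the vertical layout discussion emphasizes.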
TABLE IV. C3    TABLE V. L3
2,3 Department of Computer Science and Engineering, Anna University of Technology, Tiruchirappalli, Tamil Nadu, India
algorithms devised for the purposes outlined above. Section IV presents conclusions.

Decision support systems are defined as interactive computer-based systems intended to help decision makers utilize data and models in order to identify problems, solve problems and make decisions. They incorporate both data and models, and they are designed to assist decision makers in semi-structured and unstructured decision-making processes. They provide support for decision making; they do not replace it. The mission of decision support systems is to improve the effectiveness, rather than the efficiency, of decisions [19]. Chen argues that the use of data mining helps institutions make critical decisions faster and with a greater degree of confidence. He believes that the use of data mining lowers the uncertainty in the decision process [20]. Lavrac and Bohanec claim that the integration of data mining can lead to improved performance of DSS and can enable the tackling of new types of problems that have not been addressed before. They also argue that the integration of data mining and decision support can significantly improve current approaches and create new approaches to problem solving, by enabling the fusion of knowledge from experts and knowledge extracted from data [19].

2. Overview of related work

Up to now, several studies have been reported that focus on medical diagnosis. These studies have applied different approaches to the given problem and achieved high classification accuracies, of 77% or higher, using the datasets taken from the UCI machine learning repository [1]. Here are some examples:
Robert Detrano's [6] experimental results showed a correct classification accuracy of approximately 77% with a logistic-regression-derived discriminant function.
John Gennari's [7] CLASSIT conceptual clustering system achieved 78.9% accuracy on the Cleveland database.
L. Ariel [8] used Fuzzy Support Vector Clustering to identify heart disease. This algorithm applied a kernel-induced metric to assign each piece of data, and experimental results were obtained using a well-known heart disease benchmark.
For ischemic heart disease (IHD), Support Vector Machines serve as excellent classifiers and predictors and can do so with high accuracy; here, a tree-based classifier uses non-linear proximal support vector machines (PSVM).
Polat and Gunes [18] designed an expert system to diagnose diabetes based on principal component analysis. Polat et al. also developed a cascade learning system to diagnose diabetes.
Campos-Delgado et al. developed a fuzzy-based controller that incorporates expert knowledge to regulate the blood glucose level. Magni and Bellazzi devised a stochastic model to extract variability from a self-monitoring blood sugar level time series [17].
Diaconis and Efron (1983) developed an expert system to classify the hepatitis of a patient, using computer-intensive methods in statistics.
Cestnik, Kononenko and Bratko designed a knowledge-elicitation tool for sophisticated users in the diagnosis of hepatitis.

3. Analysis and results

3.1 About the Datasets

The aim of the present study is the development and evaluation of a clinical decision support system for the treatment of patients with heart disease, diabetes and hepatitis. According to one survey, heart disease is the leading cause of death in the world every year. Just in the United States, almost 930,000 people die of it, and its cost is about 393.5 billion dollars. Heart disease, which is usually called coronary artery disease (CAD), is a broad term that can refer to any condition that affects the heart. Many CAD patients have symptoms such as chest pain (angina) and fatigue, which occur when the heart isn't receiving adequate oxygen. Nearly 50 percent of patients, however, have no symptoms until a heart attack occurs.

Diabetes mellitus is a chronic disease and a major public health challenge worldwide. According to the International Diabetes Federation, there are currently 246 million diabetic people worldwide, and this number is expected to rise to 380 million by 2025. Furthermore, 3.8 million deaths are attributable to diabetes complications each year. It has been shown that 80% of type-2 diabetes complications can be prevented or delayed by early identification of people at risk. The American Diabetes Association [2] categorizes diabetes into type-1 diabetes [17], which is normally diagnosed in children and young adults, and type-2 diabetes, i.e., the most common form of diabetes, which originates from a progressive insulin secretory defect so that the body does not produce adequate insulin or the insulin does not affect the cells. Either the fasting plasma glucose (FPG) test or the 75-g oral glucose tolerance test (OGTT [19]) is generally appropriate to screen for diabetes or pre-diabetes.

Hepatitis, a liver disorder, requires continuous medical care and patient self-management education to prevent acute complications and to decrease the risk of long-term complications. It is accompanied by anorexia (loss of appetite) and an increased level of alkaline phosphatase. The disease can be classified into hepatitis A,
B, etc. All the datasets used in this study are taken from the UCI KDD Archive [1].

3.2 Experimental Data

4  Steroid     no, yes
5  Antivirals  no, yes
6  Fatigue     no, yes
7  Malaise     no, yes
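The categorical attributes listed for the hepatitis data (Steroid, Antivirals, Fatigue, Malaise, each no/yes) can be encoded as binary features before being fed to a decision-tree learner. The attribute names follow the listing above; the encoding itself is an assumed preprocessing step, not a procedure the paper spells out:

```python
# Binary encoding of the yes/no hepatitis attributes listed above.
ATTRIBUTES = ["Steroid", "Antivirals", "Fatigue", "Malaise"]

def encode(record):
    """Map 'no'/'yes' values to 0/1 in a fixed attribute order."""
    return [1 if record[a].lower() == "yes" else 0 for a in ATTRIBUTES]

patient = {"Steroid": "no", "Antivirals": "yes", "Fatigue": "yes", "Malaise": "no"}
print(encode(patient))  # [0, 1, 1, 0]
```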
The algorithms can be examined by the confusion matrices they produce. We employed four performance measures: precision, recall, F-measure and ROC space [5]. A confusion matrix (sometimes called a contingency table) is obtained to calculate the four measures. The confusion matrix is a matrix representation of the classification results: it contains information about the actual and predicted classifications done by a classification system. One cell denotes the number of samples classified as true that were actually true (TP), and another the number of samples classified as false that were actually false (TN). The other two cells denote the numbers of misclassified samples: the cell denoting the number of samples classified as false that were actually true (FN), and the cell denoting the number of samples classified as true that were actually false (FP). Once the confusion matrices are constructed, the precision, recall and F-measure are easily calculated as:

Recall = TP / (TP + FN)        (1)

Precision = TP / (TP + FP)        (2)

F_measure = 2*TP / (2*TP + FP + FN)        (3)

Less formally, precision measures the percentage of actual patients (i.e. true positives) among the patients declared diseased; recall measures the percentage of the actual patients that were discovered; and the F-measure balances precision and recall. A ROC (receiver operating characteristic [5]) space is defined by the false positive rate (FPR) and true positive rate (TPR) as the x and y axes respectively, and depicts the relative tradeoff between true positives and false positives:

TPR = TP / (TP + FN)        (4)

FPR = FP / (FP + TN)        (5)

Table 5: Confusion matrix of the ID3 algorithm on the hepatitis dataset
TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area | Class
0.719   | 0.314   | 0.742     | 0.719  | 0.73      | 0.719    | Yes
0.686   | 0.281   | 0.66      | 0.686  | 0.673     | 0.68     | No

Table 6: Confusion matrix of the ID3 algorithm on the diabetes dataset
TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area | Class
0.582   | 0.154   | 0.67      | 0.582  | 0.623     | 0.767    | Yes
0.846   | 0.418   | 0.791     | 0.846  | 0.817     | 0.767    | No

C4.5 Algorithm

At each node of the tree, C4.5 [15] chooses the attribute of the data that most effectively splits its set of samples into subsets enriched in one class or the other. Its criterion is the normalized information gain (difference in entropy) that results from choosing an attribute for splitting the data. The attribute with the highest normalized information gain is chosen to make the decision. C4.5 [16] made a number of improvements to ID3. Some of these are:
a. Handling both continuous and discrete attributes: C4.5 creates a threshold and then splits the list into those whose attribute value is above the threshold and those that are less than or equal to it.
b. Handling training data with missing attribute values.
c. Handling attributes with differing costs.
d. Pruning trees after creation: C4.5 [16] goes back through the tree once it has been created and attempts to remove branches that do not help, replacing them with leaf nodes.

The three medical datasets are run against the C4.5 algorithm and the results are indicated in Tables 7, 8 and 9 respectively. One of these C4.5 confusion matrices:

TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area | Class
0.597   | 0.186   | 0.632     | 0.597  | 0.614     | 0.751    | Yes
0.814   | 0.403   | 0.79      | 0.814  | 0.802     | 0.751    | No
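The four measures in equations (1)-(5) can be computed directly from the confusion-matrix counts; the counts used below are illustrative, not taken from the paper's tables:

```python
def metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure, TPR and FPR from confusion-matrix
    counts, following equations (1)-(5)."""
    recall = tp / (tp + fn)                   # (1); also the TPR of (4)
    precision = tp / (tp + fp)                # (2)
    f_measure = 2 * tp / (2 * tp + fp + fn)   # (3)
    fpr = fp / (fp + tn)                      # (5)
    return precision, recall, f_measure, fpr

# illustrative counts (not from the paper's tables)
precision, recall, f_measure, fpr = metrics(tp=80, fp=20, fn=20, tn=80)
print(precision, recall, f_measure, fpr)  # 0.8 0.8 0.8 0.2
```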
CART Algorithm
Compared with C4.5, the run time complexity of CART is satisfactory.

Table 13: Prediction accuracy
S.No  Name of algorithm  Accuracy %
1     CART Algorithm     83.2
2     ID3 Algorithm      64.8
3     C4.5 Algorithm     71.4

In this research we found 83.184% accuracy with the CART algorithm, which is greater than the earlier results for ID3 and C4.5, as indicated in Table 13.

4. Conclusions

The decision-tree algorithm is one of the most effective classification methods; the data judge the efficiency and correctness of each algorithm. We used 10-fold cross validation to compute the confusion matrix of each model, and then evaluated performance using precision, recall, F-measure and ROC space. As expected, CART showed the best performance among the tested methods. The results shown here make clinical application more accessible, which will be a great advance in the diagnosis of CAD, hepatitis and diabetes. The survey covered the decision tree algorithms ID3, C4.5 and CART, their steps for processing data, and their running complexity. It can be concluded that, among the three algorithms, CART performs better in the quality of the rules generated and in accuracy; that is, CART is better at induction and rule generalization than ID3 and C4.5. Finally, the results are stored in the decision support repository. Although the knowledge base is currently focused on a narrow set of diseases, the approach has been validated through the case study, and it is possible to expand the scope of modelled medical knowledge. Furthermore, to improve decision support, interactions between the different medications that a patient is on should also be considered.
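The evaluation described in the conclusion rests on per-class metrics computed from a confusion matrix. As a minimal sketch (the counts below are hypothetical, not the paper's actual hepatitis counts), the precision, recall and F-measure reported in tables such as Table 5 can be computed as:

```python
# Illustrative only: per-class precision, recall and F-measure from the
# entries of a confusion matrix, as used to build tables like Table 5.
def class_metrics(tp, fp, fn):
    precision = tp / (tp + fp)          # fraction of predicted positives that are correct
    recall = tp / (tp + fn)             # fraction of actual positives that are found
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical counts for one class; not taken from the paper's data.
p, r, f = class_metrics(tp=24, fp=12, fn=11)
print(round(p, 3), round(r, 3), round(f, 3))  # → 0.667 0.686 0.676
```

The same computation, applied to each class of the 10-fold cross-validated predictions, yields the detailed-accuracy rows shown for ID3, C4.5 and CART.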
Internet, social networking, Web 2.0, Facebook, YouTube, blogs: all of these are relatively new words in the political vocabulary, new concepts, new media and new opportunities for the transmission of ideas and messages, yet they are not sufficiently used as channels for communicating with the public. Although the practice of using the Internet in local political advertising goes back to the nineties, only in recent years has the advent of new tools and social networks demonstrated the true strength of this medium.

Besides direct access to the public with political ideas, the Internet allows full-force confrontation, and it also provides relatively convenient ground for reviewing public attitudes and for the research and development of certain ideas. Using such a change in social communication, the transmission of political ...

In Macedonia, the network and politics are still not together, except in the case of international organizations. The Internet is not fully incorporated into political communication (or, more precisely, it is not done properly). A key condition for this is the application of technology together with a simultaneous transformation of consciousness. This change requires rejecting the principle of confidentiality as a condition of the political activity of government and party, because it is absolutely contrary to the nature of the Internet. It is also necessary to strengthen awareness of the importance of the on-line crystallization of public opinion, and to connect the on-line and off-line political stages more intensively and better.
4. Impact of Social Network Analysis in Politics

Political communication is a new and exciting area of research and teaching located at the crossroads of the study of communication, political parties and electoral behaviour. As well as profiling the changing nature of the media system, such an approach invariably leads us to what we term the new political communication: that based around the new Information and Communication Technologies (ICTs). We examine the work that has been done on the uses of the new media by parties and politicians across a range of democratic contexts, and offer some insights into the strong challenges they introduce for the established producers of political communication.

One of the key uses of the Internet is to build databases of voter data and access them through different applications for different purposes. Because data entry can easily be done automatically, by scanners, or by hand, more campaigns and political operatives are recognizing the importance of capturing, storing, analyzing and using voter information. What used to take days of analysis can now take minutes using computers. That data can also be used offline or online in a number of different ways, and these systems have become key components of the political system.

Throughout history, political campaigns have evolved around the advancing technologies available to candidates. As technology develops, candidates are able to permeate the lives of citizens on a daily basis. Television, radio, newspapers, magazines, billboards, yard signs, bumper stickers, and Internet websites all create a means of spreading political platforms.

While the traditional forms of media are still an integral portion of campaign strategy, the availability of the Internet opens a door of campaign tools awaiting candidates' attention. The Internet provides numerous opportunities for politicians to reach the polity. Among these is the new phenomenon of social networking websites, which have gained popularity in the last few years, particularly on college campuses. Specifically, social networking websites such as MySpace and Facebook have provided users with a new form of communication. When new forms of communication are made available, political candidates begin to use the new technology to their advantage. What social networking websites allow politicians to do is create a sense of personalized communication with their constituents. This personalization of politics enables voters and politicians alike to feel as though a connection is made. The Internet can make direct communication possible among government officials, candidates, parties, and citizens. As history shows us, when new technologies are made available, they begin to reshape the personalization factor between the candidate and the voter. This increase in interpersonal interactivity has been shown to offer opportunities and increase success for political campaigns.

5. Political Parties in the Republic of Macedonia

5.1. Overview of the political system

Macedonia is a Republic with a multi-party parliamentary democracy and a political system with a strict division into legislative, executive and judicial branches. From 1945 Macedonia had been a sovereign Republic within Federal Yugoslavia, and on September 8, 1991, following a referendum of its citizens, Macedonia was proclaimed a sovereign and independent state. The Constitution of the Republic of Macedonia was adopted on November 17, 1991, by the first multiparty parliament. The basic intention was to constitute Macedonia as a sovereign and independent, civil and democratic state, and to create an institutional framework for the development of parliamentary democracy, guaranteeing human rights, civil liberties and national equality.

The Assembly is the central and most important institution of state authority. According to the Constitution it is a representative body of the citizens, and the legislative power of the Republic is vested in it. The Assembly is composed of 120 seats.

The President of the Republic of Macedonia represents the Republic and is Commander-in-Chief of the Armed Forces of Macedonia. He is elected in general and direct elections for a term of five years, and for two terms at most.

Executive power of the Republic of Macedonia is bicephalous, divided between the Government and the President of the Republic. The Government is elected by the Assembly of the Republic of Macedonia by a majority vote of the total number of Representatives, and is accountable for its work to the Assembly. The organization and work of the Government are defined by a law on the Government.

In accordance with its constitutional competencies, executive power is vested in the Government of the Republic of Macedonia. It is the highest institution of the state administration and has, among others, the following responsibilities: it proposes laws, the budget of the Republic and other regulations passed by the Assembly; it determines the policies of execution of laws and other regulations of the Assembly and is responsible for their execution; it decides on the recognition of states and governments; and it establishes diplomatic and consular relations ...
6.3. Offered content

The table shows that, in terms of interaction, political parties did not use the opportunities of the new media field. Besides basic information such as postal address, phone and email address, no other method is used. The sites of some political parties have disabled the opportunity to contact them via e-mail or a form, offering only the traditional ways of communication (telephone and letter).

For the transparency of a website it is necessary to make the number of visitors to the site visible, which was also left out of most of the political parties' websites. Only a very small part of the political parties included counters on their websites, whether public or used only by the administrators of the website. This means that political ...

Fig. 5: Textual content of all political party web sites

Table 5: Visitors counter on political web pages

From the table we can conclude that almost 92% of political parties have no counters on their websites; as a consequence of this lack of counters, we cannot say anything with certainty about the attendance (visitors) of the websites of certain political parties.

Almost all political parties have used a CMS (Content Management System) to build their websites, so they meet the basic rules for website usability. However, it is an astonishing fact that, despite meeting the technical specifications for usability, the sites have errors that are not inherent to the platforms used, for example a search box that does not work properly, and the like. In terms of recommendations for visibility to search engines,
Some differences between the American and Macedonian party systems should also be noted. A national party in the USA and a national party in Macedonia are quite simply different entities. Population and territorial size, as well as diversity, place different demands on local networking and autonomy, as well as on effective coordination and communication between the localities. American parties also have a much looser structure, with relatively few members and dormant local branches. Macedonian parties, on the other hand, are still relatively strong organizations, less reliant on ad hoc networking. Thirdly, American elections are candidate-centred, in contrast to the party-centred approach found in Macedonia. These differences may be reduced over time, as Macedonia, along with other European countries, approaches a model with decoupled local branches, fewer members and more focus on individual leaders. But they are still significant enough to warrant the question of whether Web 2.0 is more functional for American parties, and therefore more rational to use for winning elections, exactly because these parties are more like network parties in the first place.

As such, it may seem like a paradox that it is the SDSM and VMRO-DPMNE which have most fully embraced Web 2.0. They are among the oldest parties and probably still have the most effective and vital party organizations. However, this also means that such a party has the resources and structure to implement its Web 2.0 presence effectively, provided the party leadership thinks it necessary. It is another useful media channel for communicating with members and voters.

The Internet is a unique forum for politics, as it provides back-and-forth communication and allows for an exchange of information between users and sources. The Internet also offers its users greater access to information and the ability to express themselves in various online political arenas. In addition, individuals use the Internet as a tool to find and join groups that share their ideological, cultural, political and lifestyle preferences.

S. Emruli received his bachelor degree from the Faculty of Communication Sciences and Technologies in Tetovo, SEE University (2006), and his MSc degree from the Faculty of Organization and Informatics, Varaždin (2010). He currently works as a professional IPA Advisor at the Ministry of Local Self-Government in Macedonia.

M. Bača is currently an Associate Professor at the University of Zagreb, Faculty of Organization and Informatics. He is a member of various professional societies and program committees, and is a reviewer for several international journals and conferences. He is also the head of the Biometrics centre in Varaždin, Croatia. He is the author or co-author of more than 70 scientific and professional papers and two books.
School of Computer Science & IT,
Devi Ahilya Vishwavidyalaya,
Indore, MP, India
community was not satisfied by UML, and created a new Business Process Modelling Notation (BPMN), which has become an OMG standard as well. In many cases, they also apply Integration Definition for Function Modeling (IDEF) notations [12, 13]. In domain analysis, analysts continue to apply the old-style Entity Relationship (ER) notation, which has been popular in database design since the 70s [14]. Significant attention is paid to business goals, business rules, business object lifecycles, and business roles and processes in the organization, which can also be modelled using UML [15, 16].

Real-time and embedded system developers have also come up with a different flavour of UML: the Systems Modelling Language (SysML). It defines a requirements diagram and enables capturing various non-functional and detailed functional requirements [17]. It also establishes specific links between requirements and other elements. The most popular requirements textbooks introduce various diagrams based on both UML and other informal notations, e.g. the system context diagram and hand-drawn user interface prototypes [11, 18]. The mentioned requirements artefacts can be modelled using UML. Since UML is a general-purpose modelling language with more than 100 modelling elements (UML meta classes) and without a standardized method, practitioners apply it only fragmentally, and at the same time they do not make use of its powerful capabilities to define consistent, integrated requirement models. We focus on the details of a specific part of the framework by applying UML concepts for requirements modelling.

Most requirement documents are written in natural languages and represented in less structured and imprecise formats. Artifacts created in the requirement phase and beyond thus carry knowledge of the requirements that is hard to understand [17, 18]. The lack of a framework for guiding requirements models is one of the main issues. In the academic community, researchers propose many detailed and focused requirements development methods [20, 21]. However, most of these methods resulting from academic research are too complex for practical application and solve only specific, specialized issues. A simple and adaptable framework for requirements modelling, with demonstrated examples created using available tools on a realistic case study, gives much more value to practitioners.

We have proposed a requirements modelling framework using UML concepts for model-driven software development, which is shown in Figure 1. This framework consists of five major phases, namely: feasibility study; requirement collection and specification; analysis of business requirements; system requirement modelling; and system design. Further, analysis of business requirements includes business conception and association, business object life cycle, and business tasks and methods, while system requirement modelling incorporates actors, use cases and their scenarios. The following subsections discuss each phase of the proposed framework with the help of UML diagrams and examples.

[Figure 1: Requirements modelling framework — feasibility study; analysis of business requirements (business object life cycle, business tasks and methods); system requirement modelling (actors, use cases, use case scenario/examples); with modellers' comments and analysis and design feeding the phases.]
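As a rough illustration (purely illustrative, not part of the paper's tooling), the five phases and their sub-activities listed above can be recorded as a simple ordered structure:

```python
# Illustrative only: the five phases of the proposed framework, as listed in
# the text, with the sub-activities named for two of the phases.
FRAMEWORK = {
    "Feasibility study": [],
    "Requirement collection and specification": [],
    "Analysis of business requirements": [
        "Business conception and association",
        "Business object life cycle",
        "Business tasks and methods",
    ],
    "System requirement modelling": ["Actors", "Use cases", "Use case scenarios"],
    "System design": [],
}

# Print the phases in order, with their sub-activities indented.
for i, (phase, activities) in enumerate(FRAMEWORK.items(), start=1):
    print(f"{i}. {phase}")
    for activity in activities:
        print(f"   - {activity}")
```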
objectively and rationally uncover the strengths and weaknesses of the existing business or proposed venture, the opportunities and threats presented by the environment, the resources required to carry it through, and ultimately the prospects for success. In the simplest terms, the two criteria for judging feasibility are the cost required and the value to be attained. As such, a well-designed feasibility study should provide a historical background of the business or project, a description of the product or service, accounting statements, details of the operations and management, marketing research and policies, financial data, legal requirements and tax obligations. Generally, feasibility studies precede technical development and project implementation.

2.2 Requirement elicitation, collection and specification

The requirement elicitation and development phase mainly focuses on examining and gathering desired requirements and objectives for the system from different viewpoints (e.g., customers, users, constraints, the system's operating environment, trade, marketing, standards, etc.). The requirements elicitation phase begins with identifying the stakeholders of the system and collecting raw requirements from various viewpoints. Raw requirements are requirements that have not been analysed and have not yet been written down in a well-formed requirement notation. The elicitation phase aims to collect various viewpoints such as business requirements, customer requirements, user requirements, constraints, security requirements, information requirements, standards, etc.

Typically, the specification of system requirements starts with observing and interviewing people [1, 2, 3]. Furthermore, user requirements are often misunderstood because the system analyst may misinterpret the users' needs. In addition to requirements gathering, standards and constraints also play an important role in systems development. The development of requirements may be contextual. Requirement engineering is a process of collecting requirements from the customer and the environment in a systematic manner. The system analyst collects raw requirements, then performs detailed analysis and receives feedback. Thereafter, these outcomes are compared with the technicalities of the system to produce the good and necessary requirements for software development [3].

A software requirement specification (SRS) document is produced after the successful identification of requirements. It describes the product to be delivered rather than the process of its development. It also includes a set of use cases that describe all the interactions that users will have with the system/software [2]. In addition to use cases, the SRS also contains non-functional requirements: requirements which impose constraints on the design or implementation. The SRS is a comprehensive description of the intended purpose and environment of the software under development. It fully describes what the software will do and how it will be expected to perform. An SRS minimizes the time and effort required by developers to achieve the desired goals and also minimizes the development cost. A good SRS defines how an application will interact with system hardware, other programs and users in a wide variety of real-world situations. Parameters such as operating speed, response time, availability, portability, maintainability, footprint, security and speed of recovery from adverse events are evaluated in the SRS.

2.3 Analysis of business requirements

Many organizations have already established their own procedures and methodologies for conducting business requirements analysis, which may have been optimized specifically for the organization. The main activities for analysing business requirements are identifying business conception and association, determining the business object life cycle, and identifying business tasks and methods. If such procedures exist, we can use them. However, we must follow the following factors to create requirement models:

(A) Identification of key stakeholders: The first step towards requirement analysis and collection is the identification of the key people who will be affected by the project, such as the project's sponsor, responsible users and clients; these may be internal or external. Then identify the end users, who will use the solution, product, or service. The project is intended to meet their needs, so their inputs must be considered.

(B) Capture stakeholder requirements: Another step in the analysis of business requirements is capturing the requirements from stakeholders. In this approach, the requirement engineer asks stakeholders or groups of stakeholders for their requirements for the new product or service, drawing on various sources.

(C) Categorize requirements: Requirements can be classified into four categories to make analysis easier for software design:

Functional requirements (FR): FRs define how a product/service/solution should function from the end-user's perspective. They describe the features and functions with which the end-user will interact directly.
Operational requirements (OR): ORs are operations that must be carried out in the background to keep the product or process functioning over a period of time.

Technical requirements (TCR): TCRs define the technical issues that must be considered to successfully implement the process or create the product.

Transitional requirements (TSR): TSRs are the steps needed to implement the new product or process smoothly; they indicate how the requirements behave as a consequence of external requirements.

(D) Interpret and record requirements: Once we have gathered and categorized all requirements, determine which requirements are achievable and how the system or ...

Business conception and association: Different methodologies have been proposed by various researchers for business conception and association techniques, but researchers still disagree on where the development of business information systems should begin [10]. In our proposed research, the starting point should be business concept analysis and the analysis of their relationships, as shown in Figure 2. For this purpose we can apply a simple organisational working model, using only classes with names and without more detailed information, and associations with names and role multiplicities. Such models are discussed by business analysts and domain experts, who are usually not familiar with object-oriented analysis and design.

Therefore, it is very important that all the other elements of the model, such as aggregations, compositions, generalizations, interfaces, enumerations, etc., should not be used for conceptual analysis. Keeping it simple enables even UML novices to understand it after a little explanation. Additionally, we can provide textual descriptions for each of these concepts and generate printable or navigable domain vocabularies. We believe this should be the first artifact, since it sets up the vocabulary which should be used for defining the other requirement model elements, use cases, etc.

Organizations have business rules for managing business objects. In many cases, business rules regulate how important business objects change states, and they are applicable only when the object is in a particular state. Requirement modelling is one of the important tools for analysing and understanding these changes. The states also serve as part of the terminology which will be used in other business and requirements models. State machine diagrams should be created only for those business concepts that have dynamic behaviour.
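The four requirement categories from subsection 2.3 (FR, OR, TCR, TSR) can be sketched as a small data model. This is an illustrative sketch only; the requirement texts and identifiers are hypothetical, not taken from the case study:

```python
# Illustrative sketch: recording elicited requirements under the four
# categories described in subsection 2.3(C).
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    FR = "functional"      # end-user-facing features and functions
    OR = "operational"     # background operations keeping the product running
    TCR = "technical"      # technical issues for successful implementation
    TSR = "transitional"   # steps for a smooth roll-out

@dataclass
class Requirement:
    rid: str
    text: str
    category: Category

# Hypothetical entries, for illustration only.
reqs = [
    Requirement("R1", "Supplier can reserve an available product", Category.FR),
    Requirement("R2", "Nightly job flags overdue issues", Category.OR),
]

# Filter one category, as an analyst might when preparing the FR section.
functional = [r.rid for r in reqs if r.category is Category.FR]
print(functional)  # → ['R1']
```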
In business modelling, for transition triggers most people use informal signals that in most cases correspond to ... A supplier is assigned a unique id, and after that ... In Figure 4 we show the inventory processes with the ...

[Figure 4: inventory process state machine; recoverable labels include Make request, Product Reservation, Available, Not available, One week, Contact librarian, Product Issue, Product return, Issued, Damaged.]
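A state machine like the one in Figure 4 can be sketched as a transition table. The states and events below are assumptions read off the figure's surviving labels, not the authors' exact model:

```python
# Illustrative sketch (state and event names assumed from Figure 4's labels):
# a minimal state machine for a product's life cycle in the inventory example.
TRANSITIONS = {
    ("Available", "make_request"): "Reserved",
    ("Reserved", "issue"): "Issued",
    ("Issued", "return"): "Available",
    ("Issued", "damage"): "Damaged",
}

def step(state, event):
    """Apply one event; business rules forbid events outside the current state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state!r}")

# Walk one reserve/issue/return cycle.
state = "Available"
for event in ("make_request", "issue", "return"):
    state = step(state, event)
print(state)  # → Available
```

The point of the sketch is the one the text makes: a rule such as "a damaged product cannot be issued" is expressed simply by the absence of that transition.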
automation or refactoring [10]. For software developers it Figure 5: Issueing the product for supplier
is important to know which parts in target business
processes the software system should implement or support.

2.4 System requirements modelling using case study

Requirement modelling is an important activity in the process of designing and managing enterprise architectures. Requirements modelling helps to understand, structure and analyse the way business requirements are related to Information Technology requirements, and vice versa, thereby facilitating business-IT alignment. It includes actors, use cases and use case scenarios; each of these is further described in the following subsections.

Actors: An actor is a user or external system with which the system being modelled interacts. For example, our inventory management system involves various types of users, including the supplier, the inventory management system itself, human resources, and the manufacturer; all of these users are actors. An actor is thus external to the system with which it interacts; it may be a human user or another system, and it has goals and responsibilities to satisfy in interacting with the system. It is also necessary to identify the actors when producing a compact requirement specification model that incorporates the package details diagram, showing package use cases, their associations with actors, and relationships between use cases. Activities should be nested within appropriate use cases and assigned as their behaviours. Finally, we describe use cases according to pre-defined templates, e.g. the Rational Unified Process use case document; the actors are shown in Figure 5.

[Figure 5 (use case diagram labels): actors Supplier, Inventory System and Manufacturer; use cases Find product, Make request, Make the reservation, Review supplier profile, Register issued product, Register product return, Penalize supplier, connected by Include and Extend relationships.]

Use case and use case scenario: A use case in software engineering and systems engineering is a description of a potential series of interactions between a software module and an external agent which lead the agent towards something useful. A use case diagram in the UML is a type of behavioral diagram defined by and created from a use-case analysis. Its purpose is to present a graphical overview of the functionality provided by a system in terms of actors, their goals, and any dependencies between the use cases. It is also useful for showing which system functions are performed for which actor.

Requirement models capture only the functionality that the end-user needs from the system. Other requirements, such as non-functional requirements or detailed functional requirements, are not captured in standard requirement modelling diagrams. The simplest way is to describe those in plain textual format and include references to use cases, their scenarios, etc. Another approach is to create specific requirements models: for example, introduce stereotypes for each important system requirement type, with tags carrying requirement-specific information, and define types of links for tracing requirements, such as derive, satisfy and support. Another aspect on which system analysts work in some projects is the definition of data structure. It can be done using conventional requirement modelling diagrams. If necessary, object diagrams can also be used for defining samples for explanation or testing of the data structure defined in class diagrams. Since the focus here is on data structure, class operations compartments can be hidden in the diagram (Figure 6).

[Figure 6 (diagram labels): Librarian, inventory management system; identify supplier, get issued product, get issued loan details, confirm return, overdue / on time, penalty to supplier, more items / no more items.]
Figure 6: Register product return
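As a minimal sketch (not from the paper) of how a use-case model can answer the question "which system functions are performed for which actor", the actors and use cases of Figure 5 can be held as plain data. The actor-to-use-case associations below are assumptions read off the figure labels, for illustration only:

```python
# Hypothetical actor associations for the Figure 5 use cases (illustrative).
USE_CASES = {
    "Find product":            {"Supplier"},
    "Make request":            {"Supplier"},
    "Make the reservation":    {"Supplier", "Inventory System"},
    "Review supplier profile": {"Inventory System"},
    "Register issued product": {"Inventory System", "Manufacturer"},
    "Register product return": {"Inventory System"},
    "Penalize supplier":       {"Inventory System"},
}

def functions_for(actor):
    """List the system functions performed for a given actor."""
    return sorted(name for name, actors in USE_CASES.items() if actor in actors)

print(functions_for("Supplier"))
# ['Find product', 'Make request', 'Make the reservation']
```

Even such a flat table makes the use-case diagram's central query mechanical, which is what the diagram itself conveys graphically.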
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 170
Comparing to conceptual analysis, more elements are used here, such as attributes and association end specifications, enumerations, and generalization. Although such a model is considered to be part of design, in practice it is quite often created and maintained by system analysts. For data-centric applications, it is very important to draw data-flow diagrams showing information flows between different classifiers; e.g. a system-context diagram indicates information flows from the system to outside-world entities, i.e. actors or external systems that need to be integrated.

[Figure 7 (diagram labels): Supplier, Inventory system, Manufacturer; issue, reservation, products specification, request, item, notification, search, product info, system specification, issue product, product issue, supplier category.]
Figure 7: Information flow model

The previous requirements modelling artifact for which the system analyst might be responsible is user interface prototypes. The prototype itself can theoretically be mapped to a UML Composite Structure diagram. However, when focusing on separate screen prototypes, people sometimes lose track of which screens can be used by each actor, and of the possibilities to navigate from each screen to the other screens. For capturing this information, we can create a GUI navigation map, which is shown in Figure 7. In Figure 7, we use a state diagram, where each state represents the screen the user is currently in, and transition triggers represent GUI events, such as a mouse double-click or clicking on some button. Using this requirement model, system developers create effective software for inventory control and management. The user interface diagram model is shown in Figure 8.

[Figure 8 (screen labels): Home, Login, Issue details, Issuing detail, Supplier Profile, Entry the Supplier, Reservation details, Browse, Cancel reservation, Product Browse, Search, Multiple Matches, Select category/Titles, Refresh, On match, Get detail, Make reservation, Product Detail.]
Figure 8: User interface diagram model

Finally, we emphasize that the requirements analysis work should be iterative and incremental. Also, the ordering of modelling tasks might differ depending on the taken approach, or some steps might be omitted.

2.5 System design

After the successful completion of the system requirement and modelling phase, the draft (raw) requirements may be provided to the design team. The design team checks the validity of these draft requirements and starts to design the system or software model. Basically, system design is the process of designing, developing and implementing the proposed system as per the requirements obtained during the analysis of the existing system. The main objective of system design is to develop the best possible design as per the requirements from the users and the working environment for operating the information system. It is the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. Systems design is therefore the process of defining and developing systems to satisfy specified requirements of the user. Object-oriented analysis and design methods are becoming the most widely used methods for computer systems design. The UML has become the standard language in object-oriented analysis and design. It is widely used for modelling software systems and is increasingly used for designing non-software systems and organizations.

After designing the system model, the designer evaluates the efficiency of the design model. If any modification remains to be made in the model, the designer again checks the validity of the requirements and asks for corrections with comments. The process does not stop until a clear-cut clarification is received by the design team. This step is very important because, according to the software engineering approach, the design bridges the gap between requirement analysis and coding of the final software.

The paper discusses the implementation of requirement modelling for various requirements analysis purposes and the mapping of conventional requirements artifacts onto system elements. We have also presented some modelling aspects which are necessary for ensuring that requirements elements mapped to the same UML element can be differentiated. One can also find criticism of using UML as a requirements specification language; most of the issues can be solved using a UML tool with rich
possibilities for modelling environment customization and extensions [18]. On the other hand, there are also suggestions to use more UML for requirements analysis and visualization [20]. Multiple authors provide numerous papers on more detailed approaches to customizing the Unified Modelling Language for specific requirements modelling needs, such as analyzing scenarios, modelling user interface prototypes, and refining requirements [21, 22]. Some researchers also suggest that UML can be specialized for the early-phase requirements gathering process, but the proposed framework emphasizes that early-phase modelling should focus on the same types of artifacts with less detail.

4. Conclusions

In this paper, we have discussed how the major requirements artifacts described in the requirements engineering literature can easily be mapped to elements of UML. Also, we have depicted a conceptual framework for requirements modelling with illustrated examples for an inventory control and management system. Our future research work will focus on more detailed management for the requirements modelling framework and on the development of different demo versions for different management systems.

References
[1] D. Pandey, U. Suman, A. K. Ramani, "Social-Organizational Participation difficulties in Requirement Engineering Process: A Study", National Conference on ETSE & IT, Gwalior Engineering College, Gwalior, 2009.
[2] D. Pandey, U. Suman, A. K. Ramani, "Design and Development of Requirements Specification Documents for Making Quality Software Products", National Conference on ICIS, D.P. Vipra College, Bilaspur, 2009.
[3] D. Pandey, U. Suman, A. K. Ramani, "An Effective Requirement Engineering Process Model for Software Development and Requirements Management", IEEE Xplore, 2010, pp. 287-291.
[4] M. Broy, I. Kruger, A. Pretschner, and C. Salzmann, "Engineering Automotive Software", Proceedings of the IEEE, 95(2): 356-373, February 2007.
[5] D. Rubinstein, "Standish Group Report: There's Less Development Chaos Today", SD Times, March 1, 2007.
[6] J. Aranda, S. Easterbrook, G. Wilson, "Requirements in the wild: How small companies do it", 15th IEEE International Requirements Engineering Conference (RE 2007), pp. 39-48.
[7] M. Panis, B. Pokrzywa, "Deploying a System-wide Requirements Process within a Commercial Engineering Organization", 15th IEEE International Requirements Engineering Conference (RE 2007), pp. 295-300.
[8] Object Management Group, Unified Modelling Language: Superstructure. Formal Specification, 2007.
[9] G. Engels, R. Heckel, and S. Sauer, "UML - A Universal Modelling Language?", in M. Nielsen, D. Simpson (Eds.): ICATPN 2000, LNCS 1825, pp. 24-38, 2000.
[10] I. Jacobson, Object-Oriented Software Engineering. Addison-Wesley Professional, 1992.
[11] K. Wiegers, Software Requirements, 2nd edition, Microsoft Press, 2005.
[12] Object Management Group, Business Process Modelling Notation Specification. Final Adopted Specification, version 1.0, 2006.
[13] O. Noran, "UML vs. IDEF: An Ontology-oriented Comparative Study in View of Business Modelling", Proceedings of the International Conference on Enterprise Information Systems, ICEIS 2004, Porto, 2004.
[14] P. P.-S. Chen, "The entity-relationship model: toward a unified view of data", ACM Transactions on Database Systems (TODS), vol. 1 (1), 1976.
[15] A. van Lamsweerde, "Goal-Oriented Requirements Engineering: A Guided Tour", RE'01 International Joint Conference on Requirements Engineering, Toronto, 2001, pp. 249-263.
[16] H.-E. Eriksson, M. Penker, Business Modelling With UML: Business Patterns at Work. Wiley, 2000.
[17] Object Management Group, Systems Modelling Language. Formal Specification, version 1.0, 2007.
[18] E. Gottesdiener, The Software Requirements Memory Jogger: A Pocket Guide to Help Software and Business Teams Develop and Manage Requirements. GOAL/QPC, 2005.
[19] M. Glinz, "Problems and Deficiencies of UML as a Requirements Specification Language", 10th International Workshop on Software Specification and Design, 2000, pp. 11-22.
[20] S. Konrad, H. Goldsby, K. Lopez, "Visualizing Requirements in UML Models", International Workshop REV'06: Requirements Engineering Visualization, 2006.
[21] H. Behrens, "Requirements Analysis and Prototyping using Scenarios and Statecharts", Proceedings of ICSE 2002 Workshop: Scenarios and State Machines: Models, Algorithms, and Tools, 2002.
[22] P. Pinheiro da Silva, "The Unified Modelling Language for Interactive Applications", in A. Evans, S. Kent, B. Selic (Eds.): UML 2000 - The Unified Modelling Language: Advancing the Standard, pp. 117-132, Springer Verlag, 2000.

Dhirendra Pandey is a member of IEEE and the IEEE Computer Society. He is working at Babasaheb Bhimrao Ambedkar University, Lucknow, as Assistant Professor in the Department of Information Technology. He received his MPhil degree in Computer Science from Madurai Kamaraj University, Madurai, Tamil Nadu, India. Presently, he is pursuing a PhD in Computer Science at the School of Computer Science & Information Technology, Devi Ahilya University, Indore (MP).

Dr. Ugrasen Suman received his PhD degree from the School of Computer Science & Information Technology (SCSIT), DAVV, Indore. Presently, he is a Reader in SCSIT, Devi Ahilya University, Indore (MP). Dr. Suman is engaged in executing different research projects in SCSIT. He has authored more than 30 research papers.

Professor (Dr.) A. K. Ramani received his ME and PhD degrees from Devi Ahilya Vishwavidyalaya, Indore (M.P.). Dr. Ramani has authored more than 100 research papers and is executing several major research projects. Presently, he is the Head of the Department in SCSIT, Devi Ahilya University, Indore (MP).
4.1 Measures

To compute 3D shape indexes, we either compute 3D measures directly on the 3D model or transform 2D measures. The most important 3D measures are surface area and volume. With a 3D polygonal model representation, we can compute these measures [5] as follows:

area = \frac{1}{2} \sum_{i=1}^{N} \left\| (V_{i,1} - V_{i,0}) \times (V_{i,2} - V_{i,0}) \right\|   (1)

volume = \frac{1}{6} \sum_{i=1}^{N} \left( -V_{i,2}^{x} V_{i,1}^{y} V_{i,0}^{z} + V_{i,1}^{x} V_{i,2}^{y} V_{i,0}^{z} + V_{i,2}^{x} V_{i,0}^{y} V_{i,1}^{z} - V_{i,0}^{x} V_{i,2}^{y} V_{i,1}^{z} - V_{i,1}^{x} V_{i,0}^{y} V_{i,2}^{z} + V_{i,0}^{x} V_{i,1}^{y} V_{i,2}^{z} \right)   (2)

V_{i,j} is a vector containing the coordinates of the j-th vertex of triangle i.

These measures are used directly for calculating 3D shape indexes, without transforming 2D measures. For other 3D measures, the 2D measures are used; for example, to calculate the radii, we use the distance between the centroid and a point on the surface instead of the distance between the centroid and a point on the perimeter. There are other measures which are dimensionless shape indexes, like the number of holes. In practice, we used the following measures: volume, surface area, Ferret diameter, small and large radii, and the main axes and planes. In fact, the principal component analysis method is employed, and three sets of main axes and planes are obtained. The Ferret diameter is the longest distance between two contour points of the 3D object. These measures are used as semantic concepts in the ontology and allow the spatial relationships to be defined. We consider each measure to be an entity.

4.2 Shape indexes

From these basic measures, one can calculate the 3D shape indexes. Surface area (1) and volume (2) may be used for calculating 3D shape indexes like VC (3) and AC (4), which can be considered as basic descriptors of shape.

VC = \frac{V}{V(CH)}   (3)

AC = \frac{A(CH)}{A}   (4)

V and A are respectively the 3D model volume and surface area; CH is the convex hull, that is, the minimum enveloping boundary.

AC and VC (called the Area convexity index, Crumpliness [27] or Rectangularity, and the Volume convexity index) are easy to compute and very robust with respect to noise [1]. Moreover, these shape indexes can distinguish between shapes like angular and rounded objects [23]. The Area convexity index and Volume convexity index tell us about the shape of the object, but it is difficult to identify any shape from these 3D shape indexes alone. Therefore, it is necessary to use a set of 3D shape indexes and combine them to retrieve the 3D model. These 3D shape indexes should be quick to calculate and their results easy to interpret. Basically, shape indexes are of two types: compactness-based and boundary-based.

Various compactness measures are used. An early attempt to develop a compactness index is based on the values of perimeter and area. These 2D measures allow calculating the Isoperimetric shape index as follows:

\frac{4\pi S}{P^{2}}   (5)

P and S are respectively the perimeter and surface of the shape. This 2D shape index, defined between 0 and 1, is based on the surface-to-perimeter ratio and reaches unity for a disk. We can also calculate the 2D circularity shape index as follows:

1 - \frac{4\pi S}{P^{2}}   (6)

In 3D models, the perimeter becomes the surface area, and the surface becomes the volume. A ratio between surface area and volume is commonly used in the literature to compute the compactness of 3D shapes. With this ratio, an IsoSurfacic shape index can be obtained as follows:

Is = \frac{(6\sqrt{\pi}\, V)^{1/3}}{A^{1/2}}   (7)

V and A are respectively the volume and surface area of the 3D model. The IsoSurfacic shape index is a compactness indicator which describes the form based on the surface-area-to-volume ratio. Sphericity is another specific shape index for indicating the compactness of a shape: it is a measure of how spherical an object is. It can also be calculated from the surface area and volume 3D measures (8). The Sphericity (S) is maximal and equal to one for a sphere.

S = \frac{\pi^{1/3} (6V)^{2/3}}{A}   (8)

The Sphericity shape index (S) is very fast to compute. However, it is unsuited as a parameter of elongation, the latter being defined as the quality of being elongated. The elongation, in this paper, is boundary-based and can be measured as the ratio of the smallest radius to the greatest radius (9), or as the ratio of the major to the minor axis, called Eccentricity.

E = \frac{R_{min}}{R_{max}}   (9)

The ratio of the maximum Ferret diameter to the minimum Ferret diameter is also used as an elongation parameter. We have included two aspect ratios of the bounding box for a 3D model in our system due to the
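The mesh measures of Eqs. (1)-(2) and the sphericity of Eq. (8) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code; it assumes a closed mesh whose triangles are consistently oriented outward, so that the signed tetrahedron volumes in (2) sum to the enclosed volume:

```python
import math

def sub(a, b):   return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def cross(a, b): return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])
def dot(a, b):   return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def surface_area(triangles):
    """Eq. (1): half the sum of the cross-product magnitudes over all triangles."""
    return 0.5 * sum(math.sqrt(dot(c, c))
                     for v0, v1, v2 in triangles
                     for c in [cross(sub(v1, v0), sub(v2, v0))])

def volume(triangles):
    """Eq. (2): sum of signed tetrahedron volumes for a closed, outward-oriented mesh."""
    return sum(dot(v0, cross(v1, v2)) for v0, v1, v2 in triangles) / 6.0

def sphericity(V, A):
    """Eq. (8): pi^(1/3) * (6V)^(2/3) / A; equals 1 for a sphere."""
    return math.pi ** (1 / 3) * (6 * V) ** (2 / 3) / A

# A unit right tetrahedron as a closed mesh of four outward-facing triangles.
a, b, c, d = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
mesh = [(a, c, b), (a, b, d), (a, d, c), (b, c, d)]
print(volume(mesh))        # 1/6
print(surface_area(mesh))  # 3/2 + sqrt(3)/2 ≈ 2.366
print(sphericity(volume(mesh), surface_area(mesh)))  # ≈ 0.62, well below a sphere's 1.0
```

The tetrahedron example also shows why such indexes are useful as descriptors: a single dimensionless number already separates compact shapes from angular ones.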
5. Clustering-based semantic

Although the calculated shape indexes provide global information on the 3D model and include compactness and elongation indicators, the problems connected with 3D model retrieval are still not resolved. The first problem regards the 3D shape indexes themselves: they are insufficient to describe the 3D model in a generic 3D database, although they are relevant. Hence the necessity to combine several 3D shape indexes and to augment our knowledge base with semantic concepts using, in our case, the ontology and spatial relationships. The second problem is caused by the semantic gap between the lower- and higher-level features. To reduce this semantic gap, we use machine learning methods to associate shape indexes with semantic concepts, and an ontology to define these semantic concepts, as shown in Fig. 4. In this paper, 3D shape indexes are used to represent visual concepts [22] of a 3D object.

6. Ontology

An ontology is a set of concepts and useful relations to describe a domain, and thus makes the implicit semantics of models more explicit. One advantage of shape indexes is their flexibility to create other shape indexes for each model to be indexed in a domain-specific way. In this paper, an ontology is employed to allow the user to query a generic 3D collection, where no domain-specific knowledge can be employed, using the 3D model as the query. The ontology has been used to organize semantic concepts that are defined by the k-means algorithm (e.g. sphericity, elongation, convexity...). It includes other concepts such as semantic entities (e.g. lines, points, surface, and plane), a set of spatial relations and some axioms (transitivity, reflexivity, symmetry). The proposed ontology is represented in the Web Ontology Language OWL [19], the W3C recommended standard for ontologies with precise formal semantics. As shown in Fig. 5, the ontology is structured into two parts: the first part contains shape index concepts and regroups the descriptors into classes
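As a toy, stdlib-only sketch of the kind of reasoning such an ontology supports (not the authors' code), the subclass links can be held in a dictionary and queried through the transitive closure of subClassOf, mirroring the transitivity axiom mentioned above. The class names echo the paper's OWL fragment (ASCII-ized); the "Sphericite" leaf is a hypothetical addition:

```python
# Illustrative concept hierarchy: measures and shape-index concepts under a
# 3D-model root, as in the paper's OWL fragment. "Sphericite" is hypothetical.
SUBCLASS_OF = {
    "Mesures": "Modele3D",
    "IndicesDeForme3D": "Modele3D",
    "Points": "Mesures",
    "Lignes": "Mesures",
    "Sphericite": "IndicesDeForme3D",
}

def is_subclass(cls, ancestor):
    """True if cls equals ancestor or reaches it by following subClassOf links
    (i.e. the transitive, reflexive closure of the relation)."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

print(is_subclass("Points", "Modele3D"))  # True: Points -> Mesures -> Modele3D
```

In a real system the same closure is computed by an OWL reasoner rather than by hand-written code; the point here is only how the transitivity axiom turns flat assertions into an answerable hierarchy.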
Fig. 6 The partial hierarchy of domain concepts of geometry

The structure of the ontology is represented in OWL as follows:

<owl:Class rdf:about="http://www.exemple/ontologie#Mesures">
  <rdfs:subClassOf>
    <owl:Class rdf:about="http://www.exemple/ontologie#Modle3D"/>
  </rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:about="http://www.exemple/ontologie#IndicesDeForme3D">
  <rdfs:subClassOf>
    <owl:Class rdf:about="http://www.exemple/ontologie#Modle3D"/>
  </rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:about="http://www.exemple/ontologie#Points">
  <rdfs:subClassOf>
    <owl:Class rdf:about="http://www.exemple/ontologie#Mesures"/>
  </rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:about="http://www.exemple/ontologie#Lignes">
  <rdfs:subClassOf>
    <owl:Class rdf:about="http://www.exemple/ontologie#Mesures"/>
  </rdfs:subClassOf>
</owl:Class>

7. Spatial relationships

The shape indexes calculated so far characterize the shape globally. Without segmenting the model, we compute local characteristics using spatial relationships, which are usually defined according to the location of a measure in the 3D model. In our method, spatial relationships are defined by measures or entities; they can increase the quality of detection and recognition of the model content and can disambiguate among models of similar appearance, for example by taking the meaning of orientation into account and respecting the distances. Therefore, other concepts are added to the 3D shape indexes to describe the position, distances and orientation of an entity in the 3D model. Various entities need spatial relationships in order to represent the 3D model's content correctly. In this paper, the following relationships are described (Fig. 7):
- Metric (distance, area...)
- Orientation (near of, left of...)
- Topology (inclusion, adjacent...).

Fig. 7: Partial hierarchy of relationships.

The notion of position, distance and orientation in spatial relations depends on the notion of the frame of reference. The object centroid is used as the frame of
reference to compute measures and to respect two properties: rotation and translation. The method then does not require preprocessing for these properties. The bounding box centroid is used as the frame of reference to describe the concepts of position, distance and orientation. Therefore, to calculate the "centered" position, we compute the Euclidean distance between the center of the 3D model and the bounding box centroid. Entities such as the 3D model centroid, lines (e.g. radii, diameter and axes), planes and the minimum bounding box are used to calculate distances in order to provide spatial information. The distances can be computed from point to point, line to line, point to line, point to plane and line to plane. In practice, we used the following distances: the distance between radii, the distance between radii and diameter, the distance between the two centers (3D model centroid and bounding box centroid), and A3, D1, D3, D4 introduced in [4].

To describe the distance relationship between two 3D models, the following distances are usually used: very near, near, far, far away. However, such distance relationships alone are not sufficient to represent the 3D model content, as they ignore the topological and directional relationships. To get an idea about the overall direction of the entities in the 3D model, the main axes can be used. In fact, the main axes of the 3D model can be calculated employing the principal component analysis method, and the value of their direction is given by the angle with the axes of the bounding box. Examples are the relationships RightOf, LeftOf, Above, Below...

We are also interested in topological relationships among entities, which concern how objects interconnect. In this paper, we adopt the topological relationships shown in Table 1. The RCC-8 [24][25] relations can be used for taking spatial relations into account. RCC (Region Connection Calculus) is a logic-based formalism to symbolically represent and reason with topological properties of objects [14]. Topological reasoning can be implemented based on the Pellet engine [21].

8. Method for Classification Database

In our content-based indexing and retrieval system for 3D models, each model of the database is represented by two descriptors considered signatures of the 3D model: the semantic concept and the 3D shape indexes. To increase the identification rate and decrease the time to search for items, we have developed and implemented a classification by applying the k-means algorithm in the 3D shape index space. K-means is an efficient and very simple classification approach. Each model of the database is clustered by the k-means algorithm using the Euclidean distance as a similarity measure. Classification based on 3D shape indexes allows a global classification of models, and it can detect major differences between shapes. Fig. 8 shows some classes of objects.

Fig. 8: 3D models of some Clusters

3D models are classified into clusters regardless of their spatial positions and according to the similarity of their 3D shape indexes.
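The clustering step described above can be sketched as follows. This is a stdlib-only illustration, not the authors' implementation, and the two-dimensional (sphericity, elongation) vectors below are made up for the example:

```python
import math, random

def kmeans(points, k, iters=100, seed=0):
    """Plain k-means with Euclidean distance over shape-index vectors."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # initialize centers from the data
    for _ in range(iters):
        # Assign every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if empty).
        new_centers = [tuple(sum(x) / len(c) for x in zip(*c)) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:           # converged
            break
        centers = new_centers
    return centers, clusters

# Hypothetical (sphericity, elongation) vectors: three compact models
# followed by three elongated ones.
models = [(0.95, 0.90), (0.90, 0.85), (0.92, 0.88),
          (0.30, 0.10), (0.25, 0.15), (0.20, 0.12)]
centers, clusters = kmeans(models, 2)
```

With well-separated index vectors like these, the two clusters recover the compact/elongated split, which is exactly the "major differences between shapes" the global classification is meant to detect.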
For example, the query "Show all 3D model URLs of a given cluster with a high sphericity and variance" is written in the SPARQL language as:

String jungle = jenaTools.findBasicNameSpace(ONT_MODELE);
String prolog1 = "PREFIX jungle: <" + jungle + ">";
String qr = prolog1 + NL + "select * where " +
  "{" +
  "?3Dmodel jungle:hasCluster ?hasCluster FILTER (?hasCluster = " + Cluster + ") ." +
  "?3Dmodel jungle:hasSpherecity '" + sphericity + "' ." +
  "?3Dmodel jungle:hasVarianceSurfacique '" + variance + "' ." +
  "?3Dmodel jungle:hasURL ?hasURL " +
  " }";

collection. During this process, shape indexes are computed, and we can directly retrieve models, as shown in Fig. 11 (a) and Fig. 12 (c), using our descriptor or the Area Volume Ratio descriptor [5], which is not efficient. The 12 most similar models are extracted and returned to the user as 2D images. To visualize the 3D models in 3D space, the user clicks the button or the image.

Fig. 11: (a) Models found by the Area Volume Ratio descriptor without introducing the semantic descriptor. (b) Models found with the Area Volume Ratio descriptor introducing our semantic descriptor.

aR = \sum_{i} \frac{r(SI_i)}{n(SI_i)}, \qquad aP = \sum_{i} \frac{r(SI_i)}{r(SI_i) + w(SI_i)}   (12)

n(SI) is the number of models labeled by SI; r(SI) is the number of models initially labeled by SI that the system returned with the same SI; w(SI) is the number of models not labeled by SI but found by the system with the same SI. The F-measure F is the weighted harmonic mean of precision and recall:

F = \frac{2\, aR\, aP}{aR + aP}   (13)

When using the average recall (aR) and precision (aP), it is important to specify the number of shape indexes for finding at least one model.

In order to retrieve 3D models by introducing the semantic descriptor (Fig. 11 (b) and Fig. 12 (d)), the query is labeled with a semantic concept before the search happens, by associating the 3D shape low-level features with the high-level semantics of the models.

Fig. 12: (c) Models found with our descriptor without introducing semantic descriptors. (d) Models found with our descriptor introducing the semantic descriptor.

For the evaluation of the performance of our system based on the shape index and semantic concept descriptors, we
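The evaluation measures of Eqs. (12)-(13) can be sketched as below. The per-concept counts are illustrative, and dividing by the number of concepts is an assumption: the normalization in (12) is not fully legible in the source, but the text calls aR and aP averages.

```python
def evaluate(per_concept):
    """Average recall aR, average precision aP and F-measure, after Eqs. (12)-(13).
    per_concept: one (r, n, w) tuple per semantic concept SI, where n models carry
    the label, the system returns r of them correctly and w false positives.
    Averaging over concepts is an assumption (see lead-in)."""
    aR = sum(r / n for r, n, w in per_concept) / len(per_concept)
    aP = sum(r / (r + w) for r, n, w in per_concept) / len(per_concept)
    F = 2 * aR * aP / (aR + aP)
    return aR, aP, F

# Hypothetical counts for two concepts, e.g. "high sphericity" and "elongated".
aR, aP, F = evaluate([(3, 4, 1), (4, 5, 1)])
print(aR, aP, F)  # all three ≈ 0.775
```

Because F is the harmonic mean, it equals aR and aP whenever the two agree, and it drops quickly when either recall or precision is poor.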
2 Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
3 Department of Mathematical Sciences, Sharif University of Technology, Tehran, Iran
researchers found out that models use CAS and MAS (Multi-Agent System) approaches to model the nonlinear dynamic interactions that were missing in the previous linear models [8]. However, it is desirable to utilize a thought structure that makes the modeling and simulation of complex systems more accurate and produces a high-quality software simulation.

In this paper, first the necessity of a suitable complex systems modeling worldview is explained and illustrated by Capra's conceptual framework. Then a thought structure for complex systems modeling with regard to Popper's Three Worlds is proposed. The first world is about the complex systems worldview, the second world is about individual and social awareness, and the third world is an artifact, namely a methodology for simulator development.

2. Complex Systems Modeling Worldview

In complex systems, global behavior emerges from a high number of interactions between components [3][4]. As the number of interactions is very high, emergent behavior appears. Therefore, for understanding and modeling complex systems, a special worldview is required. This worldview is the base of some methodologies, such as CommonKADS, and it precedes theory [19].

Overall, the methods that have been used for systems modeling during the past decades can be divided into two main approaches:

1) Model-oriented approach: It is based on the methods of traditional systems thinking. The worldview of this approach is based on reductionism. Reductionism means breaking a problem into smaller ones, solving each one separately and then combining the answers to get the solution of the main problem. In other words, for understanding the main system, we divide it into sub-systems, and these can be further divided into smaller systems until we get to systems that are knowable.

2) Data-oriented approach: The main idea of this approach is that complex systems cannot be understood with the reductionist worldview. As the behavior of the system arises from the bottom up, understanding it requires a new holistic worldview. In this worldview, emergent behavior becomes meaningful. It is according to this worldview that complex systems theories, cognitive theories, and other theories based on the new holistic thinking are used in complex systems modeling.

2.1 Capra's Conceptual Framework as a Worldview

Capra's conceptual framework is based on new interpretations and definitions of cognition. Cognition is the process of knowing in life: knowing how and what capabilities are used for survival. With this definition, even the smallest living organisms, such as cells, are cognitive phenomena using cognition for survival. Defining cognition on a biological basis enables us to use cognitive concepts in a wide range to explain the behavior of living organisms. We can find network patterns everywhere, from the smallest cognitive living organisms such as cells to organizations and human societies. Thus, the network is a common pattern for life [10].

One of the cognitive theories based on the biological view is the Santiago theory [10]. According to this theory, cognition is synonymous with the process of life. The organizing activities of living systems at all levels of life are cognitive. These activities include interactions among living organisms, such as plants, animals, or human beings, and their environment. Thus life and cognition are inseparable, as though mental activity were immanent in matter. The Santiago theory expands the cognitive concept so that it involves the entire process of life, including perception, emotion, and behavior. In this theory, cognition is not just for human beings with a brain and a nervous system; rather, it can be attributed to every living organism, from cells to social organizations [10].

Capra has presented a unique framework for understanding biological and social phenomena through four perspectives. Three of these four perspectives are about life, and the fourth one is meaning (Fig. 1).

[Fig. 1 labels: Process, Meaning, Pattern, Structure.]
Fig. 1 Four perspectives of Capra's Conceptual Framework.

The first perspective of Capra's conceptual framework is pattern, which includes the various relations among system components. The organization pattern of a living system defines the relation types among the system components, which determine the basic features of the system. Structure, the second perspective, is defined as the
[Figure omitted: Popper's worlds — World 3 (artifact: computational models, software architecture, theoretical basics) and World 2 (social, individual).]

codes. Therefore, this layer is the provider of a formal language for simulator software. Computational models are chosen based on theoretical basics and software architecture (Fig. 4). For example, we can use soft computing such as genetic algorithms, neural networks, and fuzzy computations in this layer.

3.2 Development Methodology as an Artifact

Two worlds out of Popper's Three Worlds for complex systems modeling have been described so far. The first world is the complex systems modeling worldview that we redefined in the three perspectives of agent, network, and process, based on Capra's conceptual framework. The second world is the individual and social awareness of agents that is essential for their coordination.

We call the third world of Popper's Three Worlds the artifact (Fig. 3). This world is objective knowledge and is falsifiable; that is, it is true as long as we cannot prove its falseness. The artifact is a methodology in our proposed thought structure. This methodology determines what concepts, components, and methods should be used for complex systems modeling. In other words, it illustrates and confirms the effect of the meaning perspective (the fourth perspective of Capra's Conceptual Framework) in the form of some general principles. In a way, meaning is the interpretation of simulation results: we can interpret the results of a simulation according to a given meaning. Overall, this methodology determines general principles for the software architecture. For example, what principles and structures should be used for network design? What features are more important for agent design and definition? What kinds of processes are suitable for modeling a given complex system?

The principles obtained from the results of modeling and simulation can be used in the design of products and real applications. That is, these principles are used in the design of the agent, network, and process in order to create a given meaning. They can be reviewed and revised after being used in real applications.

4. Conclusions

Complex systems modeling is one of the challenges and necessities facing today's researchers, and it demands a suitable thought structure. Many researchers consider a living system as a complex system that adapts to its surrounding environment for survival and evolution. Consequently, cognitive theories and thought frameworks suggested for describing living systems can be utilized for understanding complex systems. Capra's Conceptual Framework is based on modern cognitive theories; therefore, we have used its modified version as the proposed thought structure worldview. This thought structure is based on Popper's Three Worlds. The first world is the complex systems modeling worldview that we have redefined in the three perspectives of agent, network, and process. The second world is individual and social awareness, which concerns individual and shared situation awareness. The third world is an artifact that explains a methodology for complex systems modeling. In other words, the artifact determines general principles and approaches for the software architecture.

References
[1] C. Gros, Complex and Adaptive Dynamical Systems: A Primer, Springer-Verlag Berlin Heidelberg, 2008.
[2] A. Yang, and Y. Shan, Intelligent Complex Adaptive Systems, IGI Publishing, 2008.
[3] J. H. Miller, and S. E. Page, Complex Adaptive Systems: An Introduction to Computational Models of Social Life, Princeton University Press, 2007.
[4] J. Clymer, Simulation Based Engineering of Complex Systems, Wiley-Interscience, 2009.
[5] C. F. Kurtz, and D. J. Snowden, "The New Dynamics of Strategy: Sense-making in a Complex and Complicated World", IBM Systems Journal, Vol. 42, No. 3, 2003, pp. 462-483.
[6] C. A. Aumann, "A Methodology for Developing Simulation Models of Complex Systems", Ecological Modelling, Vol. 202, No. 3-4, 2007, pp. 385-396.
[7] M. A. Janssen, and W. J. M. Martens, "Modeling Malaria as a Complex Adaptive System", Artificial Life, Vol. 3, No. 3, 1997, pp. 213-236.
[8] A. Yang, "A Networked Multi-Agent Combat Model: Emergence Explained", Ph.D. thesis, University of New South Wales, Australian Defence Force Academy, 2006.
[9] C. Joslyn, and L. Rocha, "Towards Semiotic Agent-Based Models of Socio-Technical Organizations", AI, Simulation and Planning in High Autonomy Systems (AIS 2000) Conference, Tucson, Arizona, 2000, pp. 70-79.
[10] F. Capra, The Hidden Connections: Integrating the Biological, Cognitive, and Social Dimensions of Life Into a Science of Sustainability, Doubleday, 2002.
[11] N. Gilbert, and K. G. Troitzsch, Simulation for the Social Scientist, Open University Press, McGraw-Hill Education, Second Edition, 2005.
[12] N. Cannata, F. Corradini, E. Merelli, A. Omicini, and A. Ricci, "An Agent-oriented Conceptual Framework for Biological Systems Simulation", Transactions on Computational Systems Biology, Vol. 3, 2005, pp. 105-122.
[13] A. Ilachinski, Artificial War: Multiagent-Based Simulation of Combat, Singapore, World Scientific Publishing Company, 2004.
[14] M. A. Niazi, and A. Hussain, "A Novel Agent-Based Simulation Framework for Sensing in Complex Adaptive Environments", IEEE Sensors Journal, Vol. 11, No. 2, 2010, pp. 404-412.
2 Department of Applied Mathematics and Informatics, University of Cadi Ayad, Faculty of Science and Techniques, Marrakech, Morocco

1. Introduction

Many methods have been used to estimate canopy evapotranspiration over regions using standard climate data. The Priestley-Taylor approximation is one of these, based on physical arguments about processes in the whole of the turbulent planetary boundary layer, and their arguments
2. Study area and data collection

2.1 Site description

The study site was located in the 275-hectare Agdal olive (Olea europaea L.) orchard on the southern side of Marrakech City, Morocco (31.601 N; 07.974 W). It is characterized by low and irregular rainfall (annual average of about 240 mm, though 263.4 mm was collected in 2003). The climate is typically Mediterranean semi-arid; precipitation falls mainly during winter and spring, from November to April. The atmosphere is very dry, with an average humidity of 56%, and the evaporative demand is very high (1600 mm per year), greatly exceeding the annual rainfall. The orchard was periodically surface irrigated through level-basin flood irrigation, with water supplies of about 100 mm at each irrigation event. There were approximately 3 irrigation events during summer 2003. Each tree occupied about 45 m2 and was bordered by a small earthen levy (about 30 cm) that retained irrigation water (Williams et al., 2004). Plant spacing was about 6.5 x 6.5 m; the trees had an average leaf area index (LAI) of 3. Mean tree height was 6 m and ground cover was 55% (Ezzahar, 2007).

2.2 Measurements

Measurements were acquired at a sampling frequency of 20 Hz and passed through a low-pass filter to compute 30-min flux averages. Intensive data were collected at the Agdal site. Vertical fluxes of heat and water vapor at 9.2 m height were recorded over twelve months of 2003, measured by an Eddy-Covariance (EC) system (Ezzahar et al., 2007). The resulting dataset of sensible and latent heat fluxes was available for the 2003 growing season, with missing data for a few days due to power supply troubles. Almost 6247 hourly daytime observations, taken every day throughout the year 2003 without any exclusion related to season or climatic conditions, were used to run and evaluate the TSEB model output.

A 3D sonic anemometer (CSAT3, Campbell Scientific, Logan, UT) measured the fluctuations in the wind velocity components and temperature. An open-path infrared gas analyzer (LI7500, LiCor, Inc., Lincoln, NE) measured concentrations of water vapour. The wind speed and concentration measurements were made at 20 Hz on CR23X dataloggers (Campbell Scientific, Logan, UT) and on-site portable computers to enable the storage of large raw data files. Air temperature and humidity were measured at 8.8 and 3.7 m heights on the tower with Vaisala HMP45C probes. Total shortwave irradiance was measured at 9.25 m height with a BF2 Delta T radiometer. Net radiation was measured with a Kipp and Zonen CNR1 net radiometer placed over the olive canopy at 8 m height. Soil temperature was recorded at 5 cm depth at two locations approximately 30 m from the tower. Three heat flux plates continuously monitored changes in soil heat storage at the tower site. In addition, five point measurements of soil moisture variables were located throughout the site. Each point contained a pair of steel rods for time domain reflectometry (TDR) measurements at 40, 30, 20, 10 and 5 cm depths to estimate volumetric water content. Olive transpiration was measured by the sap flow method following the procedure of Williams et al. (2003). Soil evaporation was computed as the difference between evapotranspiration measured by the eddy correlation system and transpiration measured by the sap flow method.

3. Priestley-Taylor transpiration in the TSEB Model

The Priestley-Taylor equation is only an initial approximation of the canopy latent heat simulated by the TSEB Model. TSEB is based on energy balance closure using surface radiometric temperature, vegetation parameters and climatic data. TSEB outputs the surface turbulent fluxes and the temperatures of canopy and soil. The version implemented in this study basically follows what is described in Appendix A as the parallel resistance network. As such, the model implemented is described in detail in (Norman et al. 1995; Kustas and Norman 1999). The canopy latent heat LEc is given by the Priestley-Taylor approximation (Priestley and Taylor, 1972):

LEc = aPT . fg . [D / (D + g)] . Rn,c    (1)

where aPT is the Priestley-Taylor constant, which is initially set to 1.26 (Priestley and Taylor, 1972; Norman et al. 1995; Agam et al. 2010), fg is the fraction of the LAI that is green, D is the slope of the saturation vapour pressure versus temperature curve, g is the psychrometric constant (e.g. 0.066 kPa C-1), and Rn,c is the net radiation of the canopy. If no information is available on fg, then it is assumed to be near unity.

4. Genetic algorithms method

4.1 Overview

Genetic Algorithms (GAs) are optimization algorithms based on techniques derived from genetics and Darwin's theory of evolution: selection, crossover, mutation, generation, parent, children, etc. (Goldberg 1989; Holland 1975). With the considerable development of computing systems, GAs have shown significant
improvement by using stochastic and mathematical methods, and have been applied in many domains such as ecology, biology and even economics, in order to understand natural systems and to model them so as to optimize (or at least improve) the performance of the system.

4.2 GAs theoretical bases and implementation

Genetic algorithms have been used to solve difficult problems with objective functions that do not possess properties such as continuity, differentiability, or satisfaction of the Lipschitz condition (Michalewicz 1994; Goldberg 1989; Holland 1975). GAs search for the extrema of a function defined over a data space. These algorithms maintain and manipulate a family, or population, of solutions and implement a survival-of-the-fittest strategy in their search for better solutions. GAs have shown their advantages in dealing with the highly non-linear search spaces that result from noisy and multimodal functions.

The genetic algorithm works as follows:
- Initialization of the parent population randomly
- Evaluation (fitness function)
- Selection
- Recombination of possible solutions (crossover and mutation)
- Evaluation of the children, returning to step 3 until the termination criterion is satisfied

where N is the dimension of the population, such that each element of the array contains a possible value of the parameters, and rand(2,N) returns a pseudorandom vector whose values are drawn from a uniform distribution on the unit interval.

The GA moves from generation to generation, selecting and reproducing parents until a termination criterion is met. The most frequently used stopping criterion is a specified maximum number of generations.

Fitness, in the biological sense, is a quality value which is a measure of the reproductive efficiency of chromosomes (Goldberg, 1989). In a genetic algorithm, individuals are evaluated with a fitness function, which is a measure of their goodness for selection.

The evaluation is calculated at each TSEB run through the fitness function F(K), which is equal to

(4)

where t is the instant of the observed latent heat LEobs(t) and LEsim(t,K) is the simulated latent heat. The cost function to minimize is represented by a practical evaluation of F(K).
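The GA loop described here can be sketched in code. The following Python sketch is illustrative only, not the authors' implementation: the TSEB-based cost function is replaced by a stand-in quadratic, while the population size (10), the parameter ranges (0.5-2 for aPT, 0.1-1 for fg), fitness-proportional selection, the averaging crossover rule and the Gaussian mutation (deviation 0.5, rate 0.0001) described in this paper are kept.

```python
import random

POP_SIZE, GENERATIONS = 10, 10
BOUNDS = [(0.5, 2.0), (0.1, 1.0)]  # search ranges for (aPT, fg)

def cost(k):
    # Stand-in for the TSEB-based cost: distance to an arbitrary target point.
    # The real cost compares LEsim(t,K) with LEobs(t) over the whole period.
    return (k[0] - 1.26) ** 2 + (k[1] - 0.8) ** 2

def fitness(k):
    return 1.0 / (1.0 + cost(k))  # higher fitness for lower cost

def select(pop):
    # Roulette-wheel selection: each member gets a slice proportional to fitness.
    weights = [fitness(k) for k in pop]
    return random.choices(pop, weights=weights, k=2)

def crossover(father, mother):
    # Averaging rule from the text: with probability 0.5 the son is the
    # average of two father values with one mother value, and vice versa
    # for the daughter; otherwise children copy their parents.
    if random.random() < 0.5:
        son = [(2 * f + m) / 3 for f, m in zip(father, mother)]
        daughter = [(f + 2 * m) / 3 for f, m in zip(father, mother)]
    else:
        son, daughter = list(father), list(mother)
    return son, daughter

def mutate(k, rate=0.0001, sigma=0.5):
    # Gaussian perturbation with deviation 0.5, applied with probability 0.0001
    return [g + random.gauss(0, sigma) if random.random() < rate else g for g in k]

def clip(k):
    # Keep chromosomes inside the parameter bounds
    return [min(max(g, lo), hi) for g, (lo, hi) in zip(k, BOUNDS)]

random.seed(0)
pop = [[random.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    children = []
    while len(children) < POP_SIZE:
        father, mother = select(pop)
        for child in crossover(father, mother):
            children.append(clip(mutate(child)))
    pop = children[:POP_SIZE]

best = min(pop, key=cost)
print("best (aPT, fg):", best)
```

With the real TSEB model, `cost` would run the simulation for each chromosome and return the latent-heat error, which is the expensive step the stopping criterion is meant to bound.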
members. The population will be represented by a slice that is directly proportional to the member's fitness.

Crossover: A crossover operator is used to recombine pairs of parents to get better children, which generate a second generation of solutions. If the individual probability is less than 0.5, the son chromosome will be the average of two times the value of the father with one value of the mother, and vice versa for the daughter; if the individual probability is greater than or equal to 0.5, the son and daughter chromosomes stay identical to the father and mother respectively.

Mutation: Mutation is an operator that introduces diversity into the population, to avoid homogeneous generations due to the repeated use of the reproduction and crossover operators. Mutation proceeds by a Gaussian perturbation with deviation equal to 0.5 and mutation probability equal to 0.0001. Mutation simply adds new information in a random way to the genetic search process.

4.2.4 Implementation of GAs to the TSEB Model

Possible solutions to a problem are evaluated and ordered according to their adaptation (i.e. the fitness function). From generation (k) to the next (k+1), new chromosome populations are produced after selecting candidates as parents and applying the mutation or crossover operators, which combine the chromosomes of two parents to produce two children. The new set of candidates is then evaluated, and this cycle continues until an adequate solution is found (Figure 1). In all experiments, the GA parameters are as follows: the population size is 10, the crossover rate is 0.5, the mutation rate is 0.0001, and we generate populations until the 10th generation. The observations used in the TSEB Model are taken every 30 minutes. In this optimization we want to minimize the cost function, so we proceed with the minimization to find a vector Kopt as follows:

Kopt = arg min_K F(K)    (7)

where K is the vector of parameters to be controlled, and F(K) is the cost function. The state variable is the simulated latent heat LEsim(t,K) evolving in time during summer 2003, between DOY=152 and DOY=243. The cost function is computed by comparing the simulated latent heat LEsim and the observed latent heat LEobs over the whole period T. The two unknown parameters controlling the Priestley-Taylor transpiration used in the TSEB Model are estimated by optimization of the cost function with the evolution strategies algorithm as follows:

- START: Create a random population of 10 chromosomes, between 0.5 and 2 for aPT, and 0.1 and 1 for fg.
- Run TSEB: Calculate the simulated latent heat LEsim(t,K), the bias with respect to the measured latent heat LEobs(t), and the cost function F(K).
- FITNESS: Evaluate the fitness function F(K) of each chromosome in the population.
- NEW POPULATION:
  * SELECTION: based on F(K)
  * RECOMBINATION: cross over chromosomes
  * MUTATION: mutate chromosomes
  * ACCEPTATION: reject or accept the new one
- REPLACE: Replace the old population with the new one as the new generation.
- TEST: Test the problem criterion to indicate the best solution minimizing the cost function F(K); otherwise turn over to the next generation.
- LOOP: Continue steps 2-6 until the criterion is satisfied.

5. Results

Different numbers of generations (not shown) with a population of ten individuals were experimented with, in order to optimize the values of aPT and fg and to carry out a stability test of the GA, showing the performance of the Priestley-Taylor formulation. The parameters found by the GA change with reproduction over the generations. The GA generally starts with random values of the parameters at the beginning of the cost function minimization, but in the absence of a stopping criterion on the most-minimizing cost function, the GA shifts its choice toward selected individuals that decrease the latent heat error to reach its minimum. The GA continues to generate elite chromosomes for computing the predicted surface fluxes until the latent heat error stabilizes (Fig. 2). The error stability phase is characterized by little change in the adaptation of the reproduced individuals. Convergence is reached during a generation when the best individual converges to the mean one (Fig. 1). The estimation of the Priestley-Taylor formulation has been improved, so the TSEB Model performance becomes acceptable with the best parameters
given by the 10 generations. In the following, we carry out 10 runs of the GA, to show the changes in the best parameters and to test the stability of the reproduction procedure, with a population of 10 individuals and 10 generations. During the error stabilization process, the 10 runs of the GA show (Table 1) changes in the parameter values: aPT ranges between 0.72 and 1.00 and fg varies from 0.26 to 0.79. These optimized values for aPT and fg are less than the standard values (aPT = 1.26 and fg = 1 for wet conditions), so we can consider them suitable for a semi-arid area. The optimized values for fg conform to an irrigated area, reflecting conditions supporting soil and canopy transpiration. The GA sometimes gives optimal parameters corresponding to the minimum error before reaching stabilization, but the GA continues the computing process, since in this case there is no stopping criterion to reduce calculation time. The mean parameter values for aPT and fg optimized over the 10 previous runs (Table 1) are respectively 0.93 and 0.61. Now let us see the influence of these optimal mean values on the TSEB Model. Figures 3 and 4 present the comparison of measured and predicted daily latent heat before and after the optimization process. These figures show an improvement in the representativeness of the latent heat: the correlation goes from 0.43 to 0.45, the bias is reduced from +240 W.m-2 to +15 W.m-2, and the root mean square error is improved from 251 W.m-2 to 63 W.m-2. Furthermore, the measured and predicted latent heat evolve in the same direction, except during irrigation events, because the soil is submerged by the traditional irrigation system's water.

6. Conclusion

In the comparison of the cases studied here, we observe that GA stability is essential to optimize the parameters. The results obtained do not change significantly across the 10 runs: the correlation, bias and root mean square error respectively equal 0.45, +15 W.m-2, and 63 W.m-2. Thus, the results obtained in this study show the important support of Genetic Algorithms in the calibration and optimization processes. This GA optimization could replace terrain measurements and long experiments, since it improves results mostly by making use of the fitness function and genetic operators such as selection, crossover and mutation. Overall, the estimate of canopy transpiration was improved.

Appendix A

TSEB Equations

Soil and vegetation temperatures contribute to the radiometric surface temperature in proportion to the fraction of the radiometer view that is occupied by each component, along with the component temperature. In particular, assuming that the observed radiometric temperature Trad(q) is the combination of the soil and canopy temperatures, the TSEB model adds the following relationship (Becker and Li, 1990) to the set of Eqs. 12 and 13:

Trad(q) = [f(q) . Tc^4 + (1 - f(q)) . Ts^4]^(1/4)    (A.1)

where Tc and Ts are the vegetation and soil surface temperatures, and f(q) is the vegetation directional fractional cover at view angle q (Campbell and Norman, 1998):

f(q) = 1 - exp(-0.5 LAI / cos(q))    (A.2)

The simple fractional cover (fc) is as follows:

fc = 1 - exp(-0.5 LAI)
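As an illustration of Eq. (1) and Eqs. (A.1)-(A.2), the following Python sketch (our own, not code from the paper) computes the Priestley-Taylor canopy latent heat, the directional fractional cover, and the composite radiometric temperature, and inverts (A.1) for the soil temperature. The canopy net radiation Rn,c and all numeric inputs are illustrative assumptions.

```python
import math

def le_canopy(a_pt, fg, delta, gamma, rn_c):
    # Eq. (1): Priestley-Taylor approximation of the canopy latent heat LEc.
    # rn_c (canopy net radiation, W.m-2) is an assumed input here.
    return a_pt * fg * (delta / (delta + gamma)) * rn_c

def frac_cover(lai, theta=0.0):
    # Eq. (A.2): vegetation directional fractional cover at view angle theta (rad)
    return 1.0 - math.exp(-0.5 * lai / math.cos(theta))

def t_rad(tc, ts, lai, theta=0.0):
    # Eq. (A.1): composite radiometric temperature (K) from canopy and soil
    f = frac_cover(lai, theta)
    return (f * tc**4 + (1.0 - f) * ts**4) ** 0.25

def t_soil(trad, tc, lai, theta=0.0):
    # Eq. (A.1) inverted for the soil temperature when Trad and Tc are known
    f = frac_cover(lai, theta)
    return ((trad**4 - f * tc**4) / (1.0 - f)) ** 0.25

# Illustrative values: LAI = 3 (the orchard's average), the slope of the
# saturation vapour pressure curve near 25 C, the psychrometric constant
# from the text, and an assumed canopy net radiation of 400 W.m-2.
le = le_canopy(a_pt=1.26, fg=1.0, delta=0.189, gamma=0.066, rn_c=400.0)
trad = t_rad(tc=300.0, ts=315.0, lai=3.0)
print(round(le, 1), round(trad, 2), round(t_soil(trad, tc=300.0, lai=3.0), 2))
```

The inversion of (A.1) is the step TSEB uses, together with a canopy temperature estimate, to separate the soil and canopy contributions to the observed radiometric temperature.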
(A.19)

where r (kg m-3) is the air density, Cp is the specific heat of air (J kg-1 K-1), Ta (K) is the air temperature at a certain reference height, H is the sensible heat flux, LE is the latent heat flux, and l is the latent heat of vaporization.

Friction velocity is a measure of the shear stress at the surface, and can be found from the logarithmic wind profile relationship:

(A.24)

where Ua is the wind speed above the canopy at height Zu, and the stability correction at the top of the canopy is assumed negligible due to roughness sublayer effects (Garratt, 1980; Cellier et al., 1992).

TSEB implementation and algorithm

In order to solve (A.15), additional computations are needed to determine the soil temperature and the resistance terms Rah and Rs, but, as will become apparent, they must be solved iteratively. Soil temperature is determined from two equations: one to relate the observed radiometric temperature to the soil and vegetation canopy temperatures, and another to determine the vegetation canopy temperature. The composite temperature is related to the soil and canopy temperatures by (A.1). The resistance components are determined from (A.16) for Rah and from the following equation (Sauer et al., 1995) for Rs (A.18). To complete the solution of the soil heat flux components, the ground stock heat flux can be computed as a fraction of the net radiation at the soil surface (A.8).

Applying the energy balance for the two source flux components resolves the surface fluxes, which cannot be reached directly because of the interdependence between the atmospheric stability corrections, near-surface wind speeds, and surface resistances (A.16-17). In these equations, the stability correction factors PsiM and PsiH depend upon the surface energy flux components H and LE via the Monin-Obukhov length Lmo.

The TSEB computation for solving the surface energy balance, with ten primary unknowns and ten associated equations (Table 1), needs an iterative solution process, started by setting a large negative value for Lmo (i.e. highly unstable atmospheric conditions). This permits an initial set of stability correction factors PsiM and PsiH to be computed. The iteration is repeated until Lmo converges.

Acknowledgments

This work falls within the framework of research between the University of Cadi Ayad Gueliz, Marrakech, Morocco, and the Department of the National Service of Meteorology, Morocco (DMN, Morocco). The first author is very grateful for the encouragement of all his family, especially Mrs F. Bent Ahmed his mother, Mrs K. Aglou his wife, Mr Mahjoub Mouida his brother, and his sister Khadija Mouida. Finally, the authors gratefully acknowledge the evaluation and judgments of the reviewers and the editor.

References
[1] Agam et al., "Application of the Priestley-Taylor Approach in a Two-Source Surface Energy Balance Model", Journal of Hydrometeorology, Vol. 11, 2010, pp. 185-198.
[2] Becker, F., and Li, Z. L., "Temperature independent spectral indices in thermal infrared bands", Remote Sensing of Environment, Vol. 32, 1990, pp. 17-33.
[3] Brutsaert, W., Evaporation Into The Atmosphere, D. Reidel, Dordrecht, 1982.
[4] Campbell, G. S., and Norman, J. M., An Introduction to Environmental Biophysics (2nd ed.), New York: Springer-Verlag, 1998.
[5] Castellvi, F., Stockle, C. O., Perez, P. J., and Ibanez, M., "Comparison of methods for applying the Priestley-Taylor equation at a regional scale", Hydrol. Process., 15, 2001, pp. 1609-1620.
[6] Cellier et al., "Flux-gradient relationships above tall plant canopies", Agric. For. Meteorol., 58, 1992, pp. 93-117.
[7] Choudhury, B. J., Idso, S. B., and Reginato, R. J., "Analysis of an empirical model for soil heat flux under a growing wheat crop for estimating evaporation by an infrared-temperature based energy balance equation", Agric. For. Meteorol., 39, 1987, pp. 283-297.
[8] Ezzahar, J., Spatialisation des flux d'énergie et de masse à l'interface Biosphère-Atmosphère dans les régions semi-arides en utilisant la méthode de scintillation, Ph.D. Thesis, University Cadi Ayyad, Marrakech, Morocco, 2007.
[9] Garratt et al., "Momentum, heat and water vapor transfer to and from natural and artificial surfaces", Q. J. R. Meteorol. Soc., 99, 1973, pp. 680-687.
[10] Goldberg et al., "A Comparative Analysis of Selection Schemes Used in Genetic Algorithms", in Foundations of Genetic Algorithms, G. Rawlins, ed., Morgan-Kaufmann, pp. 69-93.
[11] Goudriaan, J., Crop Micrometeorology: A Simulation Study, Center for Agricultural Publications and Documentation, Wageningen, 1977.
[12] Holland, J., Adaptation in Natural and Artificial Systems, University of Michigan Press, 1975.
[13] Kustas, W. P., and Norman, J. M., "Evaluation of soil and vegetation heat flux predictions using a simple two-source model with radiometric temperatures for partial canopy cover", Agric. For. Meteorol., 94, 1999, pp. 75-94.
[14] Kustas, W. P., and Norman, J. M., "A two-source energy balance approach using directional radiometric temperature observations for sparse canopy covered surfaces", Agronomy Journal, 92, 2000, pp. 847-854.
[15] Kustas et al., "Utility of radiometric-aerodynamic temperature relations for heat flux estimation", Bound.-Lay. Meteorol., 122, 2007, pp. 167-187.
[16] McNaughton, K. G., and Spriggs, T. W., "An evaluation of the Priestley and Taylor equation and the complementary relationship using results from a mixed-layer model of the convective boundary layer", T. A. Black, D. L., 1987, pp. 89-104.
[17] McNaughton, K. G., and Jarvis, P. G., "Effects of spatial scale on stomatal control of transpiration", Agricultural and Forest Meteorology, 54, 1991, pp. 269-301.
[18] Michalewicz, Z., Genetic Algorithms + Data Structures = Evolution Programs, Springer-Verlag, AI Series, New York, 1992.
[19] Norman et al., "Source approach for estimating soil and vegetation energy fluxes in observations of directional radiometric surface temperature", Agricultural and Forest Meteorology, 77, 1995, pp. 263-293.
[20] Priestley, C. H. B., and Taylor, R. J., "On the assessment of surface heat flux and evaporation using large-scale parameters", Monthly Weather Review, 100, 1972, pp. 81-92.
Second Author received his Master of Science and his Ph.D. degrees from the University of Nancy I, France, in 1986 and 1989 respectively. In 2006, he received the HDR in Applied Mathematics from the University of Cadi Ayyad, Morocco. He is currently Professor of modeling and scientific computing at the Faculty of Sciences and Technology of Marrakech. His research is geared towards non-linear mathematical models and their analysis and digital processing applications.

Figures

Fig. 1 Iterative procedure of a Genetic Algorithm applied to the TSEB Model.
2 Software Engineer, IBM, India
original regression test suite. In this proposed approach, it is shown that two aspects of testing, namely testing for functionality and testing for boundary values, can be tested with a reduced test suite, as these two aspects can be tested together simultaneously in most situations. The situations where these two aspects can be tested simultaneously are also shown with the help of the case studies. In this paper, testing simultaneously means that a single test case can cover both of the above-mentioned aspects for a particular situation. The proposed approach is applied to four real-time case studies, and the reduction in the cost of regression testing is estimated using a cost estimation model. It is found that the reduction in cost per regression testing cycle ranges between 19.35 and 32.10 percent. Since regression testing is a frequently performed activity in the software maintenance phase, the overall regression testing cost can be reduced considerably by applying the proposed approach.

The rest of the paper is organized as follows: Section II reviews the various regression testing techniques and summarizes related work. Section III describes the proposed approach to cost-effective regression testing in a black-box testing environment. Section IV describes the empirical studies and results of the proposed approach. Section V concludes and discusses future work.

2. Related Work

Researchers, practitioners and academicians have proposed various techniques for test suite reduction, test case prioritization, and regression test selection to improve the cost effectiveness of regression testing.

Rothermel and Harrold presented a technique for regression test selection. Their algorithms construct control flow graphs for a procedure or program and its modified version, and use these graphs to select tests that execute changed code from the original test suite [9]. James A. Jones and Mary Jean Harrold proposed new algorithms for test suite reduction and prioritization [2]. Saif-ur-Rehman Khan and Aamer Nadeem proposed a novel test case reduction technique called TestFilter that uses the statement-coverage criterion for the reduction of test cases [3]. T. Y. Chen and M. F. Lau presented dividing strategies for the optimization of a test suite [4]. M. J. Harrold et al. presented a technique to select a representative set of test cases from a test suite that provides the same coverage as the entire test suite [5]. This selection is performed by identifying, and then eliminating, the redundant and obsolete test cases in the test suite. The technique is illustrated using the data flow testing methodology. A recent study by Wong, Horgan, London, and Mathur [6] examines the costs and benefits of test suite minimization. Rothermel et al. [7] described several techniques for using test execution information to prioritize test cases for regression testing, including: techniques that order test cases based on their total coverage of code components, techniques that order test cases based on their coverage of code components not previously covered, and techniques that order test cases based on their estimated ability to reveal faults in the code components that they cover.

Most of the techniques described in the above papers assume that the source code of the software is available to the testing engineer at the time of testing. But in most organizations, testing is done in a black-box environment and the source code of the software is not available to the testing engineers. In this paper, an approach to reduce the cost of software regression testing in a black-box environment, without affecting the functionality coverage, is presented.

3. The Proposed Approach

The estimated cost of software maintenance exceeds 70% of total software costs [16], and a large portion of this maintenance expense is devoted to regression testing. Regression testing is a frequently executed activity, so reducing the cost of regression testing would help in reducing the cost of software maintenance.

The proposed approach consists of three phases (Fig. 1). In Phase 1 (Fig. 1), the Reduced Test Suite is derived by applying the proposed approach to the original test suite. Phase 1 of the approach was already proposed by the authors in [17]. In Phase 2 (Fig. 1), the Reduced Regression Test Suite is derived by applying a regression test selection method to the Reduced Test Suite derived in Phase 1. In Phase 3, a testing cost-estimation model is applied to the reduced regression test suite, and the regression testing cost reduction achieved by the proposed approach is calculated empirically.

Phase 1: Deriving the Reduced Test Suite

A large number of test cases are derived by applying various testing techniques to test the complete functionality of a software product. This test suite contains test cases to test the functionality, boundary values, stress, and performance of the software product. The majority of these test cases are ones that test functionality and boundary values. Phase 1 of the proposed approach is focused on reducing test cases, considering the test cases that test functionality and boundary values.
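To make the idea of covering functionality and boundary values with fewer test cases concrete, here is an illustrative Python sketch (ours, not the authors' algorithm from [17]): each test case is tagged with the requirement items it exercises, and a greedy pass keeps a test only if it adds coverage not yet obtained, so a single case exercising both a function and its boundary value replaces two separate cases. The suite and its coverage tags are invented for the example.

```python
# Illustrative greedy test-suite reduction: keep a test case only if it adds
# coverage (functionality or boundary-value items) not already covered.

def reduce_suite(suite):
    covered, reduced = set(), []
    # Consider the widest-covering tests first, so combined
    # functionality + boundary-value cases are preferred.
    for name, items in sorted(suite.items(), key=lambda kv: -len(kv[1])):
        new = items - covered
        if new:
            reduced.append(name)
            covered |= new
    return reduced, covered

original = {
    "tc1_login_valid":          {"func:login"},
    "tc2_login_max_length":     {"func:login", "bv:username_max"},
    "tc3_transfer_valid":       {"func:transfer"},
    "tc4_transfer_upper_limit": {"func:transfer", "bv:amount_max"},
    "tc5_transfer_zero":        {"bv:amount_min"},
}

reduced, covered = reduce_suite(original)
print(reduced)  # tc2 and tc4 subsume tc1 and tc3
print(len(reduced), "of", len(original), "tests keep full coverage")
```

The same coverage relation drives the selection in Phase 2: a subset of the reduced suite is picked so that the major functionality and the scenarios around the regression build's bug fixes stay covered.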
[Fig. 1: in Phase 1 the Reduced Test Suite is derived; in Phase 2 the Regression Test Selection Method produces the Reduced Regression Test Suite.]
Phase 1 (Fig. 1) of the approach contains the following four steps:
1. View the two aspects, that is, functionality and boundary value testing, together.
2. Identify the situation(s) (considering functionality and boundary values) which can be tested in single test case(s), so as to design minimal test cases.
3. Prove logically that the single test case(s) in fact cover both aspects.
4. Apply the above three steps to case studies and validate.

By applying the above-mentioned approach we get the Reduced Test Suite, which covers the same functionality of the software as the original test suite. This is validated in the case studies.

Phase 2: Deriving the Reduced Regression Test Suite

The regression testing process involves selecting a subset of the test cases from the original test suite and, if necessary, creating some new test cases to test the modified software.

Let P be the original software product, P' the modified software product, and T the set of test cases to test P. A typical regression testing of modified software proceeds as follows:
A. Select T' of T, a set of test cases to execute on the modified software product P'.
B. Test P' with T', to verify the modified software product's correctness with respect to T'.
C. If necessary, create T'', a set of new test cases to test P'.
D. Test P' with the new tests T'', to verify P''s correctness with respect to T''.

In Phase 1 (Fig. 1), the Reduced Test Suite is derived. In Phase 2 (Fig. 1), the Reduced Regression Test Suite is derived by applying the regression test selection method shown in Figure 2. This regression test selection method contains the following 3 steps:
1. Select a subset of test cases from the reduced test suite (derived in Phase 1) which covers the major functionality of the product.
2. Select test cases that cover the scenarios to test the bug fixes included in the regression build.
3. Create new test cases to test the (if any) new enhancements included in the regression build.

In step 1 of this approach we select a subset of test cases from the reduced test suite, so this selected subset will also contain fewer tests compared with the subset selected from the original test suite. This reduced regression test suite covers the same functionality as the original regression test suite that is derived without applying our approach. The reduced regression test suite derived using this approach is empirically evaluated in the case studies section of the paper.

Phase 3: Regression Testing Cost Estimation

In Phase 3 of the proposed approach we calculate the estimated reduction in regression testing achieved by using the proposed approach. The authors proposed an approach to cost estimation in a black-box testing environment in [19]. Using this approach, regression testing in a black-box environment involves the following major activities:
- Environment setup for testing (T_env)
- Verification of the fixed bugs which were reported in the previous testing cycle (T_bv)
- Test suite execution (T_e)
- Test report generation (T_rg)
- Test report analysis (T_ra)
- Reporting the bugs (T_br)

As the above-mentioned activities are performed on each and every build, they occupy a major portion of the overall regression testing time. The time required to complete regression testing on one intermediate or regression build is calculated using the following equation:

T_ib = T_env + ((N_t * T_e) / 60) + T_rg + T_ra + T_bv + T_br      (1)

where T_e indicates the average time required to execute a single test case and N_t is the total number of test cases executed for that particular regression testing cycle.
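Equation (1) can be turned into a small calculator. This is a sketch under stated assumptions: the per-activity defaults come from Table 5 where available, the bug-reporting time T_br is not present in the extracted table and is a placeholder, and we read the "/ 60" in equation (1) as converting the per-test-case minutes into hours, so every minute-valued term is converted to hours here.

```python
def regression_build_hours(n_tests, n_bugs,
                           t_env_hr=3.0,     # environment setup (Table 5)
                           t_exec_min=1.2,   # per test case (Table 5)
                           t_bug_min=20.0,   # per fixed bug verified (Table 5)
                           t_rg_min=9.0,     # test report generation (Table 5)
                           t_ra_min=20.0,    # test report analysis (Table 5)
                           t_br_min=10.0):   # bug reporting (assumed value)
    """Equation (1): T_ib = T_env + (N_t*T_e)/60 + T_rg + T_ra + T_bv + T_br,
    with every minute-valued term converted to hours."""
    exec_hr = (n_tests * t_exec_min) / 60.0        # test suite execution
    t_bv_hr = (n_bugs * t_bug_min) / 60.0          # fixed-bug verification
    other_hr = (t_rg_min + t_ra_min + t_br_min) / 60.0
    return t_env_hr + exec_hr + t_bv_hr + other_hr

# A smaller suite directly shrinks the dominant execution term:
full = regression_build_hours(n_tests=1846, n_bugs=25)
reduced = regression_build_hours(n_tests=1304, n_bugs=25)
```

Since the execution term dominates for large suites, reducing N_t translates almost proportionally into reduced build-level testing time.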
TABLE 1. FUNCTIONAL TEST CASES BEFORE APPLYING THE PROPOSED APPROACH OF PHASE 1

Test Case ID | Description | Expected Result
TCf1 | Test on writing the data to the target table with Action on data = Insert | The job should add new rows to the target table and stop if duplicate rows are found.
TCf2 | Test on writing the data to the target table with Action on data = Update | The job should make changes to existing rows in the target table with the input data.
TCf3 | Test on writing the data to the target table with Action on data = Insert or Update | The job should add new rows to the target table first and then update existing rows.
TCf4 | Test on writing the data to the target table with Action on data = Update or Insert | The job should update existing rows first and then add new rows to the target table.
TCf5 | Test on writing the data to the target table with Action on data = Delete | The job should remove rows from the target table corresponding to the input data.
TABLE 2. BOUNDARY VALUE TEST CASES BEFORE APPLYING THE PROPOSED APPROACH OF PHASE 1

Test Case ID | Description | Expected Result
TCb5 | Test on writing the data to col5 with DATE data type boundary values | The job should read the DATE data type boundary values from the input data and write them to the target table successfully.
A.1. View the two aspects together (Step 1)

Many software testing techniques are required to test the functionality of a software product completely. A large number of test cases are generated by applying the various testing techniques. These test cases include functional test cases (Tf), boundary value test cases (Tb), stress test cases (Ts), performance test cases (Tp) and other test cases (To), such as negative test cases:

Tn = Tf + Tb + Ts + Tp + To

Most of the test cases in this test suite belong to the test cases that test the functionality and boundary values of the product. The proposed approach in Phase 1 is focused on reducing test cases, considering the test cases that test functionality and boundary values.

A.2. Identifying the situations that can be tested in a single test case and designing the minimized test case set (Step 2)

The test case TCf1 tests the functionality of the DB2 ETL DB Component when the attribute Action on Data is set to Insert, and the test case TCb1 tests the INTEGER data type boundary values that are written to the target DB2 table. Both of these test cases, TCf1 and TCb1, are testing the two aspects, i.e. functionality and boundary values, of the DB2 ETL DB Component. By using the proposed approach in Phase 1 these two test cases can be viewed together and tested in a single test case. For example, the test cases TCf1 and TCb1 are viewed together and a single test case TCm1 (Table 3) is designed that covers both aspects. The minimized test case set designed using the proposed approach in Phase 1 is shown in Table 3.

A.3. Proving logically that the single test case in fact covers both the aspects (Step 3)

Each test case in the minimized test case set described in Table 3 tests the functionality of the DB2 ETL DB Component, to ensure that the particular attribute is working properly, and also tests the boundary values for the various columns in the target table, to ensure that the boundary values of each column's data type are written properly. For example, TCm1 in the minimized test case set tests whether the DB2 ETL DB Component works properly when the attribute Action on Data is set to Insert and also tests whether the INTEGER data type boundary values are written to the target table properly, both of which were previously tested by the test cases TCf1 and TCb1. In a similar way, the remaining test cases in the minimized test case set {TCm1-TCm5} described in Table 3 test both aspects, the functionality and the boundary values of the DB2 ETL DB Component, which were tested by the test cases {TCf1-TCf5} and {TCb1-TCb5}.

A.4. Applying the above three steps to case studies and validating (Step 4)

Let Tbr be the number of boundary value test cases that are viewed together with functional test cases and thereby eliminated. Then, after applying Phase 1 of the proposed approach, the total number of test cases is minimized to:

Tmin = Tn - Tbr

And the percentage of test case reduction (Tred %) is:

Tred % = ((Tn - Tmin) / Tn) * 100
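The Phase 1 bookkeeping above can be written out directly; a minimal sketch (the function and parameter names are ours, for illustration only):

```python
def phase1_reduction(t_f, t_b, t_s, t_p, t_o, t_br):
    """Tn = Tf + Tb + Ts + Tp + To; Tmin = Tn - Tbr;
    Tred% = ((Tn - Tmin) / Tn) * 100."""
    t_n = t_f + t_b + t_s + t_p + t_o        # total test cases
    t_min = t_n - t_br                       # after merging boundary cases
    t_red_pct = (t_n - t_min) / t_n * 100    # percentage reduction
    return t_min, t_red_pct

# e.g. 100 test cases of which 25 boundary-value cases are merged away:
print(phase1_reduction(50, 30, 10, 5, 5, 25))  # (75, 25.0)
```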
TABLE 3. THE MINIMIZED TEST CASE SET DESIGNED USING THE PROPOSED APPROACH IN PHASE 1

Test Case ID | Description | Expected Result
TCm1 | Test on writing the data to the target table with Action on data = Insert and col1 contains INTEGER data type boundary values | The job should read the input data, add new rows to the target table successfully and stop if duplicate rows are found.
TCm2 | Test on writing the data to the target table with Action on data = Update and col2 contains CHAR data type boundary values | The job should read the input data and make changes to existing rows in the target table with the input data.
TCm3 | Test on writing the data to the target table with Action on data = Insert or Update and col3 contains VARCHAR data type boundary values | The job should read the input data, add new rows to the target table first and then update existing rows.
TCm4 | Test on writing the data to the target table with Action on data = Update or Insert and col4 contains DOUBLE data type boundary values | The job should read the input data, update existing rows first and then add new rows to the target table.
TCm5 | Test on writing the data to the target table with Action on data = Delete and col5 contains DATE data type boundary values | The job should read the input data and remove rows from the target table corresponding to the input data.
ETL DB Component | Original Test Suite (Tn) | Reduced Test Suite, Phase 1 (Tmin) | Original Regression Suite (TR) | Reduced Regression Suite, Phase 2 (TRmin)
DB2 ETL DB Component | 3563 | 2609 (26.7 %) | 1846 | 1304
In a similar way, the proposed approach is also applied to the Sybase ETL DB Component, the Teradata ETL DB Component and the MySQL ETL DB Component. The second column of Table 4 gives the total number of test cases (Tn) before applying Phase 1 of the proposed approach; the third column gives the total number of test cases in the minimized test suite (Tmin) after applying Phase 1, with the percentage of test case reduction (Tred %) given in parentheses.

After applying the proposed approach in Phase 1, the total numbers of test cases for the DB2, Sybase, Teradata and MySQL ETL DB Components are reduced by 34 %, 27 %, 30 % and 32 % respectively. The results indicate that the test case reduction ranges between 27 and 34 percent (Table 4, 3rd column). Hence Phase 1 of the proposed approach is validated through the case studies.

Phase 2: Deriving the Reduced Regression Test Suite

Regression testing is a critical part of software maintenance that is performed on the modified software to ensure that the modifications do not adversely affect the unchanged portion of the software. Using the proposed approach for regression test selection, we have selected a subset of test cases from the reduced test suite (derived in Phase 1) which covers the major functionality of the product, selected test cases that cover the scenarios to test the bug fixes included in the regression build, and created new test cases to test the (if any) new enhancements included in the regression build. This derived Reduced Regression Test Suite covers the same functionality of the software product as the regression suite that is derived from the original test suite (without reduction).

Phase 2 of the approach is applied on four case studies and the results are recorded in Table 4. The fourth column of Table 4 gives the number of regression test cases (TR) derived by applying the proposed regression test selection method to the original test suite (i.e. before applying Phase 1 of the proposed approach). The fifth column of Table 4 gives the Reduced Regression Test Suite (TRmin), which is derived by applying the proposed regression test selection method to the Reduced Test Suite derived in Phase 1. This reduction is independent of the regression test selection method used to select the regression test cases: if the number of test cases in the original test suite is reduced, then the number of regression test cases is subsequently also reduced.

Phase 3: Regression Testing Cost Estimation

Table 5 presents the required average effort for each of the testing activities in black-box testing, based on historical data derived from analyzing 40 completed software projects [19].

TABLE 5. AVERAGE TIME REQUIRED FOR TESTING ACTIVITIES

Testing activity | Avg. estimated effort
Environment setup for testing | 3 Hrs
Verification of the fixed bugs | 20 min / bug
Test suite execution | 1.2 min / test case
Test report generation | 9 min
Test report analysis | 20 min
ETL DB Component | Original Regression Suite (TR) | Reduced Regression Suite, Phase 2 (TRmin) | Estimated cost to test the original regression suite (TR) | Estimated cost to test the reduced regression suite (TRmin) | Percentage of reduced regression testing cost
DB2 ETL DB Component | 1846 | 1304 | 4329 | 3209 | 25.87 %
MySQL ETL DB Component | 1668 | 1166 | 3637 | 2933 | 19.35 %
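The last column of the table is simply the relative difference of the two cost columns, which can be checked directly (DB2 row shown):

```python
def cost_reduction_pct(cost_original, cost_reduced):
    """Percentage reduction in estimated regression testing cost."""
    return (cost_original - cost_reduced) / cost_original * 100

print(round(cost_reduction_pct(4329, 3209), 2))  # 25.87 (DB2 row)
```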
Environment", International Journal of Electrical, Electronics and Computer Systems, April 2011.
[20] H. Agrawal, J. Horgan, E. Krauser, and S. London, "Incremental Regression Testing," Proc. Conf. Software Maintenance, pp. 348-357, Sept. 1993.
[21] T. Ball, "On the Limit of Control Flow Analysis for Regression Test Selection," Proc. Int'l Symp. Software Testing and Analysis (ISSTA), Mar. 1998.
[22] S. Bates and S. Horwitz, "Incremental Program Testing Using Program Dependence Graphs," Proc. 20th ACM Symp. Principles of Programming Languages, Jan. 1993.
[23] P. Benedusi, A. Cimitile, and U. De Carlini, "Post-Maintenance Testing Based on Path Change Analysis," Proc. Conf. Software Maintenance, pp. 352-361, Oct. 1988.
[24] D. Binkley, "Semantics Guided Regression Test Cost Reduction," IEEE Trans. Software Eng., vol. 23, no. 8, Aug. 1997.
[25] Y.F. Chen, D.S. Rosenblum, and K.P. Vo, "TestTube: A System for Selective Regression Testing," Proc. 16th Int'l Conf. Software Eng., pp. 211-222, May 1994.
[26] K.F. Fischer, "A Test Case Selection Method for the Validation of Software Maintenance Modification," Proc. COMPSAC '77, pp. 421-426, Nov. 1977.
[27] K.F. Fischer, F. Raji, and A. Chruscicki, "A Methodology for Retesting Modified Software," Proc. Nat'l Telecommunications Conf., pp. 1-6, Nov. 1981.
[28] R. Gupta, M.J. Harrold, and M.L. Soffa, "An Approach to Regression Testing Using Slicing," Proc. Conf. Software Maintenance, pp. 299-308, Nov. 1992.
[29] M.J. Harrold and M.L. Soffa, "An Incremental Approach to Unit Testing During Maintenance," Proc. Conf. Software Maintenance, pp. 362-367, Oct. 1988.
[30] M.J. Harrold and M.L. Soffa, "An Incremental Data Flow Testing Tool," Proc. Sixth Int'l Conf. Testing Computer Software, May 1989.
[31] J. Hartmann and D.J. Robson, "RETEX - Development of a Selective Revalidation Prototype Environment for Use in Software Maintenance," Proc. 23rd Hawaii Int'l Conf. System Sciences, pp. 92-101, Jan. 1990.
[32] J. Hartmann and D.J. Robson, "Techniques for Selective Revalidation," IEEE Software, vol. 16, no. 1, pp. 31-38, Jan. 1990.
[33] J. Laski and W. Szermer, "Identification of Program Modifications and Its Applications in Software Maintenance," Proc. Conf. Software Maintenance, pp. 282-290, Nov. 1992.
[34] J.A.N. Lee and X. He, "A Methodology for Test Selection," J. Systems and Software, vol. 13, no. 1, pp. 177-185, Sept. 1990.
[35] H.K.N. Leung and L. White, "Insights into Regression Testing," Proc. Conf. Software Maintenance, pp. 60-69, Oct. 1989.
[36] H.K.N. Leung and L. White, "Insights into Testing and Regression Testing Global Variables," J. Software Maintenance, vol. 2, pp. 209-222, Dec. 1990.
[37] H.K.N. Leung and L.J. White, "A Study of Integration Testing and Software Regression at the Integration Level," Proc. Conf. Software Maintenance, pp. 290-300, Nov. 1990.
[38] T.J. Ostrand and E.J. Weyuker, "Using Dataflow Analysis for Regression Testing," Proc. Sixth Ann. Pacific Northwest Software Quality Conf., pp. 233-247, Sept. 1988.
[39] B. Sherlund and B. Korel, "Modification Oriented Software Testing," Conf. Proc.: Quality Week, pp. 1-17, 1991.
[40] B. Sherlund and B. Korel, "Logical Modification Oriented Software Testing," Proc. 12th Int'l Conf. Testing Computer Software, June 1995.
[41] A.B. Taha, S.M. Thebaut, and S.S. Liu, "An Approach to Software Fault Localization and Revalidation Based on Incremental Data Flow Analysis," Proc. 13th Ann. Int'l Computer Software and Applications Conf., pp. 527-534, Sept. 1989.
[42] F. Vokolos and P. Frankl, "Pythia: A Regression Test Selection Tool Based on Textual Differencing," Proc. Third Int'l Conf. Reliability, Quality, and Safety of Software Intensive Systems (ENCRESS '97), May 1997.
[43] L.J. White and H.K.N. Leung, "A Firewall Concept for Both Control-Flow and Data-Flow in Regression Integration Testing," Proc. Conf. Software Maintenance, pp. 262-270, Nov. 1992.
[44] L.J. White, V. Narayanswamy, T. Friedman, M. Kirschenbaum, P. Piwowarski, and M. Oha, "Test Manager: A Regression Testing Tool," Proc. Conf. Software Maintenance, pp. 338-347, Sept. 1993.
[45] S.S. Yau and Z. Kishimoto, "A Method for Revalidating Modified Programs in the Maintenance Phase," COMPSAC '87: Proc. 11th Ann. Int'l Computer Software and Applications Conf., pp. 272-277, Oct. 1987.

Prof. Ananda Rao Akepogu received the B.Sc. (M.P.C) degree from Silver Jubilee Govt. College, SV University, Andhra Pradesh, India. He received the B.Tech. degree in Computer Science & Engineering and the M.Tech. degree in A.I & Robotics from the University of Hyderabad, Andhra Pradesh, India. He received his Ph.D. from the Indian Institute of Technology, Madras, India. He is Professor of Computer Science & Engineering and Principal of JNTU College of Engineering, Anantapur, India. Prof. Ananda Rao has published more than fifty research papers in international journals and conferences and authored three books. His main research interests include software engineering and data mining.

Kiran Kumar J is pursuing a Ph.D. in Computer Science & Engineering at JNTUA, Anantapur, India, and he received his M.Tech. in Computer Science & Engineering from the same university. He received the B.E. degree in Computer Science & Engineering from Amaravati University, India. He has received the Teradata Certified Master certification from Teradata. He has been working for IBM India Software Labs in the area of software testing since 2005. His main research interests include software engineering and software testing. He is a member of IEEE, ACM and IAENG.
Abstract
A Multilingual Information Retrieval (MLIR) system retrieves relevant information from multiple languages in response to a user query in a single source language. The effectiveness of any information retrieval system, including a Multilingual Information Retrieval system, is measured using traditional metrics like Mean Average Precision (MAP) and the Average Distance Measure (ADM). A distributed MLIR system requires a merging mechanism to obtain results from the different languages, and the ADM metric cannot differentiate the effectiveness of the merging mechanisms. In the first phase we propose a new metric, the Normalized Distance Measure (NDM), for measuring the effectiveness of an MLIR system, and we present the characteristic differences between the NDM, ADM and NDPM metrics. The second phase shows how the effectiveness of merging techniques can be observed by using the Normalized Distance Measure. In the first phase of experiments we show that the NDM metric gives credit to MLIR systems that retrieve highly relevant multilingual documents. In the second phase of the experiments it is shown that the NDM metric can expose the effectiveness of merging techniques in a way the ADM metric cannot.
Keywords: Average Distance Measure (ADM), Normalized Distance Measure (NDM), Merging mechanisms, Multilingual Information Retrieval (MLIR).

1. Introduction

Information retrieval identifies the relevant documents in a document collection for an explicitly stated query. The goal of an IR system is to collect the documents that are relevant to a query. Information retrieval uses retrieval models to express the similarity between the query and the documents in the form of a score. Retrieval models include the binary retrieval model, the vector space model, and the probabilistic model.

Cross-language information retrieval (CLIR) searches a set of documents written in one language for a query in another language. The retrieval models are applied between the translated query and each document. There are three main approaches to translation in CLIR: machine translation, bilingual machine-readable dictionaries, and parallel or comparable corpora-based methods.

Irrelevant documents are retrieved by the information retrieval model when translations are performed with unnecessary terms. Thus translation disambiguation is desirable, so that relevant terms are selected from a set of translations. Sophisticated methods have been explored in CLIR for translation disambiguation: part-of-speech (POS) tags, parallel corpora, co-occurrence statistics in the target corpus, and query expansion techniques. This problem, called the language barrier issue, arises in CLIR systems [2].

Due to the internet explosion and the existence of several multicultural communities, users are facing multilingualism. Systems in which a user searches a multilingual document collection with a query expressed in a single language are termed MLIR systems. First, the incoming query is translated into the target languages and, second, the information obtained from the different languages is integrated into one single ranked list. Obtaining the ranked list in MLIR is more complicated than in simple bilingual CLIR: the weight assigned to each document (its retrieval status value, RSV) is calculated not only according to the relevance of the document and the IR model used; the rest of the monolingual corpus to which the document belongs is also a determining factor.

The two types of multilingual information retrieval methods are query translation and document translation. As document translation causes more complications than query translation, our proposal applies query translation. Centralized MLIR and distributed MLIR are the two types of architecture, and our proposed metric is applied to distributed MLIR. The distributed MLIR architecture has the problem of merging the result lists; merging techniques include raw score and round-robin merging. The performance of an MLIR system differs according to the merging method used. To measure MLIR performance correctly we need to consider the MLIR features such as translation (the language barrier) and the merging methods. Our new metric is based on the concept of the ADM metric; the drawbacks of the ADM metric are overcome in the proposed formula.
In this paper, Section 2 reviews work related to the proposed metric and to merging methods. Section 3 presents the proposed metric in two phases: the first phase introduces the newly proposed metric and the second phase explains how the proposed metric is applied to the merging methods of MLIR. Section 4 presents the experimental results and Section 5 states the conclusion.

2. Related work

There are two types of translation methods in MLIR: query translation and document translation [2]. Document translation can retrieve more accurate documents than query translation, because the translation of long documents may be more accurate in preserving the semantic meaning than the translation of short queries. Query translation, however, is a general and easy search strategy.

There are two architectures in MLIR [12]. The centralized architecture consists of a single document collection containing all the document collections and a huge index file; it needs one retrieving phase. The advantage of the centralized architecture is that it avoids the merging problem; its problem is that the weights of the index terms are over-weighted, so the centralized architecture prefers small document collections. In the distributed architecture, documents in different languages are indexed in different indexes and retrieved separately, and several ranked document lists are generated, one by each retrieving phase. Obtaining a ranked list that contains documents in different languages from several text collections is critical; this problem is solved by merging strategies. In either architecture, a problem called the language translation issue arises.

In a distributed architecture it is necessary to obtain a single ranked document list by merging the individual ranked lists that are in different languages. This issue is known as the merging strategy problem or the collection fusion problem. The merging problem in MLIR is more complicated than the merging problem in monolingual environments because of the language barrier between the different languages. Following are some of the merging strategies.

Round-robin merging strategy: This approach is based on the idea that document scores are not comparable across the collections, that each collection has approximately the same number of relevant documents, and that the distribution of relevant documents is similar across the result lists [11]. The documents are interleaved according to the ranking obtained for each document.

Raw score merging strategy: This approach is based on the assumption that scores across different collections are comparable. Raw score merging sorts all results by their original similarity scores and then selects the top-ranked documents. This method tends to work well when the same methods are used to search the document collections [11].

Normalized score merging: This approach is based on the assumption that the merged result lists are produced by diverse search engines. The simplest normalizing approach is to divide each score by the maximum score of the topic on the current list; after adjusting the scores, all results are sorted by the normalized score [10], [11]. Another method is to divide the difference between the score and the minimum score by the difference between the maximum and minimum scores. This type of merging favours the scores which are near the best score of the topic on the list. The approach maps the scores of the different result lists into the same range, from 0 to 1, and makes the scores more comparable. But it has a problem: if the maximum score is much higher than the second one in a result list, the normalized score of the document at rank 2 will be low even if its original score is high.

System evaluation is measured by calculating the gap between system and user relevance. Due to the lack of control variables, measuring the user-centered approach is difficult. The motivation of our proposal is that performance measurement can be examined through the agreement or disagreement between the user and the system rankings. The new metric NDM is designed by considering the features of the following IR metrics.

Discounted Cumulated Gain (DCG): as the rank increases, the importance of the document decreases.

Normalized Distance-based Performance Measure (NDPM): NDPM gives the performance of an MLIR system by comparing the order of ranking of two documents [1], [5]. NDPM is based on a preference relation on a finite set of documents D that is a weak order.

Average Distance Measure (ADM): ADM measures the average distance between UREs (user relevance estimations, the actual relevances of the documents) and SREs (system relevance estimations, their estimates by the IRS) [2], [3]. A drawback of the ADM metric is that low-ranked documents are given the same importance as high-ranked documents [3], [1]. A problem with precision and recall is that they are highly sensitive to the chosen thresholds: instead of changing the relevance and retrieval values suddenly, there should be a continuous variation of relevance and retrieval.
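The three merging strategies above can be sketched as follows, operating on per-language result lists of (document, score) pairs; this is a minimal illustration (not the paper's implementation), and the normalized variant uses the min-max form:

```python
def round_robin_merge(ranked_lists):
    """Interleave documents from each list, rank position by rank position."""
    merged = []
    for rank in range(max(len(lst) for lst in ranked_lists)):
        for lst in ranked_lists:
            if rank < len(lst):
                merged.append(lst[rank])
    return merged

def raw_score_merge(ranked_lists):
    """Pool all (doc, score) pairs and sort by the original scores."""
    return sorted((pair for lst in ranked_lists for pair in lst),
                  key=lambda ds: ds[1], reverse=True)

def normalized_score_merge(ranked_lists):
    """Min-max normalize each list's scores to [0, 1], then sort the pool."""
    pooled = []
    for lst in ranked_lists:
        scores = [s for _, s in lst]
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0            # guard against constant scores
        pooled += [(d, (s - lo) / span) for d, s in lst]
    return sorted(pooled, key=lambda ds: ds[1], reverse=True)
```

A quick comparison on two toy lists shows how the strategies can disagree on the final order even for the same inputs, which is exactly the behaviour the NDM metric is meant to discriminate.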
4. Experimental results

The Phase 1 experiments show the importance of the NDM metric. The effectiveness of an Information Retrieval System (IRS) depends on relevance and retrieval; [2] states that precision and recall are highly sensitive to the thresholds chosen.

Table 3: Document scores in six MLIR systems

        D1   D2   D3   D4   D5
USER    0.9  0.8  0.7  0.6  0.5
MLIR1   0.8  0.7  0.6  0.5  0.9
MLIR2   0.9  0.7  0.6  0.8  0.5
MLIR3   0.9  0.6  0.8  0.7  0.5
MLIR4   0.9  0.8  0.7  0.5  0.6
MLIR5   0.8  0.9  0.7  0.6  0.5
MLIR6   0.9  0.7  0.8  0.5  0.6

Precision and recall are not continuous; therefore they are not sensitive to important changes to MLIR systems, such as giving more importance to the top relevant documents. The ADM and NDPM metrics are continuous, so we compare the NDM metric with ADM and NDPM.

Table 4: Comparison of NDM with ADM and NDPM

        ADM   NDPM  NDM
MLIR1   0.84  0.60  0.647
MLIR2   0.92  0.80  0.863
MLIR3   0.92  0.80  0.885
MLIR4   0.96  0.90  0.9507
MLIR5   0.96  0.90  0.9554
MLIR6   0.92  0.80  0.987

Table 3 shows the score lists of the six MLIR systems. The scores of the documents are converted into rankings to obtain the NDM and NDPM metrics. The drawbacks of the ADM metric are stated in [3] and are corrected in NDM; [3] also states the importance of ranking in performance measurement. Table 4 compares the NDM metric with ADM and NDPM.

[Fig. 1: The performance of NDM compared with ADM and NDPM across the six MLIR systems.]

In the second phase of our experiments we measured the NDM values for four merging techniques of an MLIR system. The ADM value for the above MLIR system is 0.68, and it remains constant across all four merging techniques. To obtain the performance of the merging mechanisms of an MLIR system we use the NDM metric as follows. We took 9 documents from 3 languages and assigned document scores to the 9 documents as shown in Table 5.

Table 5: Scores of 9 documents in three languages

Language 1: 1.9, 1.62, 1.4, 0.8
Language 2: 0.4, 0.2
Language 3: 1.2, 0.9, 0.6

We performed the merging techniques for the above MLIR system and the resulting document orders are shown in Table 6. The ADM and NDM values for the four merging mechanisms are shown in Table 7.
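Reading ADM as one minus the average absolute distance between the user's scores (UREs) and a system's scores (SREs) - our interpretation, since the formula itself is not reproduced in this excerpt - recovers the ADM column of Table 4 from the scores in Table 3:

```python
def adm(user_scores, system_scores):
    """ADM = 1 - average |URE - SRE| over the documents."""
    pairs = list(zip(user_scores, system_scores))
    return 1 - sum(abs(u - s) for u, s in pairs) / len(pairs)

user = [0.9, 0.8, 0.7, 0.6, 0.5]              # USER row of Table 3
systems = {
    "MLIR1": [0.8, 0.7, 0.6, 0.5, 0.9],
    "MLIR2": [0.9, 0.7, 0.6, 0.8, 0.5],
    "MLIR4": [0.9, 0.8, 0.7, 0.5, 0.6],
}
for name, sre in systems.items():
    print(name, round(adm(user, sre), 2))      # 0.84, 0.92, 0.96 as in Table 4
```

Note that MLIR1 and MLIR5 permute the same score multiset, so any purely distance-based metric that ignores rank (the stated drawback of ADM) cannot tell them apart; NDM is designed to break such ties.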
Intracranial segmentation commonly referred to as Figure 1: a) T1 MRI image with salt and paper noise, b) Median filtered
image.
brain extraction or skull stripping, aims to segment the brain tissue (cortex and cerebellum) from the skull and non-brain intracranial tissues in magnetic resonance (MR) images of the brain. Brain extraction is an important pre-processing step in neuroimaging analyses, because brain images must typically be skull stripped before other processing algorithms such as registration or tissue classification can be applied. In practice, brain extraction is widely used in neuroimaging analyses such as multi-modality image fusion and inter-subject image comparisons [12]; examination of the progression of brain disorders such as Alzheimer's disease, multiple sclerosis and schizophrenia; monitoring the development or aging of the brain; and creating probabilistic atlases from large groups of subjects. Numerous automated skull-stripping methods have been proposed [13-18].

The rest of this paper is organised as follows: in the next section we describe our proposed method for brain extraction from 2D MRI slices as a pre-processing procedure; in section 3 the standard fuzzy c-means clustering algorithm is sketched. Histogram-based centroid initialization is presented in section 4. The global proposed segmentation method is presented in section 5. In section 6 we present different results obtained with this method. Final conclusions and future work are discussed in section 7.

2. Pre-processing.

2.1. Filtering.

This pre-processing stage performs a non-linear mapping of the grey-level dynamics of the image. The transform consists in the application of a 3x3 median filter. The use of median filtering derives from the nature of the noise distribution in MR images: the main source of noise in this kind of image is small density variations inside a single tissue, which tend to locally modify the RF emission of the atomic nuclei during the imaging process.

2.2. Brain Extraction: Threshold Morphologic Brain Extraction (TMBE).

The goal of this phase is to extract the brain from the acquired image: this will allow us to simplify the segmentation of the brain tissues. Our simple and effective method can be divided into five steps:

2.2.1 Thresholding.

This step is based on global binary image thresholding using Otsu's method [19]. Figure 2-b shows a result of this operation.

2.2.2 Greatest Connected Component Extraction.

A survey based on a statistical analysis of the existing connected components of the dilated image allows extracting the region whose area is the biggest. Figure 2-c shows a result of this operation.

2.2.3 Filling the holes.

The remaining holes in the binary image obtained in step 2, containing the greatest connected component, are filled using a morphologic hole-filling operation. A hole is a set of background pixels within a connected component. The result of this operation is shown in figure 2-d.

2.2.4. Dilatation.

This morphologic operation consists of eliminating all remaining black spots on the white surface of the image. These spots are covered by the dilatation of the white parts. This is carried out by moving a square structuring element of size SxS over the binary image and applying a logical OR operator over each of the (S^2 - 1) neighbouring pixels (figure 2-e). In this paper we take S = 3.

2.2.5 ROI Extracting.

The region of interest is the brain tissue. To extract this region we use the AND operator between the original filtered image and the binary mask obtained in the last step. The non-brain region is obtained by applying the AND operator between the image in figure 2-a and the logical complement of the mask image in figure 2-e.
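The five TMBE steps can be sketched in pure Python on a toy intensity grid. This is an illustrative reading of the pipeline only: the function names and the exhaustive Otsu search are ours, and a real implementation would operate on full MR slices with an optimised image library.

```python
from collections import deque

def otsu_threshold(img):
    """Exhaustive Otsu: pick the threshold maximising between-class variance."""
    flat = [p for row in img for p in row]
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        g0 = [p for p in flat if p < t]
        g1 = [p for p in flat if p >= t]
        if not g0 or not g1:
            continue
        w0, w1 = len(g0) / len(flat), len(g1) / len(flat)
        m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def largest_component(mask):
    """Keep only the biggest 4-connected foreground component."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                comp, q = [], deque([(y, x)])
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    out = [[0] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = 1
    return out

def fill_holes(mask):
    """A hole is background not reachable from the border (flood fill)."""
    h, w = len(mask), len(mask[0])
    outside = [[False] * w for _ in range(h)]
    q = deque((y, x) for y in range(h) for x in range(w)
              if (y in (0, h - 1) or x in (0, w - 1)) and not mask[y][x])
    for y, x in q:
        outside[y][x] = True
    while q:
        cy, cx = q.popleft()
        for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny][nx] and not outside[ny][nx]:
                outside[ny][nx] = True
                q.append((ny, nx))
    return [[1 if mask[y][x] or not outside[y][x] else 0 for x in range(w)]
            for y in range(h)]

def dilate(mask, s=3):
    """Logical OR over an SxS structuring element (S = 3 as in the paper)."""
    h, w, r = len(mask), len(mask[0]), s // 2
    return [[1 if any(mask[ny][nx]
                      for ny in range(max(0, y - r), min(h, y + r + 1))
                      for nx in range(max(0, x - r), min(w, x + r + 1)))
             else 0 for x in range(w)] for y in range(h)]

def extract_brain(img):
    """Threshold -> largest component -> fill holes -> dilate -> AND with image."""
    t = otsu_threshold(img)
    mask = [[1 if p >= t else 0 for p in row] for row in img]
    mask = fill_holes(largest_component(mask))
    mask = dilate(mask, 3)
    return [[img[y][x] if mask[y][x] else 0 for x in range(len(img[0]))]
            for y in range(len(img))]
```

For the non-brain region, the paper applies the same AND with the logical complement of the final mask.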
Figure. 2: Brain extraction steps on axial slice number 84/181 in the simulated data volume [21] with 5% uniform noise (panels a-h).
3. Standard FCM algorithm.

The fuzzy c-means (FCM) clustering algorithm was first introduced by Dunn [20] and later extended by Bezdek [8]. FCM is a clustering technique that employs fuzzy partitioning, such that a data point can belong to all classes with different membership grades between 0 and 1. The aim of FCM is to find C cluster centers (centroids) in the data set X = {x_1, x_2, ..., x_N} in R^p that minimize the following dissimilarity function:

J_FCM = \sum_{i=1}^{C} J_i = \sum_{i=1}^{C} \sum_{j=1}^{N} u_{ij}^m d^2(V_i, x_j)    (1)

where:
u_{ij}: membership of data point x_j in the cluster V_i;
V_i: centroid of cluster i;
d(V_i, x_j): Euclidean distance between the ith centroid V_i and the jth data point x_j;
m in [1, infinity): fuzzy weighting exponent (generally equal to 2);
N: number of data points;
C: number of clusters, 2 <= C < N;
p: number of features in each data point;

with the constraints:

u_{ij} \in [0, 1], \forall i, j    (2a)

\sum_{i=1}^{C} u_{ij} = 1, j = 1, ..., N    (2b)

0 < \sum_{j=1}^{N} u_{ij} < N, i = 1, ..., C    (2c)

To reach a minimum of the dissimilarity function, two conditions must hold:

V_i = \frac{\sum_{j=1}^{N} u_{ij}^m x_j}{\sum_{j=1}^{N} u_{ij}^m}    (3)

u_{ij} = \frac{1}{\sum_{k=1}^{C} (d_{ij} / d_{kj})^{2/(m-1)}}    (4)

The iterative algorithm consists of the following steps.

Step 0. Randomly initialize the membership matrix U according to the constraints of Equations 2a, 2b and 2c; choose the fuzzification parameter m (1 < m < infinity), the number of clusters C, the initial cluster centers V^(0) and a threshold epsilon > 0.
At iteration Ni:
Step 1. Calculate the centroid vector V^(Ni) using Equation (3).
Step 2. Compute the dissimilarity function J^(Ni) using Equation (1). If its improvement over the previous iteration is below the threshold epsilon, go to Step 4.
Step 3. Compute a new membership matrix U^(Ni) using Equation (4). Go to Step 1.
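The loop above can be sketched in pure Python on 1-D grey-level data. The function below is our illustration of Equations (1)-(4), not the authors' implementation; a real segmentation would run it on the grey levels of the extracted brain region.

```python
import random

def fcm(data, c, m=2.0, eps=1e-4, max_iter=100, seed=0, init=None):
    """Fuzzy c-means: alternate the membership update (Eq. 4) and the
    centroid update (Eq. 3) until the objective J (Eq. 1) improves by
    less than eps. Returns (sorted centroids, iterations used)."""
    rng = random.Random(seed)
    n = len(data)
    v = list(init) if init is not None else [rng.choice(data) for _ in range(c)]
    u = [[0.0] * n for _ in range(c)]
    prev_j = None
    for it in range(max_iter):
        # Equation (4): memberships from distances (guard against d = 0)
        for j, x in enumerate(data):
            d = [abs(x - v[i]) or 1e-12 for i in range(c)]
            for i in range(c):
                u[i][j] = 1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c))
        # Equation (3): centroids as membership-weighted means
        for i in range(c):
            den = sum(u[i][j] ** m for j in range(n))
            v[i] = sum((u[i][j] ** m) * data[j] for j in range(n)) / den
        # Equation (1): objective function
        jval = sum((u[i][j] ** m) * (data[j] - v[i]) ** 2
                   for i in range(c) for j in range(n))
        if prev_j is not None and abs(prev_j - jval) < eps:
            break
        prev_j = jval
    return sorted(v), it + 1
```

With well-separated clusters and reasonable initial centroids, the loop settles close to the per-cluster means in a handful of iterations, which is exactly the behaviour the paper exploits with its histogram-based initialization.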
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 218
Figure. 3: Histogram analysis for centroid initialisation. a) Histogram of the image in figure 4-a, b) Smoothed histogram, c) Histogram of the extracted brain tissues in the image of figure 4-d, d) Smoothed histogram.
Figure. 4: Example of segmentation results. a)-d) Results of the proposed brain extraction method; e) Segmented image by the proposed method; f) Cerebrospinal fluid (CSF); g) Gray matter (GM); h) White matter (WM); k) Ground truth image; l), m) and n) Manual segmentation of the same brain tissues (BrainWeb).
[Figure 5: evolution of the cluster centroids (CSF, GM, WM) in panels a)-b), and of the objective function (ObjFcn) and error (Err) in panels c)-d), versus the iteration number Ni; the axes run to Ni = 6.]
The effectiveness of the method was tested on simulated MR images, for which the true clusters (ground truth) are known. Figure 3 shows the results of the histogram analysis leading to a centroid initialisation of the extracted region of interest, consisting of the brain tissues that we want to segment. It concerns T1-weighted slice number 120/181 in the sagittal direction of the Talairach stereotaxic reference (volume of 181x217x181 voxels [21]). Figure 4 shows an example of qualitative evaluation of our segmentation results against the manual segmentation results provided by the web site [21] for the same slice described above.

The segmentation aims to divide the image into three clusters: white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF). The background pixels are removed from the image by thresholding (binarisation) before the clustering starts. In addition, the T1-weighted modality, which belongs to the fastest MRI modalities available, is often preferred, since it offers a good contrast between gray matter (GM) and white cerebral matter (WM) as well as between GM and cerebrospinal fluid (CSF). The advantage of using digitally simulated images rather than real image data for validating segmentation methods is that they include prior knowledge of the true tissue types.

Comparison between figure 5 and figure 6 shows the effectiveness of the proposed method. Indeed, figure 5 shows that when adequate initial centroids are given, the iterative clustering algorithm converges rapidly toward the effective clusters in the image (65, 124 and 166), in approximately 6 iterations and with smooth curves; but when the initialisation is made far from the adequate values of the desired clusters, the convergence of the clustering algorithm is very slow, as shown in figure 6 (approximately 21 iterations). This gives a gain of approximately 70% in processing time.
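The histogram analysis of Figure 3 can be sketched as follows: smooth the grey-level histogram, then take its strongest local maxima as initial centroids. The moving-average smoother and the peak picker below are our simplified stand-ins for whatever smoothing the authors actually used.

```python
def smooth(hist, w=5):
    """Moving-average smoothing of a grey-level histogram (window width w)."""
    r = w // 2
    out = []
    for i in range(len(hist)):
        win = hist[max(0, i - r): i + r + 1]
        out.append(sum(win) / len(win))
    return out

def histogram_peaks(hist, n_peaks=3, w=5):
    """Grey levels of the n_peaks highest local maxima of the smoothed
    histogram, sorted ascending, for use as initial FCM centroids."""
    s = smooth(hist, w)
    peaks = [i for i in range(1, len(s) - 1) if s[i - 1] < s[i] >= s[i + 1]]
    peaks.sort(key=lambda i: s[i], reverse=True)
    return sorted(peaks[:n_peaks])
```

On a T1 slice such as the one above, the three recovered peaks would play the role of the initial CSF, GM and WM centroids (65, 124 and 166 in the paper's example).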
[Figure 6: evolution of the cluster centroids (CSF, GM, WM) and of the objective function (ObjFcn) and error (Err) versus the iteration number Ni, for a poor initialisation; the axes run to about Ni = 25.]
[2] G. Dong, M. Xie, Color clustering and learning for image segmentation based on neural networks, IEEE Transactions on Neural Networks 16(4), 2005, pp. 925-936.
[3] R.M. Haralick, L.G. Shapiro, Image segmentation techniques, Computer Vision, Graphics and Image Processing 29(1), 1985, pp. 100-132.
[4] N.R. Pal, S.K. Pal, A review on image segmentation techniques, Pattern Recognition 26(9), 1993, pp. 1277-1294.
[5] D.L. Pham, C. Xu, J.L. Prince, Current methods in medical image segmentation, Annual Review of Biomedical Engineering 2(1), 2000, pp. 315-338.
[6] Weina Wang, Yunjie Zhang, Yi Li, Xiaona Zhang, The global fuzzy c-means clustering algorithm, In Proceedings of the World Congress on Intelligent Control and Automation, Vol. 1, 2006, pp. 3604-3607.
[7] L.A. Zadeh, Fuzzy sets, Information and Control, Vol. 8, 1965, pp. 338-353.
[8] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.
[9] J.C. Bezdek, L.O. Hall, L.P. Clarke, Review of MR image segmentation techniques using pattern recognition, Medical Physics 20(4), 1993, pp. 1033-1048.
[10] N. Ferahta, A. Moussaoui, K. Benmahammed, V. Chen, New fuzzy clustering algorithm applied to RMN image segmentation, International Journal of Soft Computing 1(2), 2006, pp. 137-142.
[11] B. Cherradi and O. Bouattane, Fast fuzzy segmentation method of noisy MRI images including spatial information, In the proceedings of ICTIS'07, IEEE Morocco section, ISBN 9954-8577-0-2, Fez, 3-5 March 2007, pp. 461-464.
[12] R. P. Woods, M. Dapretto, N. L. Sicotte, A. W. Toga, and J. C. Mazziotta, Creation and use of a Talairach-compatible atlas for accurate, automated, nonlinear intersubject registration, and analysis of functional imaging data, Human Brain Mapping, vol. 8, pp. 73-79, 1999.
[13] A.M. Dale, B. Fischl, and M.I. Sereno, Cortical surface-based analysis. Segmentation and surface reconstruction, NeuroImage, vol. 9, pp. 179-194, 1999.
[14] H. Hahn and H.-O. Peitgen, The skull stripping problem in MRI solved by a single 3D watershed transform, MICCAI, vol. 1935, pp. 134-143, 2000.
[15] S. Sandor and R. Leahy, Surface-based labeling of cortical anatomy using a deformable database, IEEE Transactions on Medical Imaging, vol. 16, pp. 41-54, 1997.
[16] F. Segonne, A. M. Dale, E. Busa, M. Glessner, D. Salat, H. K. Hahn, and B. Fischl, A hybrid approach to the skull stripping problem in MRI, NeuroImage, vol. 22, pp. 1060-1075, 2004.
[17] S. M. Smith, Fast robust automated brain extraction, Human Brain Mapping, vol. 17, pp. 143-155, 2002.
[18] D.W. Shattuck, S.R. Sandor-Leahy, K.A. Schaper, D.A. Rottenberg, R.M. Leahy, Magnetic resonance image tissue classification using a partial volume model, NeuroImage 13(5), 2001, pp. 856-876.
[19] N. Otsu, A Threshold Selection Method from Gray-Level Histograms, IEEE Transactions on Systems, Man, and Cybernetics, Vol. 9, No. 1, 1979, pp. 62-66.
[20] J.C. Dunn, A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, Journal of Cybernetics 3(3), 1973, pp. 32-57.
[21] http://www.bic.mni.mcgill.ca/brainweb

Bouchaib CHERRADI was born in 1970 at El Jadida, Morocco. He received the B.S. degree in Electronics in 1990 and the M.S. degree in Applied Electronics in 1994 from the ENSET Institute, Mohammedia, Morocco. He received the DESA diploma in Instrumentation of Measure and Control from the University of El Jadida in 2004. He is now a Ph.D. student in the MCM&SCP laboratory, Faculty of Science and Technology, Mohammedia. His research interests include Massively Parallel Architectures, Cluster Analysis, Pattern Recognition, Image Processing and Fuzzy Logic.

Omar BOUATTANE was born in 1962 at Figuig, in the south of Morocco. He received his Ph.D. degree in 2001 in Parallel Image Processing on a Reconfigurable Computing Mesh from the Faculty of Science Ain Chock, Casablanca. He has published more than 30 research publications in various national and international conference proceedings and journals. His research interests include Massively Parallel Architectures, cluster analysis, pattern recognition, image processing and fuzzy logic.

Mohamed YOUSSFI was born in 1970 at Ouarzazate, Morocco. He is now a teacher of computer science and researcher at the University Hassan II Mohammedia, ENSET Institute. His research is focused on parallel and distributed computing technologies, Grid Computing and Middleware. He received the B.S. degree in Mechanics in 1989 and the M.S. degree in Applied Mechanics in 1993 from the ENSET Institute, Mohammedia, Morocco. He received the DEA diploma in Numerical Analysis from the University of Rabat in 1994. He received the Doctorate diploma in Computing and Numerical Analysis from the University Mohammed V of Rabat in 1996.

Abdelhadi RAIHANI was born in 1968 at El Jadida, Morocco. He is now a teacher of Electronics and researcher at the ENSET Institute. His research is focused on parallel architectures and associated processing. Recently, he worked on Wind Energy. He received the B.S. degree in Electronics in 1987 and the M.S. degree in Applied Electronics in 1991 from the ENSET Institute, Mohammedia, Morocco. He received the DEA diploma in information processing from the Ben Msik University of Casablanca in 1994. He received the Doctorate diploma in Application of Parallel Architectures in
2 Faculty of Business, Lebanese University, Lebanon
Abstract

Round Robin, considered the most widely adopted CPU scheduling algorithm, suffers from severe problems directly related to quantum size. If the chosen time quantum is too large, the response time of the processes is too high; on the other hand, if it is too small, the CPU overhead increases.

In this paper, we propose a new algorithm, called AN, based on a new approach called dynamic-time-quantum; the idea of this approach is to make the operating system adjust the time quantum according to the burst times of the set of processes waiting in the ready queue.

Based on the simulations and experiments, we show that the new proposed algorithm solves the fixed time quantum problem and increases the performance of Round Robin.

Keywords: Operating Systems, Multi Tasking, Scheduling Algorithm, Time Quantum, Round Robin.

The dispatcher is the module that gives control of the CPU to the process selected by the short-term scheduler [8]. ... priority process arrives. Non-preemptive algorithms are used where the process runs to complete its burst time even if a higher priority process arrives during its execution time.

First-Come-First-Served (FCFS) [8, 9] is the simplest scheduling algorithm: it simply queues processes in the order that they arrive in the ready queue, and processes are dispatched according to their arrival time. Being a non-preemptive discipline, once a process has the CPU, it runs to completion. FCFS scheduling is fair in the formal or human sense of fairness, but it is unfair in the sense that long jobs make short jobs wait and unimportant jobs make important jobs wait [8, 9].

Shortest Job First (SJF) [8, 9] is the strategy of arranging processes so that the one with the least estimated processing time remaining is next in the queue. It works under two schemes (preemptive and non-preemptive). It is provably optimal, since it minimizes the average turnaround time and the average waiting time. The main problem with this discipline is the necessity of prior knowledge of the time required for a process to complete. It also suffers from starvation, especially in a busy system with many small processes being run [8, 9].

Round Robin (RR) [8, 9], which is the main concern of this research, is one of the oldest, simplest, fairest and most widely used scheduling algorithms, designed especially for time-sharing systems. It is designed to give better responsiveness, but the worst turnaround and waiting times, due to the fixed time quantum concept. The scheduler assigns a fixed time unit (quantum) per process, usually 10-100 milliseconds, and cycles through the processes. RR is similar to FCFS except that preemption is added to switch between processes [2, 3, and 8].

In this paper, we propose a new algorithm to solve the constant time quantum problem. The algorithm is based on a dynamic time quantum approach in which the system adjusts the time quantum according to the burst times of the processes found in the ready queue. The second section states some of the previous works done in this field. Section III describes the proposed method in detail. Section IV discusses the simulation of this method, before the paper is concluded in the last section.

2. Previous works

Round Robin became one of the most widely used scheduling disciplines despite its severe problem, which arose from the concept of a fixed pre-determined time quantum [2, 3, 4, 5, 6, and 7]. Since RR is used in almost every operating system (Windows, BSD, UNIX and Unix-based, etc.), many researchers have tried to fill this gap, but still much less than needed.

Matarneh [2] found that an optimal time quantum can be calculated as the median of the burst times of the set of processes in the ready queue, unless this median is less than 25 ms; in that case, the quantum value must be set to 25 ms to avoid the overhead of context switch time [2]. Other works [7] have also used the median approach and have obtained good results.

Helmy et al. [3] propose a new weighting technique for the Round-Robin CPU scheduling algorithm, as an attempt to combine the low scheduling overhead of round robin algorithms with favouring short jobs. Higher process weights mean a relatively higher time quantum; shorter jobs will be given more time, so that they will be removed earlier from the ready queue [3]. Other works have used mathematical approaches, giving new procedures using mathematical theorems [4].

Mohanty and others also developed other algorithms in order to improve the performance of scheduling algorithms [5], [6] and [7]. One of them is constructed as a combination of a priority algorithm and RR [5], while another is much like a combination of SJF and RR [6].

3. AN Algorithm

In this paper, we present a solution to the time quantum problem by making the operating system adjust the time quantum according to the burst times of the existing set of processes in the ready queue.

3.1 Methodology

When the operating system is installed for the first time, it begins with a time quantum equal to the burst time of the first dispatched process, which is subject to change after the end of the first time quantum. So, we assume that the system will immediately take advantage of this method. The determined time quantum represents a real and optimal value, because it is based on real burst times, unlike the other methods, which depend on a fixed time quantum value. Repeatedly, when a new process is loaded into the ready queue in order to be executed, the operating system calculates the average of the sum of the burst times of the processes found in the ready queue, including the newly arrived process.

This method needs two registers to be identified:
- SR: Register to store the sum of the remaining burst times in the ready queue.
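As a rough sketch of the dynamic-quantum idea, the simulator below compares classic fixed-quantum RR with a variant that recomputes the quantum at the start of each round as the mean of the remaining burst times in the ready queue. This is our simplified reading (all processes arrive at t = 0; context switches are counted as dispatches minus one), not the authors' exact AN implementation, so it reproduces the paper's fixed-quantum figures but not necessarily the AN column.

```python
def rr_simulate(bursts, quantum=None):
    """Round-robin simulation, all processes arriving at t = 0.
    quantum=None -> dynamic quantum: recomputed at the start of each
    round as the mean of the remaining burst times in the ready queue.
    Returns (mean turnaround time, mean waiting time, context switches)."""
    remaining = list(bursts)
    finish = [0.0] * len(bursts)
    ready = list(range(len(bursts)))
    t, dispatches = 0.0, 0
    while ready:
        q = quantum if quantum is not None else \
            sum(remaining[i] for i in ready) / len(ready)
        nxt = []
        for i in ready:                 # one round over the ready queue
            dispatches += 1
            run = min(q, remaining[i])
            t += run
            remaining[i] -= run
            if remaining[i] > 1e-9:
                nxt.append(i)           # preempted, back in the queue
            else:
                finish[i] = t           # terminated during this round
        ready = nxt
    n = len(bursts)
    tat = finish                        # arrival time is 0 for everyone
    wt = [tat[i] - bursts[i] for i in range(n)]
    return sum(tat) / n, sum(wt) / n, dispatches - 1
```

On the Case 2 workload of the results section (burst times 10, 14, 70, 120) this reproduces the fixed-quantum column of the comparison table (TAT 100.5, WT 47, 11 context switches) and shows the mean-based dynamic quantum cutting both the average times and the number of switches.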
These figures clearly show that for all the tested cases, we obtain better results (lower TAT and WT) when using the AN algorithm.

Table 1: Improvement percentage of AN
TQ      %I(wt[TQ])   %I(tat[TQ])
10ms    20.1162      20.1162
15ms    16.1163      16.1162
20ms    13.8562      13.8562
25ms    12.6113      12.6112
30ms    10.4413      10.4412

4.3 Success in Statistics

In addition to the improvement measure (%I), we added another measure of success over failure, calculated as the percentage of successful samples over the failed ones. A successful sample is a sample where the vertex of the AN algorithm is less than the vertex of RR:
S = (number of successful samples) / (total number of samples).
We obtained the following results (Table 2).

Table 2: Success over failure percentage of AN
TQ      %S(tat[TQ])  %S(wt[TQ])
10ms    96%          96%
15ms    92%          90%
20ms    90%          88%
25ms    88%          88%
30ms    86%          84%

4.4 Improvement in Context Switches

As a result of our observations, 50% of the processes will be terminated during the first round and, as the time quantum is calculated repeatedly for each round, 50% of the remaining processes will be terminated during the second round, and in the same manner for the third round, fourth round, etc.; i.e., the maximum number of rounds will be less than or equal to 6, whatever the number of processes or their burst times (fig. 4) [2].

Case 2: Assume four processes arrived at time = 0, with burst times (P1 = 10, P2 = 14, P3 = 70, P4 = 120):

                   Fixed          Dynamic      AN
                   Quantum=20ms   method [2]
Turn-around time   100.5          96           85.5
Waiting time       47             42.5         32
Context switch     11             6            5

The significant decrease of the number of processes per round will inevitably lead to a significant reduction in the number of context switches, which may otherwise pose a high overhead on the operating system. The number of context switches can be represented mathematically as follows:

Q_T = \sum_{r=1}^{R} (K_r - 1)

where:
Q_T = the total number of context switches;
R = the total number of rounds (r = 1, 2, ..., 6);
K_r = the total number of processes in round r.

In other variants of the round robin scheduling algorithm, a context switch occurs even if there is only a single process in the ready queue: the operating system assigns the process a specific time quantum Q [4], and when the time quantum expires, the process is interrupted and again assigned the same time quantum Q, regardless of whether the process is alone in the ready queue or not [2, 3]. This means that there will be additional, unnecessary context switches. This problem does not occur at all in our new proposed algorithm, because in this case the time quantum will equal the remaining burst time of the process.

5. Conclusion

Time quantum is the bottleneck facing the round robin algorithm, and the most frequently asked question was: what is the optimal time quantum to be used in the round robin algorithm? In light of the effectiveness and the efficiency of the RR algorithm, this paper provides an answer to this question by using a dynamic time quantum instead of a fixed time quantum, where the operating system itself finds the optimal time quantum without user intervention.
In this paper, we have discussed the AN algorithm, which could be a simple step towards the larger aim of obtaining an optimal scheduling algorithm. Much more effort and research will be needed to reach that goal.
References
[1] Weiming Tong, Jing Zhao, Quantum Varying Deficit Round
Robin Scheduling Over Priority Queues, International
Conference on Computational Intelligence and Security. pp.
252- 256, China, 2007.
[2] Rami J. Matarneh, Self-Adjustment Time Quantum in
Round Robin Algorithm Depending on Burst Time of the
Now Running Processes, American Journal of Applied
Sciences, Vol 6, No. 10, 2009.
[3] Tarek Helmy, Abdelkader Dekdouk, Burst Round Robin as a
Proportional-Share Scheduling Algorithm, In Proceedings
of The fourth IEEE-GCC Conference on Towards Techno-
Industrial Innovations, pp. 424-428, Bahrain, 2007.
[4] Samih M. Mostafa, S. Z. Rida, Safwat H. Hamad, Finding
Time Quantum Of Round Robin Cpu Scheduling Algorithm
In General Computing Systems Using Integer Programming,
International Journal of Research and Reviews in Applied
Sciences (IJRRAS), Vol 5, Issue 1, 2010.
[5] Rakesh Mohanty, H. S. Behera, Khusbu Patwari, Monisha
Dash, M. Lakshmi Prasanna, Priority Based Dynamic
Round Robin (PBDRR) Algorithm with Intelligent Time
Slice for Soft Real Time Systems, (IJACSA) International
Journal of Advanced Computer Science and Applications,
Vol. 2, No.2, February 2011.
[6] Rakesh Mohanty, H. S. Behera, Khusbu Patwari, Monisha
Dash, Design and Performance Evaluation of a New
Proposed Shortest Remaining Burst Round Robin (SRBRR)
Scheduling Algorithm, In Proceedings of International
Symposium on Computer Engineering & Technology
(ISCET), Vol 17, 2010.
[7] Rakesh Mohanty, H. S. Behera, Debashree Nayak, A New
Proposed Dynamic Quantum with Re-Adjusted Round Robin
Scheduling Algorithm and Its Performance Analysis,
International Journal of Computer Applications (0975-8887),
Volume 5, No. 5, August 2010.
[8] Silberschatz, Galvin and Gagne, Operating System Concepts,
8th edition, Wiley, 2009.
[9] Lingyun Yang, Jennifer M. Schopf and Ian Foster,
Conservative Scheduling: Using predictive variance to
improve scheduling decisions in Dynamic Environments,
SuperComputing 2003, November 15-21, Phoenix, AZ, USA.
Finally we present the evaluation results and the main conclusions.

[Fig. 1. User access scenario based on speech and face information: face identification and speaker identification each produce a matching score, and a fusion module combines the two scores into an accept/reject decision.]

2. Face Recognition

This paper uses a hybrid method combining principal components analysis (PCA) [11] and the discrete cosine transform (DCT) [12] for face identification [13]. We use PCA with coefficient vectors instead of pixel vectors. We notice that this technique requires more time than PCA alone (because of the calculation of the coefficients), in particular with databases of average or reduced size; but it should be noted that it requires less memory, which makes its use advantageous with large databases.

[Fig. 2. The two phases of the face recognition system: a training phase (extraction and saving of information from each image of the training database) and an identification phase (detection and normalisation of the input image, calculation of a metric distance D(Pi, P1), D(Pi, P2), ..., D(Pi, Pm) to the stored models, and selection of the best score).]

2.2 Experimental Results

The tests were performed using the image databases ORL, Yale Faces and BBAFaces. The latter was created at the University Center of Bordj Bou Arreridj in 2008. It is composed of 23 people with 12 images for each of them (for the majority of the people, the images were taken during various sessions). The images reflect various facial expressions with different intensity variations and different light sources. To facilitate the tests, the faces were then selected manually in order to get images of 124 x 92 pixels; we then convert them into gray levels and store them in JPG format. Fig. 3 represents a typical example of the data. It should be noted that certain categories of this data are not retained for the tests.

Fig. 3. Example from BBAFaces. (a): normal, (b): happy, (c): glasses, (d): sad, (e): sleepy, (f): surprised, (g): wink, (h): dark, (i): top light, (j): bottom light, (k): left light, (l): right light.

In the following we expose the results obtained for the tests realized with Yale Faces and BBAFaces.

Table 1: Rates of Recognition
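To illustrate the DCT half of the hybrid method, the sketch below computes a direct 2-D DCT-II, keeps the low-frequency coefficients as a feature vector, and matches by nearest Euclidean distance. The block size, the exact coefficient selection and the PCA stage applied on top of these coefficients in the paper are not reproduced here; this is only an assumed, minimal illustration.

```python
import math

def dct2(block):
    """Direct 2-D DCT-II of a square block (O(n^4); fine for a sketch)."""
    n = len(block)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def low_freq_features(img, k=3):
    """Keep the k x k top-left (lowest-frequency) DCT coefficients."""
    c = dct2(img)
    return [c[u][v] for u in range(k) for v in range(k)]

def nearest(probe, gallery):
    """Identify by smallest Euclidean distance in feature space."""
    d = [sum((a - b) ** 2 for a, b in zip(probe, g)) for g in gallery]
    return d.index(min(d))
```

The low-frequency coefficients concentrate most of the image energy, which is why they make compact face descriptors; in the paper's hybrid scheme, PCA is then applied to these coefficient vectors rather than to raw pixels.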
3. Speaker Recognition System

Nowadays the automatic treatment of speech is progressing, in particular in the fields of Automatic Speech Recognition (ASR) and speech synthesis. Automatic speaker recognition is a particular pattern recognition task. It covers the problems relating to speaker identification or verification using information found in the acoustic signal:

- Systems with free text ("free-text"): the speaker is free to pronounce whatever he wants. In this mode, the training and test sentences are different.
- Systems with suggested text ("text-prompted"): a text, different for each session and each person, is imposed on the speaker and determined by the machine. The training and test sentences can be different.
- Systems dependent on the vocabulary ("vocabulary-dependent"): the speaker pronounces a sequence of words drawn from a limited vocabulary. In this mode, training and test are carried out on texts built from the same vocabulary.
- Personalized text-dependent systems ("user-specific text-dependent"): each speaker has his own password. In this mode, training and test are carried out on the same text.

The vocal message makes the task of ASR systems easier and the performances are better. Recognition in text-independent mode requires more time than in text-dependent mode [17].

... or identification in an open set, to which the speaker to be identified does not necessarily belong [16].

[Figure: Automatic Speaker Identification — the input speech is matched against reference speaker models (Speaker 1, ...).]

... constitutes the state of the art in ASR. The decision of an automatic speaker recognition system is based on the two processes of speaker identification and/or verification, whatever the application or task concerned.

4. Performance of Biometric Systems

The most significant and decisive argument which makes the difference between one biometric system and another is its error rate; a system is considered ideal if:

False Rejection Rate = False Acceptance Rate = 0.

P(FA1) = 0.1, P(FR1) = 0.6.
In the speaker recognition system we obtained:
P(FA2) = 0.3, P(FR2) = 0.2.
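At the decision level, and assuming the two modalities make independent errors (an assumption of this sketch, not something the paper states), the fused error rates follow from the product rule: AND fusion (accept only when both modalities accept) shrinks the false-acceptance rate but inflates false rejections, while OR fusion does the opposite.

```python
def fuse_and(fa1, fr1, fa2, fr2):
    """Accept only if both modalities accept (independent errors assumed)."""
    fa = fa1 * fa2                       # both must falsely accept
    fr = 1 - (1 - fr1) * (1 - fr2)       # rejected if either falsely rejects
    return fa, fr

def fuse_or(fa1, fr1, fa2, fr2):
    """Accept if either modality accepts (independent errors assumed)."""
    fa = 1 - (1 - fa1) * (1 - fa2)       # accepted if either falsely accepts
    fr = fr1 * fr2                       # both must falsely reject
    return fa, fr

# Rates quoted above: face P(FA1)=0.1, P(FR1)=0.6; speaker P(FA2)=0.3, P(FR2)=0.2
print(fuse_and(0.1, 0.6, 0.3, 0.2))      # lower FA, higher FR
print(fuse_or(0.1, 0.6, 0.3, 0.2))       # higher FA, lower FR
```

With these numbers, OR fusion drives the false-rejection rate well below either single modality, which matches the paper's observation that the fused system outperforms the individual biometrics.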
[Screenshots: 1. Main interface; 3. Acquisition module for speaker; 4. Verification process; 5. Identification process.]
7. Conclusions

This paper provides results obtained on a multi-modal biometric system that uses face and voice features for recognition purposes. We used fusion at the decision level with OR and AND operators. We showed that the resulting multi-modal system considered here provides better performance than the individual biometrics. For the near future we are collecting data corresponding to three
biometric indicators (fingerprint, face and voice) in order to conceive a better multi-modal recognition system.

Acknowledgments

Special thanks to Benterki Mebarka and Bechane Louiza for their contribution to this project. Samir Akrouf thanks the Ministry of Higher Education for the financial support of this project (project code: B*0330090009).

References
[1] A. K. Jain, R. Bolle, and S. Pankanti, Biometrics: Personal Identification in Networked Society, Boston, MA: Kluwer, 1998.
[2] A. K. Jain, S. Prabhakar, and S. Chen, Combining multiple matchers for a high security fingerprint verification system, Pattern Recognition Letters, vol. 20, pp. 1371-1379, 1999.
[3] R. Brunelli and D. Falavigna, Person identification using multiple cues, IEEE Trans. Pattern Anal. Machine Intell., vol. 17, pp. 955-966, Oct. 1995.
[4] B. Duc, G. Maitre, S. Fischer, and J. Bigun, Person authentication by fusing face and speech information, in 1st Int. Conf. Audio- and Video-Based Biometric Person Authentication (AVBPA'97), J. Bigun, G. Chollet, and G. Borgefors, Eds., Berlin, Germany: Springer-Verlag, Mar. 12-14, 1997, vol. 1206 of Lecture Notes in Computer Science, pp. 311-318.
[5] E. Bigun, J. Bigun, B. Duc, and S. Fischer, Expert conciliation for multi modal person authentication systems by Bayesian statistics, in Proc. 1st Int. Conf. Audio- and Video-Based Biometric Person Authentication (AVBPA'97), Berlin, Germany: Springer-Verlag, Lecture Notes in Computer Science, 1997, pp. 291-300.
[6] L. Hong and A. K. Jain, Integrating faces and fingerprint for personal identification, IEEE Trans. Pattern Anal. Machine Intell., vol. 20, 1997.
[7] J. Kittler, M. Hatef, R. P. W. Duin, and J. Matas, On combining classifiers, IEEE Trans. Pattern Anal. Machine Intell., vol. 20, pp. 226-239, 1998.
[8] A. K. Jain, L. Hong, and Y. Kulkarni, A multimodal biometric system using fingerprints, face and speech, in Proc. 2nd Int. Conf. Audio-Video Based Biometric Person Authentication, Washington, D.C., Mar. 22-23, 1999, pp. 182-187.
[9] T. Choudhury, B. Clarkson, T. Jebara, and A. Pentland, Multimodal person recognition using unconstrained audio and video, in Proc. 2nd Int. Conf. Audio-Video Based
[12] ... Application of the DCT Energy Histogram for Face Recognition, 2nd International Conference on Information Technology for Application (ICITA 2004), pp. 305-310.
[13] Samir Akrouf, Sehili Med Amine, Chakhchoukh Abdesslam, Messaoud Mostefai and Youssef Chahir, 2009 Fifth International Conference on MEMS, NANO and Smart Systems, 28-30 December 2009, Dubai, UAE.
[14] N. Morizet, Thomas Ea, Florence Rossant, Frédéric Amiel et Amara Amara, Revue des algorithmes PCA, LDA et EBGM utilisés en reconnaissance 2D du visage pour la biométrie, Tutoriel Reconnaissance d'images, MajecSTIC 2006, Institut Supérieur d'Electronique de Paris (ISEP).
[15] Akrouf Samir, Mehamel Abbas, Benhamouda Nacéra, Messaoud Mostefai, An Automatic Speaker Recognition System, 2009 2nd International Conference on Advanced Computer Theory and Engineering (ICACTE 2009), Cairo, Egypt, September 25-27, 2009.
[16] Approche Statistique pour la Reconnaissance Automatique du Locuteur : Informations Dynamiques et Normalisation Bayésienne des Vraisemblances, October 2000.
[17] Yacine Mami, Reconnaissance de locuteurs par localisation dans un espace de locuteurs de référence, Thèse de doctorat, soutenue le 21 octobre 2003.

Samir Akrouf was born in Bordj Bou Arreridj, Algeria in 1960. He received his Engineer degree from Constantine University, Algeria in 1984. He received his Master's degree from the University of Minnesota, USA in 1988. Currently, he is an assistant professor at the Computer Science department of Bordj Bou Arreridj University, Algeria. He is an IACSIT member and a member of the LMSE laboratory (a research laboratory at Bordj Bou Arreridj University). He is also the director of the Mathematics and Computer Science Institute of Bordj Bou Arreridj University. His main research interests are focused on Biometric Identification, Computer Vision and Computer Networks.

Yahia Belayadi was born in Bordj Bou Arreridj, Algeria in 1961. He received his Engineer degree from Setif University, Algeria in 1987. He received his Magister from Setif University, Algeria in 1991. Currently, he is an assistant professor at the Computer Science department of Bordj Bou Arreridj University, Algeria. He is also the director of the University Center of Continuous Education in Bordj Bou Arreridj.

Messaoud Mostefai was born in Bordj Bou Arreridj, Algeria in 1967. He received his Engineer degree from Algiers University, Algeria in 1990. He received a DEA degree in Automatique et Traitement Numérique du Signal (Reims, France) in 1992. He received his doctorate degree in Automatique et Traitement Numérique du Signal (Reims, France) in 1995. He obtained his HDR (Habilitation Universitaire, theme: Adéquation Algorithme/Architecture en traitement d'images) at UFAS, Algeria in 2006. Currently, he is a professor at the Computer Science department of Bordj Bou Arreridj University, Algeria. He is a member of the LMSE laboratory (a research laboratory at Bordj Bou Arreridj University). His main research interests are focused on classification and
Person Authentication, Washington, D.C., Mar. 2223, Biometric Identification, Computer Vision and Computer Networks.
1999, pp. 176180.
[10] S. Ben-Yacoub, Multimodal data fusion for person Youssef Chahir is an Associate Professor (since '00) at GREYC
authentication using SVM, in Proc. 2nd Int. Conf. Audio- Laboratory CNRS UMR 6072, Department of Computer Science,
University of Caen Lower-Normandy France.
Video Based Biometric Person Authentication, Washington,
D.C., Mar. 2223, 1999, pp. 2530.
[11] M. Turk and A. Pentland. Eigenfaces for recognition.
Journal of Cognitive Science, pages 7186, 1991.
[12] Ronny Tjahyadi, Wanquan Liu, Svetha Venkatesh.
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 237
Rafik Mahdaoui 1,2, Leila Hayet Mouss 1, Mohamed Djamel Mouss 1, Ouahiba Chouhal 1,2

1 Laboratoire d'Automatique et Productique (LAP), Université de Batna, Rue Chahid Boukhlouf, 05000 Batna, Algérie
2 Centre universitaire Khenchela, Route de Batna, BP 1252, El Houria, 40004 Khenchela, Algérie
[Figure 1: detection, diagnosis and prognosis, the phenomenological aspect. Time axes run from causes to effects (prognosis), with increasing cost and accuracy.]

The definition of failure is simply that a failure occurs when the fault reaches a predetermined level. The second approach builds a model of the failure mechanism using available historical data. In this case, different definitions of failure can be given as follows: (a) an event such that the machine is operating at an unsatisfactory level; (b) a functional failure, when the machine cannot perform its intended function at all; or (c) just a breakdown, when the machine …

Several of the existing approaches used ANNs to model the systems and estimate the RUL. Zhang and Ganesan [14] used self-organizing neural networks for multivariable trending of the fault development to estimate the residual life of a bearing system. Wang and Vachtsevanos [13] proposed an architecture for prognosis applied to industrial chillers. Their prognostic model included dynamic wavelet neural networks, reinforcement learning, and genetic algorithms. This model was used to predict the failure growth of bearings based on vibration signals. SOM and back-propagation neural network (BPNN) methods using vibration signals to predict the RUL of ball bearings were applied by Huang et al. [12].
… neural network for predicting the machine condition trend. Dong et al. [16] employed a grey model and a BPNN to predict the machine condition. Altogether, data-driven techniques are promising and effective techniques for machine condition prognosis.

2. Temporal Neuro-Fuzzy Systems

The fuzzy neural network (FNN) approach has become a powerful tool for solving real-world problems in the areas of forecasting, identification, control, image recognition and others that are associated with a high level of uncertainty [2,7,10,11,14,23,24].

The Neuro-fuzzy model combines, in a single framework, both numerical and symbolic knowledge about the process. Automatic linguistic rule extraction is a useful aspect of NF, especially when little or no prior knowledge about the process is available [3]. For example, a NF model of a non-linear dynamical system can be identified from empirical data. This model can give us some insight into the nonlinearity and dynamical properties of the system.

The most common NF systems are based on two types of fuzzy models, TSK [5][7] and Mamdani, combined with NN learning algorithms. TSK models use local linear models in the consequents, which are easier to interpret and can be used for control and fault diagnosis [23]. Mamdani models use fuzzy sets as consequents and therefore give a more qualitative description. Many Neuro-fuzzy structures have been successfully applied to a wide range of applications, from industrial processes to financial systems, because of the ease of rule base design, linguistic modeling, applicability to complex and uncertain systems, inherent non-linear nature, learning abilities, parallel processing and fault-tolerance abilities. However, successful implementation depends heavily on prior knowledge of the system and on the empirical data [25].

Neuro-fuzzy networks can by their intrinsic nature handle only a limited number of inputs. When the system to be identified is complex and has a large number of inputs, the fuzzy rule base becomes large.

NF models identified from empirical data are usually not very transparent. Transparency means a more meaningful description of the process, i.e. fewer rules with appropriate membership functions. In ANFIS [2] a fixed structure with grid partition is used. Antecedent and consequent parameters are identified by a combination of least squares estimation and a gradient-based method, called the hybrid learning rule. This method is fast and easy to implement for low-dimensional input spaces. It is prone to losing transparency and local model accuracy because error back propagation is a global, not a locally nonlinear, optimization procedure. One possible method to overcome this problem is to find the antecedents and rules separately (e.g. by clustering), constrain the antecedents, and then apply optimization.

Hierarchical NF networks can be used to overcome the dimensionality problem by decomposing the system into a series of MISO and/or SISO systems called hierarchical systems [14]. The local rules use subsets of the input space and are activated by higher-level rules [12].

The criteria on which to build a NF model are based on the requirements for fault diagnosis and on the system characteristics. The function of the NF model in the FDI scheme is also important, i.e. preprocessing data, identification (residual generation) or classification (decision making/fault isolation). For example, a NF model with high approximation capability and disturbance rejection is needed for identification, so that the residuals are more accurate, whereas in the classification stage a NF network with more transparency is required.

The following characteristics of NF models are important:
- Approximation/generalisation capabilities
- Transparency: reasoning/use of prior knowledge and rules
- Training speed/processing speed
- Complexity
- Transformability: the ability to convert into other forms of NF models in order to provide different levels of transparency and approximation power
- Adaptive learning

The two most important characteristics are the generalising and reasoning capabilities. Depending on the application requirements, a compromise is usually made between the two.

In order to implement this type of Neuro-fuzzy system for fault diagnosis and prognosis, and to exploit it to diagnose a dedicated production system, we propose the data-processing software NEFDIAG (Neuro-Fuzzy Diagnosis).

The Takagi-Sugeno type fuzzy rules are discussed in detail in Subsection A. In Subsection B, the network structure of FENN is presented.
Rule r: IF x1 is A1r AND ... AND xN is ANr AND u1 is B1r AND ... AND uM is BMr,
THEN X' = Ar X + Br U

where X = [x1, x2, ..., xN]^T is the inner state vector of the nonlinear system, U = [u1, u2, ..., uM]^T is the input vector to the system, and N, M are the dimensions; Air and Bjr are linguistic terms (fuzzy sets) defining the conditions for xi and uj respectively, according to Rule r; Ar is an N x N matrix and Br is an N x M matrix.

When considered in discrete time, such as when modeling using a digital computer, we often use the discrete state-space equations instead of the continuous version. Concretely, the fuzzy rules become:

Rule r: IF x1(t) is A1r AND ... AND xN(t) is ANr AND u1(t) is B1r AND ... AND uM(t) is BMr,
THEN X(t+1) = Ar X(t) + Br U(t)

where X(t) = [x1(t), x2(t), ..., xN(t)]^T is the discrete sample of the state vector at discrete time t. Aggregating the rules with their normalized firing strengths hr gives the system state transient equation

X(t+1) = Sum_r hr [Ar X(t) + Br U(t)]    (4)

Using equation (4), the system state transient equation, we can calculate the next state of the system from the current state and input. In the following discussion we shall use the latter form of rules. In both forms, the output of the system is always defined as

Y(t) = C X(t)    (2)

2.2 The Structure of the Temporal Neuro-Fuzzy System

The main idea of this model is to combine simple feed-forward fuzzy systems into arbitrary hierarchical models. The structure of the recurrent Neuro-fuzzy system is presented in Figure 3.

[Figure 3: structure of the recurrent Neuro-fuzzy system, with a delay element feeding the state back to the input layer; legend: normal neuron, threshold neuron.]
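The discrete rule blending above can be sketched in a few lines of code. The following is a minimal illustration, not the paper's implementation: two hypothetical rules with Gaussian antecedents each propose a local linear update Ar X(t) + Br U(t), and the proposals are blended by normalized firing strengths hr. All matrices and membership parameters are invented for the example.

```python
import math

def gaussian(x, c, s):
    # Gaussian membership degree with center c and width s
    return math.exp(-((x - c) ** 2) / (2.0 * s ** 2))

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]

def tsk_step(X, U, rules):
    # firing strength of each rule: product of its antecedent memberships
    w = []
    for r in rules:
        mus = [gaussian(x, c, s) for x, (c, s) in zip(X, r["x_terms"])]
        mus += [gaussian(u, c, s) for u, (c, s) in zip(U, r["u_terms"])]
        prod = 1.0
        for m in mus:
            prod *= m
        w.append(prod)
    h = [wi / sum(w) for wi in w]            # normalization layer
    # linear system layer: X(t+1) = sum_r h_r (Ar X + Br U)
    X_next = [0.0] * len(X)
    for hr, r in zip(h, rules):
        local = vec_add(mat_vec(r["A"], X), mat_vec(r["B"], U))
        X_next = vec_add(X_next, [hr * v for v in local])
    return X_next

# Two illustrative rules for a 2-state, 1-input system (parameters invented).
rules = [
    {"x_terms": [(0.0, 1.0), (0.0, 1.0)], "u_terms": [(0.0, 1.0)],
     "A": [[0.9, 0.1], [0.0, 0.8]], "B": [[0.1], [0.2]]},
    {"x_terms": [(1.0, 1.0), (1.0, 1.0)], "u_terms": [(1.0, 1.0)],
     "A": [[0.5, 0.0], [0.1, 0.6]], "B": [[0.3], [0.1]]},
]

X_next = tsk_step([0.2, -0.1], [0.5], rules)
print(X_next)
```

With a single rule the blend reduces to that rule's local linear model, which is a quick way to sanity-check the normalization step.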
In this network, the input nodes, which accept the environment inputs, and the context nodes, which copy the value of the state-space vector from layer 6, are all at layer 1 (the Input Layer). They represent the linguistic variables known as uj and xi in the fuzzy rules. Nodes at layer 2 act as the membership functions, translating the linguistic variables from layer 1 into their membership degrees. Since there may exist several terms for one linguistic variable, one node in layer 1 may have links to several nodes in layer 2, which are accordingly named term nodes. The number of nodes in the Rule Layer (layer 3) and the number of fuzzy rules are the same: each node represents one fuzzy rule and calculates the firing strength of the rule using the membership degrees from layer 2. The connections between layer 2 and layer 3 correspond with the antecedents of the fuzzy rules. Layer 4, the Normalization Layer, simply normalizes the firing strengths. Then, with the normalized firing strengths hr, the rules are combined at layer 5, the Parameter Layer, where A and B become available. In the Linear System Layer, the 6th layer, the current state vector X(t) and input vector U(t) are used to get the next state X(t+1), which is also fed back to the context nodes for fuzzy inference at time (t+1). The last layer is the Output Layer, which multiplies X(t+1) by C to get Y(t+1) and outputs it.

Next we shall describe the feed-forward procedure of the TNFS by giving the detailed node functions of each layer, taking one node per layer as an example. We shall use notations like ui[k] to denote the i-th input to a node in layer k, and o[k] the output of a node in layer k. Another issue to mention here is the initial values of the context nodes. Since the TNFS is a recurrent network, the initial values are essential to the temporal output of the network. Usually they are preset to 0 (zero-state), but a non-zero initial state is also needed in some particular cases.

Layer 2. There is only one input to each node at layer 2. The Gaussian function is adopted here as the membership function:

o[2] = exp( -(u[2] - cr)^2 / sr^2 )

where cr and sr give the center (mean) and width (variation) of the corresponding linguistic term of input u[2] in Rule r.

Layer 5. This layer has several nodes: one for figuring matrix A and another for B. Though we could use many nodes to represent the components of A and B separately, it is more convenient to use matrices. So, with a little specialty, its weights on the links from layer 4 are the matrices Ar (to the node for A) and Br (to the node for B). It is also fully connected with the previous layer. The functions of the nodes for A and B are, respectively,

A = Sum_r hr Ar ,   B = Sum_r hr Br    (6)

Layer 6. The Linear System Layer has only one node, which has all the outputs of layer 1 and layer 5 connected to it as inputs. Using the matrix form of inputs and output, we have [see (3)]

X(t+1) = A X(t) + B U(t)

So the output of layer 6 is X(t+1), as in (4).

This proposed network structure implements the dynamic system combined from our discrete fuzzy rules and the structure of recurrent networks. With preset human knowledge, the network can do some tasks well, but it will do much better after learning rules from teaching examples. In the next section, a learning algorithm will be put forth to adjust the variable parameters in FENN, such as cr, sr, Ar, Br, and C.

3. Proposed Architecture for Fault Diagnosis and Prognosis

Faults are usually the main cause of loss of productivity in the process industry. This section uses a straightforward architecture to detect, isolate and identify faults.

One of the most important types of systems present in the process industry is the workshop of SCIMAT clinker. A fault in a workshop of SCIMAT clinker may lead to a halt in production for long periods of time. Apart from these economic considerations, faults may also have safety implications. A fault in an actuator may endanger human lives, as in the case of a fault in an elevator's emergency brakes or in the stems position control system of a nuclear power plant. The design and performance testing of fault diagnosis systems for industrial processes often requires a simulation model, since the actual system is not available to generate the normal and faulty operational data needed for design and testing, due to the economic and safety implications this would have.
Figure 5 shows a view and the schematics of a typical industrial process of cement manufacture. This installation belongs to the cement factory of Ain-Touta (SCIMAT), Algeria. This cement factory has a capacity of 2,500,000 t/year ("two furnaces") and is made up of several units which determine the various phases of the cement manufacturing process. The cooking workshop gathers two furnaces whose clinker flow is 1560 t/h. The cement crushing section includes two crushers of 100 t/h each. Forwarding of cement is carried out from two stations, one for the trucks and another for the coaches.

[Fig 5: Workshop of SCIMAT clinker. Numbered units 1-11 cover drying, preheating, decarbonation and clinkering.]

3.1 Faults

The workshop of SCIMAT clinker may be affected by a number of faults. These faults are grouped into four major categories: heating tower faults, kiln cycling faults, cooler balloon faults and gas burner faults. Here only abrupt or incipient faults are considered.

This step has the objective of identifying the dysfunctions which can influence the mission of the system. This analysis and recognition are largely facilitated by using the structural and functional models of the installation. For the analysis of the dysfunctions we adopted the method of Failure Modes and Effects Analysis and their Criticality (FMEAC). Based on the study carried out by [6] on the cooking workshop, we worked out an FMEAC by considering only the most critical failure modes (criticality > 10), for reasons of simplicity [46]. Therefore we have a Neuro-fuzzy system of 27 inputs and 11 outputs, which was used to make a prognosis of our system. The rules created with the system are a priori knowledge, the a priori rule base. Each variable having an initial partition will be modified along the training phase (a number of fuzzy sets for each variable). The reasoning for the diagnosis and prognosis is described in the form of fuzzy rules inside our Neuro-fuzzy system.

Table 4.1: Fault descriptions

Fault  Description                              Incipient/Abrupt
F1     Skirt fall (chute de la jupe)            I/A
F2     Blockage (bourrage)                      I/A
F3     No break                                 I/A
F4     Bucket conveyor (transporteur a auget)   I/A
F5     Ring formation (presence d'anneaux)      I
F6     Poor homogenization                      I/A
F7     Coating fall (chute de croutage)         I/A
F8     Damaged refractory bricks                I
F9     Blockage (bourrage)                      I/A
F10    Draught fan motor                        I/A
F11    Draught fan belts                        I/A

Our TNFS must have a number of inputs equal to the number of sensor-signal variables. To provide the ability to extend the timing window used for this problem, it has 27 input nodes comprising 11 sensor signals at 4 successive time points in steps of 10 minutes, resulting in a temporal window of 40 minutes for each sensor. The TNFS provides 14 outputs representing the 14 possible classes (faults): 11 process faults, 3 sensor faults and the normal state.

3.2 Training the TNFS

To train the TNFS, we used a scenario for each of the 11 possible faults. The process was simulated for 120 minutes, with the faults starting to appear after 40 minutes of normal operation. So we had 9 different positions of the temporal window (0-40 mins, 10-50 mins, etc.), providing 342 input/output vector pairs for training.

NEFDIAG (Neuro-Fuzzy Diagnosis) is a data-processing program for interactive simulation. The NEFDIAG development was carried out within LAP (University of Batna) and was primarily dedicated to the creation, training and testing of a Neuro-Fuzzy system for the classification of the breakdowns of a dedicated industrial process. NEFDIAG models a fuzzy classifier Fr with a set of classes C = {c1, c2, ..., cm} [45].
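The windowing scheme described above (a 120-minute simulation sampled every 10 minutes, with a 40-minute window slid in 10-minute steps) can be checked with a short sketch. The constants mirror the text; the sensor data itself is not modeled.

```python
# Enumerate the temporal-window positions used to build training vectors.
SAMPLE_STEP = 10      # minutes between samples
WINDOW = 40           # minutes covered by one input window
RUN_LENGTH = 120      # total simulated minutes

start_times = range(0, RUN_LENGTH + 1, SAMPLE_STEP)   # 0, 10, ..., 120
windows = [(t, t + WINDOW) for t in start_times if t + WINDOW <= RUN_LENGTH]

print(len(windows))   # 9 positions: (0, 40), (10, 50), ..., (80, 120)
```

This reproduces the 9 window positions the text reports for each simulated fault scenario.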
NEFDIAG performs its training on a set of forms, and each form is assigned (classified) to one of the preset classes. Next, NEFDIAG generates the fuzzy rules by evaluating the data, optimizing the rules via training using the fuzzy subset parameters, and partitioning the data into characteristic forms classified with the parameters of the data. NEFDIAG can then be used to classify a new observation. The system can be represented in the form of fuzzy rules:

If Symptom1(t) is A1 and Symptom2(t-2) is A2 and Symptom3(t) is A3 and ... and SymptomN(t-1) is An, then the form (x1, x2, x3, ..., xn) belongs to fault class i.

Here A1, A2, A3, ..., An are linguistic terms represented by fuzzy sets. This characteristic makes it possible to complete the analyses of our data and to use this knowledge to classify them. The training phase of artificial Neuro-Fuzzy networks makes it possible to determine or modify the parameters of the network in order to adopt a desired behavior. The training stage is based on gradient descent on the average quadratic error made by the network RNF [44].

The NEFDIAG system typically starts with a knowledge base comprised of a partial collection of the forms, and can refine it during training. Alternatively, NEFDIAG can start with an empty knowledge base. The user must define the initial number of membership functions for partitioning the data input fields. It is also necessary to specify the number K, which represents the maximum number of rule neurons to be created in the hidden layer. The principal steps of the training algorithm …

The data set used in this experiment contained 200 samples. Each data sample consisted of 27 features comprising the temperature and pressure measurements at various inlet and outlet points of the rotary kiln, as well as other important parameters, as shown in Table 4.2. The heat transfer conditions were classified into two categories, i.e., the process of heat transfer was accomplished either efficiently or inefficiently. From the database, there were 101 data samples (50.18%) that showed an inefficient heat transfer condition, whereas 99 data samples (49.82%) showed an efficient heat transfer condition in the rotary kiln. The data samples were equally divided into three subsets for training, prediction and test.
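A rule of the form shown above can be evaluated by taking a fuzzy AND (here the minimum) of the symptom membership degrees and assigning the observed form to the class of the strongest rule. The sketch below is purely illustrative: the membership centers, widths and rules are invented, not NEFDIAG's.

```python
import math

def gaussian(x, c, s):
    # Gaussian membership degree with center c and width s
    return math.exp(-((x - c) ** 2) / (2.0 * s ** 2))

# Hypothetical rules: each lists one (center, width) term per symptom.
rules = [
    {"cls": "F1",     "terms": [(300.0, 20.0), (1.2, 0.3)]},
    {"cls": "F9",     "terms": [(350.0, 20.0), (0.8, 0.3)]},
    {"cls": "normal", "terms": [(320.0, 15.0), (1.0, 0.2)]},
]

def classify(form):
    """Assign the form to the class of the rule with the highest activation."""
    best_cls, best_act = None, -1.0
    for r in rules:
        # fuzzy AND of the symptom memberships, here the minimum
        act = min(gaussian(x, c, s) for x, (c, s) in zip(form, r["terms"]))
        if act > best_act:
            best_cls, best_act = r["cls"], act
    return best_cls

print(classify([322.0, 1.02]))   # closest to the "normal" prototype
```

Replacing min with a product t-norm, as in the TNFS rule layer, changes only the activation line.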
[Figure: the diagnosis phase by NEFDIAG, from sensor data (entry vector) through analysis to FMEAC.]

Table 4.2: Input and output variables for the rule compiling. [table content not legible in this extract]
Fig. 11. Effect of incipient fault F10 on the Rotary kiln rotating RPM
4. Conclusion

This TNFS was used for identification, prediction and detection of the fault process in the cement rotary kiln; the back-end temperature was used as the process monitor for the various conditions. The special character of this variable is that it can show the normal and abnormal conditions inside the kiln.

In spite of the great importance of fuzzy neural networks for solving a wide range of real-world problems, unfortunately little progress has been made in their development.

We have discussed recurrent neural networks with fuzzy weights and biases as adjustable parameters and internal feedback loops, which allow capturing the dynamic response of a system without using external feedback through delays. In this case all the nodes are able to process linguistic information.

As the main problem regarding fuzzy and recurrent fuzzy neural networks, which limits their application range, is the difficulty of properly adjusting fuzzy weights and biases, we have put an emphasis on the TNFS training algorithm.

References

… roughness prediction in deep drilling. International Journal of Intelligent Manufacturing, 22(2):120-131.
11. Koscielny JM and Syfert M (2003) Fuzzy logic applications to diagnostics of industrial processes. In: SAFEPROCESS'2003, Preprints of the 5th IFAC Symposium on fault detection, supervision and safety for technical processes, Washington, USA, pp. 771-776.
12. Xi F, Sun Q, Krishnappa G (2000) Bearing Diagnostics Based on Pattern Recognition of Statistical Parameters. Journal of Vibration and Control 6:375-392.
13. Patton RJ, Frank PM and Clark RN (2000) Issues of Fault Diagnosis for Dynamic Systems. Springer, London.
14. Chia-Feng Juang (2002) A TSK-Type Recurrent Fuzzy Network for Dynamic Systems Processing by Neural Network and Genetic Algorithms. IEEE Transactions on Fuzzy Systems, vol. 10, no. 2.
15. Bocaniala CD, Sa da Costa J and Palade V (2004) A Novel Fuzzy Classification Solution for Fault Diagnosis. International Journal of Fuzzy and Intelligent Systems 15(3-4):195-206.
16. Marinai L (2004) Gas path diagnostics and prognostics for aero-engines using fuzzy logic and time series analysis (PhD Thesis). School of Engineering, Cranfield University.
17. Bocaniala CD and Sa da Costa J (2004) Tuning the Parameters of a Fuzzy Classifier for Fault Diagnosis. Hill-Climbing vs. Genetic Algorithms. In: Proceedings of the Sixth Portuguese Conference on Automatic Control (CONTROLO 2004), 7-9 June, Faro, Portugal, pp. 349-354.
18. Jing He (2006) Neuro-fuzzy based fault diagnosis for nonlinear processes (PhD Thesis). The University of New Brunswick.
2 Department of Computer Science & Information Systems, University Technology Malaysia, Skudai, Johor Bahru, 81310, Malaysia
The Bluetooth protocol is an open standard for short-range digital radio. The goal of Bluetooth is to connect devices (PDAs, cell phones, printers, faxes, etc.) together wirelessly in a small environment such as an office or home. Bluetooth has three different encryption modes to support the confidentiality service:

Mode 1: No encryption is performed on any data.
Mode 2: Broadcast traffic is not encrypted, but individually addressed traffic is encrypted according to the individual link keys.
Mode 3: All traffic is encrypted according to the master link key.

Bluetooth works on the basis of the E0 algorithm. Until now, many known attacks on the encryption scheme E0 are available that can threaten the security of Bluetooth. The most well-known of them are algebraic attacks [14] and correlation attacks [15-16].

E0 generates a bit using four shift registers with differing lengths (25, 31, 33 and 39 bits). Figure 3 shows the algorithm used in the Bluetooth standard. However, in E0, as in A5/1 and A5/2, the last function that generates the key stream is a simple XOR. Due to the linear properties of XOR, the output key stream has a linear relation with its inputs, which may threaten the whole algorithm.

3.1 Parallel Random Number Generator

Linear feedback shift registers (LFSRs) are very applicable in parallel random number generators. Due to the simplicity of implementing LFSRs in hardware and software, LFSRs are used in many random number generators. LFSRs can generate different sequences with good statistical properties and a large period length. Note that the feedback polynomial plays a very important role in an LFSR: if the feedback polynomial is primitive, an LFSR of length n can generate a maximal-length sequence with period 2^n - 1. Furthermore, because the output stream is linear, the output sequences of an LFSR are easily predictable, and if the designer wants to tap more than one stream as output sequences, each bit is exactly equal to the other bits up to a time delay (the maximum delay being the LFSR length n). This problem threatens the system from different viewpoints, especially with respect to correlation attacks. In this paper, one model of LFSR has been designed in such a way that it solves this problem; hence it is important in cryptography.

Designing a parallel random number generator (PRNG) using one LFSR has the feature that one can construct a linear sequential system which is correctly initialized and for each clock cycle generates different consecutive streams of the sequence, while the normal LFSR would generate just one stream sequence. In fact, the bit-outputs of the finite state machine can be XORed together to form the key-stream output.
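The maximal-length property invoked above is easy to demonstrate on a toy register. The sketch below uses a 4-bit Fibonacci LFSR with the primitive polynomial x^4 + x + 1 (an illustrative choice, not one of E0's 25/31/33/39-bit registers) and counts the steps until the state repeats.

```python
def lfsr_period(state, taps, n):
    """Count steps until an n-bit Fibonacci LFSR returns to its start state."""
    start, count = state, 0
    while True:
        fb = 0
        for t in taps:                        # XOR of the tapped bits
            fb ^= (state >> (t - 1)) & 1
        state = ((state << 1) | fb) & ((1 << n) - 1)
        count += 1
        if state == start:
            return count

# Primitive feedback polynomial x^4 + x + 1 -> period 2**4 - 1 = 15
print(lfsr_period(0b0001, taps=[4, 1], n=4))   # 15
```

Any nonzero seed yields the same period for a primitive polynomial, since all 2^n - 1 nonzero states lie on one cycle.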
LFSRs are defined by characteristic polynomials, which determine all properties of the sequences produced by an LFSR. Parallel random number generators (PRNGs) are defined by very specific polynomials. The main property of this kind of generator is that they have been used in practice, and on a large scale, in symmetric-key encryption. The estimated number of primitive polynomials in PRNGs related to LFSRs can be calculated from Eq. (1), where v is the number of sub-registers. [Eq. (1) is not legible in this extract.]

The speed of processing is an important parameter, and the designed PRNG can be implemented in software and hardware to obtain a suitable processing speed. In this regard, Eq. (3) was initially designed as a candidate primitive polynomial that satisfies the form of Eq. (4):

x^257 + x^254 + x^251 + x^249 + x^244 + x^243 + x^242 + x^238 + x^237 + x^233 + x^232 + x^230 + x^228 + x^226 + x^225 + x^221 + x^220 + …    (5)

[the remaining terms of Eq. (5), and Eqs. (3) and (4), are not legible in this extract]

In fact, the diagram shown in Figure 4 is equivalent to Eq. (5) or Eq. (3). Therefore, we can select just 115 taps as bit-stream outputs from Figure 4. However, it is advisable to implement a multiplexer for sequence selection to increase the nonlinearity of each stream. Finally, each sequence can be selected according to Eq. (6). [Eq. (6) is garbled in this extract; only the index ranges 1..31 with 1..3, 1..23 with 1..5, and 1..5 with 1..7 survive.]

… bits as output of the algorithm. So, each bit of the key-stream output is simplified as in Eq. (8). For a convenient explanation of Eq. (8), Figure 5 shows Eq. (8) as a function box with 115 bits in total. In fact, the functionality of Figure 5 is exactly that of Eq. (8). Therefore, the function has 5 input variables and a 1-bit output and operates in place of Eq. (8). The important statistical cryptography tests have been applied to it.
As shown in Eq. (10), which has been derived from Table 1 (in the Appendix), the correlation immunity check result is excellent for the designed function. On the other hand, from the point of view of correlation calculations, we have computed all possibilities for the designed function, and all results are equal to zero. Recall that correlation coefficients are bounded by -1 and +1: a value of +1 indicates a perfect positive linear relationship between two sequences, while -1 indicates a perfect negative linear relationship between them. A value of zero indicates no correlation between the input variables, i.e. independence. In the designed function, the correlations for the five variables are excellent. Therefore a highly non-linear balanced Boolean function with excellent correlation immunity is strong enough against correlation attacks.

3.2.4 Algebraic Degree Check

The algebraic degree is one of the nonlinearity measures of a Boolean function. Boolean functions with small algebraic degree are in general considered less suitable for cryptographic applications than those with higher degree. However, there are large classes of cryptographically strong Boolean functions with small algebraic degree, such as quadratic bent functions. It is …

Furthermore, an efficiently designed stream cipher algorithm can be implemented in the GSM, WEP, SSL, TLS and Bluetooth protocols. The new algorithm has been designed based on a parallel random number generator with a high processing speed, can be implemented in high-speed data/voice communication links, and can resist different kinds of attacks, such as correlation and algebraic attacks.

The designed algorithm has passed all of the cryptographic tests in the NIST standard successfully. It can support encryption/decryption at a rate of 100 MB/s. The key variety of the designed algorithm is equal to 2^… and the IV key length is equal to 2^…. It can be implemented easily in hardware and software.

This paper designed a new stream cipher algorithm with a key variety of 2^… and a 115-bit IV that is more secure than other public ones from the viewpoints of processing speed and security.
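The correlation check discussed above can be reproduced in miniature: over the full truth table, correlate the function output with each input variable after mapping bits to +1/-1. The two 5-variable functions below are illustrative stand-ins, not the paper's designed function; XOR shows zero correlation with every single input (first-order correlation immunity), while AND does not.

```python
from itertools import product

def correlations(f, n):
    """Per-input correlation of a Boolean function over its full truth table."""
    rows = list(product([0, 1], repeat=n))
    out = [1 - 2 * f(x) for x in rows]          # map output bit 0/1 -> +1/-1
    corr = []
    for i in range(n):
        var = [1 - 2 * x[i] for x in rows]      # map input bit 0/1 -> +1/-1
        corr.append(sum(o * v for o, v in zip(out, var)) / len(rows))
    return corr

xor5 = lambda x: x[0] ^ x[1] ^ x[2] ^ x[3] ^ x[4]
and5 = lambda x: x[0] & x[1] & x[2] & x[3] & x[4]

print(correlations(xor5, 5))   # all 0.0: no single input leaks
print(correlations(and5, 5))   # nonzero: every input correlates with the output
```

These per-input sums are the Walsh coefficients at weight-one points, the same quantities a first-order correlation-immunity test checks.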
2 ICTEAM, Electrical Engineering, Université catholique de Louvain, Louvain-La-Neuve, 1348, Belgium
1. Introduction

Recently, there has been a growing demand for microwave and wireless communication systems in various applications, resulting in an interest in improving antenna performance. Modern communication systems and instruments such as wireless local area networks (WLAN) and mobile handsets require light weight, small size and low cost. The selection of microstrip antenna technology can fulfill these requirements [1]. WLAN in the 2.4 GHz band (2.4-2.483 GHz) has made rapid progress, and several IEEE standards are available, namely 802.11a, b, g and j [1]. Various design techniques using defected ground structures (DGS) in the patch antenna have been suggested in previous publications [2-4]. A DGS is realized by etching a defect in the ground plane of planar circuits and antennas. This defect disturbs the shield current distribution in the ground plane and modifies transmission line characteristics such as line capacitance and inductance [5]. Accordingly, a DGS is able to provide a wide band-stop characteristic in some frequency bands with a reduced number of unit cells. Due to their excellent pass and rejection frequency band characteristics [5], DGS

2. Antenna Design

A CRMPA is designed on a dielectric layer, an RO4003C substrate, which has a relative permittivity of 3.4 and a thickness of 1.524 mm. As shown in Figure 2.a, the patch antenna has a length (L) of 30 mm and a width (W) of 21 mm, and its resonant frequency is 2.40 GHz. The resonant frequency, also called the center frequency, is selected as the one at which the return loss is minimum. An etched RS-DGS with different length values and a fixed width (3.5 mm) is then inserted into the ground plane of the original CRMPA shown in Figure 1 (Ant. 1) at different positions, as shown in Figure 2.a (Ant. 2), Figure 2.b (Ant. 3) and Figure 2.c (Ant. 4).
In Figure 2, the RS-DGS is drawn with dashed lines to indicate that it is located on the bottom of the substrate. Except for the insertion of a rectangular-shaped slot in the ground plane, no other modification has been made to the antenna patch and the feeding system.
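As a rough cross-check of the quoted patch dimensions, the resonant frequency can be estimated with the standard transmission-line model (effective permittivity plus fringing-length correction). This is a first-order textbook estimate, not the authors' full-wave simulation, so it lands near — but not exactly at — the simulated 2.40 GHz:

```python
import math

def patch_resonant_freq(L, W, h, eps_r):
    """First-order TM10 resonant frequency of a rectangular microstrip patch
    using the effective-permittivity / fringing-length (Hammerstad) model."""
    c = 2.998e8
    eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 * (1 + 12 * h / W) ** -0.5
    # Fringing extension of the resonant length.
    dL = 0.412 * h * ((eps_eff + 0.3) * (W / h + 0.264)) / \
         ((eps_eff - 0.258) * (W / h + 0.8))
    return c / (2 * (L + 2 * dL) * math.sqrt(eps_eff))

# Dimensions from the paper: L = 30 mm, W = 21 mm, h = 1.524 mm, eps_r = 3.4.
f = patch_resonant_freq(0.030, 0.021, 0.001524, 3.4)
print(f / 1e9)  # rough estimate in GHz; the full-wave simulation gives 2.40 GHz
```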
[Figure 2 (b), (c): RS-DGS slot positions on the bottom of the substrate.]

[Figure: simulated return loss dB(S(1,1))-dB(S(4,4)) of Ant. 1-Ant. 4 versus frequency, 2.20-2.60 GHz.]
Fig. 6 H-plane radiation patterns of the CRMPA and the antennas with RS-DGS.

Table 1 summarizes the obtained simulation features of the designed antennas.

Table 1: The obtained simulation features

Antenna type | Resonant freq. [GHz] | Material | RL [dB] | Gain [dB]
Ant. 1: CRMPA | 2.4 | RO4003C, εr = 3.4, H = 1.524 mm | -15.72 | 5.1
Ant. 3: CRMPA with RS-DGS | 2.4 | RO4003C, εr = 3.4, H = 1.524 mm | -26.92 | 5.9
Ant. 4: CRMPA with RS-DGS | 2.4 | RO4003C, εr = 3.4, H = 1.524 mm | -31.87 | 5.9

4. Conclusions

A simple technique to improve the characteristics of a conventional rectangular microstrip patch antenna (CRMPA) by adding an etched rectangular slot in the ground plane (RS-DGS) is presented in this paper. Simulation results have shown that inserting an RS-DGS improves the antenna performance. For the considered CRMPA, the results show a 100% enhancement of the return loss and a 0.8 dB improvement of the gain for the configurations named Ant. 3 and Ant. 4. Further work focusing on the effect of the RS-DGS position and parameters is essential to arrive at an antenna configuration with optimal performance.

References
[1] Y.-L. Kuo and K.-L. Wong, "Printed double-T monopole antenna for 2.4/5.2 GHz dual-band WLAN operations," IEEE Trans. Antennas Propagation, vol. 51, pp. 2187-2192, Sept. 2003.
[2] M. K. Mandal, P. Mondal, S. Sanyal, and A. Chakrabarty, "An improved design of harmonic suppression for microstrip patch antennas," Microwave and Optical Technology Letters, vol. 49, no. 1, pp. 103-105, Jan. 2007.
[3] H. Liu, Z. Li, X. Sun, and Junfa, "Harmonic suppression with photonic bandgap and defected ground structure for a microstrip patch antenna," IEEE Microwave and Wireless Components Letters, vol. 15, no. 2, Feb. 2005.
[4] Y. J. Sung, M. Kim, and Y.-S. Kim, "Harmonics reduction with defected ground structure for a microstrip patch antenna," IEEE Antennas and Wireless Propagation Letters, vol. 2, 2003.
[5] D. Ahn, J. S. Park, C. S. Kim, J. Kim, Y. Qian, and T. Itoh, "A design of the low-pass filter using the novel microstrip defected ground structure," IEEE Trans. Microwave Theory Tech., vol. 49, pp. 86-93, Jan. 2001.
[6] C. S. Kim, J. S. Park, D. Ahn, and J. B. Lim, "A novel 1-D periodic defected ground structure for planar circuits," IEEE Microwave Guided Wave Lett., vol. 10, pp. 131-133, Apr. 2000.
[7] C. A. Balanis, Antenna Theory: Analysis and Design, 3rd ed., John Wiley & Sons, Inc., 2005.
[8] T. A. Milligan, Modern Antenna Design, John Wiley & Sons, Inc., 2005.

Mouloud Challal was born on March 6th, 1976, in Algiers, Algeria. He received the electronics and communication engineering degree from the Université des Sciences et de la Technologie Houari Boumediene, Algiers, Algeria, in April 1999, and the M.Sc. degree in microwave and communication from the École Nationale Polytechnique, Algiers, Algeria, in December 2001. From 1998 to 1999, he worked as a computer engineer in a private company in Algiers, in charge of maintenance and computer network (LAN) installation. From 1999 to 2002, he taught computer science in a public institute (ex-ITEEM), Algiers. Since 2004, he has been a lecturer and researcher at the Institute of Electrical and Electronics Engineering (IGEE, ex-INELEC), University of Boumerdes (UMBB), Boumerdes, Algeria. Since the 2007/2008 academic year, he has been registered as a researcher/PhD student at both UMBB and the Université catholique de Louvain (UCL), Louvain-la-Neuve, Belgium. His research interests include RF/microwave circuits, design and analysis of microstrip filters, defected ground structure behavior, wireless communication systems, and microstrip antenna array analysis, synthesis and design. He is an IEEE, EuMA, IAENG and SDIWC member.
²,³ University of Tunis, High Institute of Management, Bouchoucha City, Bardo 2000, Tunisia
However, these methodologies often lack tool support to facilitate their application in practice and encourage companies to adopt them.
The present work is in the context of engineering models such as MDA (Model Driven Architecture), which is a process based on the transformation of models: model to model, code to model, model to code, etc. It presents an experience in transforming requirements specifications expressed in natural language into structured specifications.
The proposed application, taking as input text data that represent user requirements, identifies named entities (entities, properties, relationships between entities, ...) and classifies them in a structured XML file. Several researchers have tried to automate the generation of a UML diagram from a natural language specification.
Kaiya et al. [8] proposed a requirements analysis method based on domain ontologies. However, this work does not support natural language processing; it allows the detection of incompleteness and inconsistency in requirements specifications, measurement of the value of the document, and prediction of requirements changes.
In [2] Christiansen et al. developed a system to transform a use case diagram into a class diagram using Definite Clause Grammars extended with Constraint Handling Rules. The grammar captures information about the static world (classes and their relations), and subsequently the system generates the adequate class diagram. This work is very interesting, but the problem is that organizations' requirements are not always modeled as use case diagrams.
The work in [10] implemented a system named GeNLangUML (Generating Natural Language from UML) which generates English specifications from class diagrams. The authors translate UML version 1.5 class diagrams into natural language. This work was considered by most developers an efficient solution for reducing the number of errors and for verification and early validation of the system, but we still always need to generate UML diagrams from natural language. The system process is as follows:
- Grammatical labeling based on the WordNet dictionary to disambiguate the lexical structure of UML concepts.
- Sentence generation from the specification by checking attributes, operations and associations with reference to a grammar defining extraction rules.
- Checking whether the generated sentences are semantically correct.
- Generating a structured document containing the natural language specification of the class diagram.
Hermida et al. [7] proposed a method which adapts UML class diagrams to build domain ontologies. They describe the process and the functionalities of the tool that they have developed for supporting this process. The authors have chosen a use case in the pharmacotherapeutic domain. The authors present a good approach; however, it is specific to a well-defined area (pharmacotherapy).
In [6] the authors proposed a tool, NT2OD, which derives an initial object diagram from textual use case descriptions using natural language processing (NLP) and ontology learning techniques. NT2OD consists of creating a parse tree for each sentence, identifying objects and relations, and generating the object diagram.
In our work we propose a CASE (Computer-Aided Software Engineering) tool. We extract information from user requirements to generate a class diagram, taking into account existing approaches. We propose a design tool which extracts UML concepts and generates a UML class diagram according to different concepts (class, association, attribute). The idea is to use the GATE API 1 and extend it with new JAPE rules to extract semantic information from user requirements.

3. GATE overview

GATE (General Architecture for Text Engineering) is developed by the Natural Language Processing Research Group 2 at the University of Sheffield 3 . GATE is a framework and graphical development environment which enables users to develop and deploy language engineering components and resources in a robust fashion [4]. GATE contains different modules to process text documents. GATE supports a variety of formats (doc, pdf, xml, html, rtf, email) and multilingual data processing using Unicode as its default text encoding.
In the present work we use the information extraction tool ANNIE (A Nearly-New IE system) plugin (Fig. 1). It contains a Tokeniser, a Gazetteer (system of lexicons), a POS Tagger, a Sentence Splitter, a Named Entity Transducer, and an OrthoMatcher.
- Tokeniser: this component identifies various symbols in text documents (punctuation, numbers, symbols and different types). It applies basic rules to the input text to identify textual objects.
- Gazetteer: the gazetteer component creates annotations to offer information about entities (persons, organizations) using lookup lists.
- POS Tagger: this component produces a tag for each word or symbol.
- Sentence Splitter: the sentence splitter identifies and annotates the beginning and the end of each sentence.

1 http://gate.ac.uk/
2 http://nlp.shef.ac.uk/
3 http://www.shef.ac.uk/
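For intuition only, the first two ANNIE-style stages (tokenising and sentence splitting) can be mimicked in a few lines of Python. This regex sketch is hypothetical and stands in for GATE's actual Java components, not their API:

```python
import re

def tokenise(text):
    """Split text into word, number and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

def split_sentences(text):
    """Mark sentence boundaries at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

text = "A customer places an order. The order has a date."
print(tokenise(text)[:4])          # ['A', 'customer', 'places', 'an']
print(len(split_sentences(text)))  # 2
```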
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 261
A JAPE rule (Java Annotation Patterns Engine), a variant adapted to the Java programming language, consists of files containing a set of rules [3]. Gazetteer lists are lookup lists with one entry per line, containing names of people, large organizations, months of the year, days of the week, numbers, etc. [10].
In a class diagram, associations usually have the following format:

Noun + verb + Noun

Fig. 5 JAPE rule for extracting attribute.

In this step, we propose a graphical representation of the JAPE rule set used by all modules of the ANNIE components that we have integrated in our application (fig. 6).
To extract the association concept, we use the JAPE rule illustrated in Figure 4. If the token belongs to the gazetteer lists (lines 7, 9, 11, 13), it will be annotated as an association; otherwise the instructions from line 16 will be executed: if the token belongs to the class list, the second token is a "verb", and the third word belongs to the list "Class", then the second word (token) will be annotated as an association.
In this phase, we have formed a corpus of user requirements in different areas. Then we have tested our system on this corpus. We applied GATE, which generates an XML file containing all semantic tags. We clean the file by removing unnecessary tags like <sentence> and <token>. Figure 7 shows an example of an output GATE file. Our tool is robust and efficient and the error rate is very low, except when case studies are very complicated.
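The Noun + verb + Noun association format can be illustrated outside GATE with a minimal Python matcher over already-tagged tokens. The tag names and the tiny class list are hypothetical, and this is not JAPE syntax:

```python
def extract_associations(tagged_tokens, class_list):
    """tagged_tokens: list of (word, pos) pairs, pos in {'NOUN', 'VERB', ...}.
    Returns (class1, verb, class2) triples where both nouns are known classes,
    mirroring the Noun + verb + Noun association rule."""
    triples = []
    for i in range(len(tagged_tokens) - 2):
        (w1, p1), (w2, p2), (w3, p3) = tagged_tokens[i:i + 3]
        if p1 == "NOUN" and p2 == "VERB" and p3 == "NOUN" \
                and w1 in class_list and w3 in class_list:
            triples.append((w1, w2, w3))
    return triples

tokens = [("customer", "NOUN"), ("places", "VERB"), ("order", "NOUN")]
print(extract_associations(tokens, {"customer", "order"}))
# [('customer', 'places', 'order')]
```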
5. Conclusion
readability for the designer. This work is already underway.

in computer science in April 2010 from the High Institute of Management. Her research interests include natural language processing, semantic annotation, and web services.
² National Institute of Applied Science and Technology, BP 676, Centre Urbain, Cedex, Tunis, Tunisia
adequate representation to emphasize the non-Gaussian nature of mixture signals.
where X(t) = [x1(t), ..., xn(t)]^T is the vector of mixture signals, S(t) = [s1(t), ..., sm(t)]^T is the unknown vector of source signals, and A is the unknown mixing matrix of dimension (m*n).
Independent Component Analysis is a typical BSS method which tends to solve this problem. The principle of ICA can be depicted as in Figure 1.

[Fig. 1 Principle of ICA: sources s1, s2 are mixed by A into x1, x2 and separated by W = A^-1.]

In other words, ICA can be defined as a method that searches for a linear transformation which maximizes the non-Gaussianity of the components of S(t). To measure the non-Gaussianity, kurtosis or differential entropy, called negentropy, can be employed. The FastICA algorithm [12], [1], [8] is one of the most popular algorithms performing independent component analysis.
Negentropy can be considered the optimal measure of non-Gaussianity. However, it is difficult to estimate the true negentropy. Thus, several approximations are used and developed, such as the one developed by Aapo Hyvarinen et al. [1], [12]:

J(y) ≈ Σ_{i=1}^{p} k_i [ E{g_i(y)} − E{g_i(ν)} ]²    (4)

where ν is a standard Gaussian variable.

3. Undecimated Wavelet Packet-Perceptual Filterbank

3.1 Wavelet Transform

The Wavelet Transform [5], [18] represents an alternative technique for the processing of non-stationary signals which provides a powerful linear representation of signals. The discrete wavelet transform (DWT) is a multi-resolution representation of a signal which decomposes signals into basis functions. It is characterized by a higher time resolution for high-frequency components and a higher frequency resolution for low-frequency components. The DWT consists of filtering the input signal by two
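The negentropy approximation of Eq. (4) can be sketched numerically with a single contrast function g(u) = log cosh(u) and k = 1 (illustrative constants, not the paper's choices). A super-Gaussian, speech-like signal should score a higher approximate negentropy than a Gaussian one:

```python
import numpy as np

def negentropy_approx(y, n_gauss=200_000, seed=0):
    """One-term sketch of Eq. (4): J(y) ~ k * (E[g(y)] - E[g(nu)])**2
    with g(u) = log cosh(u), nu ~ N(0, 1), and k = 1 (illustrative)."""
    y = (y - y.mean()) / y.std()   # zero mean, unit variance, as ICA assumes
    nu = np.random.default_rng(seed).standard_normal(n_gauss)
    g = lambda u: np.log(np.cosh(u))
    return (g(y).mean() - g(nu).mean()) ** 2

rng = np.random.default_rng(42)
j_gauss = negentropy_approx(rng.standard_normal(100_000))
j_laplace = negentropy_approx(rng.laplace(size=100_000))
print(j_laplace > j_gauss)  # super-Gaussian source -> larger negentropy
```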
where i = 0, 1, ..., 5 and j = 0, ..., 2^i − 1 are respectively the decomposition level and the position of the node, and Fs is the sampling frequency.

(Table fragment: band number | center frequency [Hz] | bandwidth [Hz])
9 | 1000 | 160
10 | 1170 | 190
11 | 1370 | 210
12 | 1600 | 240
13 | 1850 | 280
14 | 2150 | 320
15 | 2500 | 380
16 | 2900 | 450
17 | 3400 | 550
select UWPD coefficients of the two mixture signals x1(n) and x2(n), obtained in the preprocessing module, which are used as the two input signals of the FastICA algorithm. In the second step, the separated signals are obtained by taking into account the original mixture signals.

5. Results and Evaluation

To evaluate the performance of the proposed blind speech separation method described in section 4, we use some sentences taken from the TIMIT database. This database consists of speech signals of a total of 6300 sentences, formed by 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States [23]. We consider two-speaker speech mixtures, so we instantaneously mix two speech signals pronounced respectively by a male and a female speaker, by two female speakers, and by two male speakers. The two speech mixtures are generated artificially using the mixing matrix:

A = | 2  1 |
    | 1  1 |    (8)

The performance evaluation of our work includes different performance metrics, such as the blind separation performance measures used in BSS EVAL [19], [30], including the signal-to-interference ratio (SIR) and the signal-to-distortion ratio (SDR). The principle of these measures consists of decomposing the estimated signal si(n) into the following sum of components:

si(n) = s_target(n) + s_interf(n) + s_artefact(n)    (9)

where s_target(n), s_interf(n) and s_artefact(n) are, respectively, an allowed deformation of the target source si(n), an allowed deformation of the sources which accounts for the interference of the unwanted sources, and an artifact term which represents the artifacts produced by the separation algorithm. The two performance criteria SIR and SDR are computed from this decomposition as follows:

SIR = 10 log10 ( ||s_target(n)||² / ||s_interf(n)||² )    (10)

SDR = 10 log10 ( ||s_target(n)||² / ( ||s_interf(n)||² + ||s_artefact(n)||² ) )    (11)

to the subjective "Mean Opinion Score" (MOS) measured score.

In the previous experiments, we compare our system with the FastICA algorithm [12] and two well-known algorithms, JADE [14] and SOBI [13].
The experimental results are shown in three tables which report the evaluation measures obtained for three example cases of mixture signals. Table 2 lists the separation performance measures, including the SIR and SDR ratios, obtained after separation by SOBI, JADE, FastICA and the proposed method. We observed that the SIR and SDR values are better for the proposed method than for FastICA, JADE and SOBI in the majority of cases for the two signals. The average SIR for a mixture composed of two female speakers (experiment 2), for example, is 14.06 dB for SOBI, 43.12 dB for JADE, 39.80 dB for FastICA and 45.10 dB for the proposed method. The improvement in the average SIR and SDR ratios is particularly significant in the case of the mixture formed by two male speaker signals. The average improvement in this case between the proposed method and FastICA is 15.45 dB.
Table 3 and Table 4 show that the estimated signals obtained using the proposed method are better than those obtained by FastICA and the two algorithms JADE and SOBI for the three experiments. We have obtained, for example, a segmental SNR equal to 33.90 dB using the proposed method and 29.14 dB using FastICA.
In order to have a better idea about the quality of the estimated signals, PESQ has been used. It is regarded as one of the reliable methods of subjective testing. It returns a score from 0.5 to 4.5. Table 5 illustrates the PESQ scores obtained. We see that the proposed method is still more effective in terms of perceptual quality than FastICA, JADE and SOBI.
Table 2: Comparison of SIR and SDR using SOBI, Jade, Fast-ICA and proposed Method (PM)
Table 3: Comparison of segmental SNR using SOBI, Jade, FastICA and proposed Method (PM)
Table 4: Comparison of overall SNR using SOBI, Jade, FastICA and proposed Method (PM)
Table 5: Comparison of PESQ using SOBI, Jade, FastICA and proposed Method (PM)
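The SIR and SDR measures reduce to a few lines once the three components of the BSS EVAL decomposition are available. A minimal sketch, assuming the 10·log10 squared-energy form used by BSS EVAL:

```python
import math

def sir_sdr(s_target, s_interf, s_artefact):
    """SIR and SDR (in dB) from the decomposition of Eq. (9):
    s_i(n) = s_target(n) + s_interf(n) + s_artefact(n)."""
    e_t = sum(x * x for x in s_target)
    e_i = sum(x * x for x in s_interf)
    e_a = sum(x * x for x in s_artefact)
    sir = 10 * math.log10(e_t / e_i)
    sdr = 10 * math.log10(e_t / (e_i + e_a))
    return sir, sdr

# Toy check: target energy 100, interference energy 1, artefact energy 99.
sir, sdr = sir_sdr([10.0], [1.0], [math.sqrt(99)])
print(round(sir, 3), round(sdr, 3))  # 20.0 dB SIR, 0.0 dB SDR
```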
² Construction Engineering Department, Zagazig University, Zagazig, Egypt
³ Construction and Building Department, Arab Academy for Science, Technology and Maritime Transport, Cairo, Egypt
with respect to the cost elements used to compile a bid proposal and to identify the types of methods used for estimating these elements. Their results indicated that direct cost and project overhead costs are estimated by contractors primarily in a detailed manner, which is contrary to the estimation of the general overhead costs and the markup [9].
Assaf, S. A. et al. (2001) investigated the overhead cost practices of construction companies in Saudi Arabia. They show how the unstable construction market makes it difficult for construction companies to decide on the optimum level of overhead costs that enables them to win and efficiently administer large projects [4].
Cost estimating models and techniques provide well-defined engineered calculation methods for the evaluation and assessment of all items of office overhead, project overhead, profit anticipation, and total project cost estimation, and for the assessment of overhead costs for construction projects that leads to competitive bidding in the construction industry [11].
This paper presents the steps followed to develop a proposed model for site overhead cost estimating. The necessary information and the required project data were collected in two successive yet dependent stages:
I. Comparison between the list of site overhead factors collected from previous studies and the applied Egyptian list of site overhead factors adopted by the first and second categories of construction firms in Egypt; and
II. Collection of all required site overhead cost data for a sample of projects in Egypt, to be used during the analysis phase and the development of the site overhead cost assessment model.

2. Research Methodology

The findings from the survey conducted on all the previous research served as a key source for identifying the main factors affecting site overhead costs for building construction projects. Based on an extensive review of the previous studies conducted in this area of work, the survey for such factors mainly includes: the project's need for specialty contractors, percentage of sub-contracted works, consultancy and supervision, contract type, the firm's need for work, type of owner/client, site preparation, the project's scheduled time, need for special construction equipment, delay in project duration, the firm's previous experience with the project type, legal, environmental and public policies of the home country, the project's cash-flow plan, project size, and project location.
Hence, the study shed a great deal of light on the area of site overhead costs for building construction projects in Egypt, through seeking the experts' opinions regarding the development of a list of the main factors affecting building project overhead costs. These factors will be used during the development of the model. They were mainly identified based on the opinions of experts from selected groups of prominent industrial professionals and qualified academicians from the most prominent universities in Egypt. The principal objective of this survey study was to reinforce the potential model, based on the opinions of the aforementioned expert professionals [12].
Expert opinion included reviews from nineteen prominent industrial professionals and sixteen qualified academicians from the American University in Cairo and the Arab Academy for Science and Technology and Maritime Transport. Reviews from experienced industrial professionals were essential for developing the overall model, as these professionals are directly associated with the leading Egyptian building construction firms.
Each expert from both contractor and academic backgrounds was approached based on their personal experience. Half of the responses were obtained via personal interviews and the other half were obtained by delivering the questionnaire and collecting it back by e-mail or fax.
This phase of seeking expert opinion consisted of walk-through observations of the selected industrial professionals and academicians connected to the construction industry. These reviews provided us with qualified remarks and suggestions, which led to making the necessary alterations to the list of previously identified overhead cost factors to make it adaptable to the Egyptian building construction industry market. This is an essential step toward a firmer, yardstick final model for the assessment of overhead costs for building construction projects in Egypt [12].

3. Data Collection

This phase is divided into two stages. The first stage is to perform a comparison between the overhead cost factors from the comprehensive literature study and those of the Egyptian construction industry; hence, the main factors affecting site overhead costs can be clearly identified. The second stage is to collect data for 50 projects from several construction companies that represent the first and the second categories of construction companies in Egypt [12].

3.1 The questionnaire

In the first section of the data collection process, a questionnaire is prepared to investigate the main factors affecting site overhead cost for building construction projects in Egypt.
The analysis of the collected questionnaires illustrated that there is a difference between the factors that govern the assessment of building construction site overhead cost in Egypt and the international building construction industry trend. Many factors are not accounted for in Egypt due to their insignificance in the local market, while they are great contributors in the European and North/South American construction markets. Moreover, in Egypt there is a trend among contractors to combine two or more contributing items into one main factor. The academicians contravened that behavior and characterized it as an unprofessional attitude, because it depends entirely on the person performing the task and his/her experience with the projects at hand (personalization). So, after cross-matching and making the necessary alterations to the questionnaires collected from both the contractors and the academicians in Egypt, a final list of factors was generated that represents both parties and can accurately represent the factors that contribute to building construction site overhead cost in the Egyptian construction market (Table 1) [12].

Table 1: Factors Contributing to Construction Site Overhead Cost Percentage in Egypt

Factor
1 Construction Firm Category.
2 Project Size.
3 Project Duration.
4 Project Type.
5 Project Location.
6 Type-Nature of Client.
7 Type of Contract.
8 Contractor-Joint Venture.
9 Special Site Preparation Requirements.
10 Project need for Extra-man Power.

4. Site Overhead Cost Data

A comparative analysis was performed between building construction site overhead cost and each constituent of site overhead for building construction projects, with the aid of 52 completed building construction projects. These projects were executed during the seven-year period from 2002 to 2009. The comparison is made in terms of the cost influence of each site overhead factor on the percentage of project site overhead cost, in order to recognize and understand the governing relationship between each factor and the percentage of site overhead cost [12].
It must be noted that for all the collected projects the adopted construction technology was typical traditional reinforced concrete technology. This may be due to the participating experts' opinion that this technology represents over 95% of the adopted building construction technology in Egypt. Contrarily, if any specific construction technique is required for a certain project, it must be accounted for by the construction firm's cost estimating department in an exceptional manner [12].

5. Comparative Analysis Results

The major and minor findings of the entire research are summarized in this part. Based on the findings, the current and further recommendations are developed as the base for further research in the very context of building construction project overhead cost for the first and the second categories of construction companies in Egypt [12].
The analysis illustrated many facts that needed to be clarified and understood about the percentage of site overhead costs for building construction projects in Egypt. These facts will be the structure (backbone) for the development of a model for the assessment of site overhead cost as a percentage of the total contract amount for building construction projects in Egypt. This can be simply summarized in the following two facts: [12]
A. Through the literature review and the experts' opinions, ten potential factors were identified that influence the percentage of site overhead costs for building construction projects in Egypt.
B. The analysis of the collected data gathered from fifty-two real-life building construction projects in Egypt during the seven-year period from 2002 to 2009 illustrated that project duration, total contract value, project type, special site preparation needs and project location are the top five factors that affect the percentage of site overhead costs for building construction projects in Egypt.

6. Neural Network Model

The guidelines of the N-Connection Professional Software version 2.0 (1997) user's manual were used to obtain the best model. Moreover, for verifying this work, the traditional trial-and-error process was performed to obtain the best model architecture [11].
The following sections present the steps performed to design the artificial neural network model (ANN-Model). Neural network models are generally developed through the following five basic steps [8]:
1. Define the problem; decide what information to use and what the network will do;
2. Decide how to gather the information and represent it;
3. Define the network; select network inputs and specify the outputs;
4. Structure the network;
5. Train the network; and
6. Test the trained network. This involves presenting new inputs to the network and comparing the network's results with the real-life results (Fig. 3).

[Fig. 3 Model development flowchart: Define the Problem → Gather Data and Design the Neural Network → Train Network → (retrain if not trained successfully) → Test Network → (redesign if not tested successfully).]

If the network is not trained satisfactorily, hidden layers and hidden nodes will be added or removed until an acceptable model structure is reached that can predict the percentage of site overhead cost within an acceptable error limit. The learning rate and the training and testing tolerances are fixed automatically by N-Connection V 2.0 [16].

ii. Determining the Best Network Architecture

There are two questions in neural network design that have no precise answers because they are application-dependent: How much data do you need to train a network? And how many hidden layers and nodes are the best numbers to use? In general, the more facts and the fewer hidden layers and hidden nodes you can use, the better [16]. There is a subtle relationship between the number of facts and the number of hidden layers/nodes. Having too few facts or too many hidden layers/nodes can cause the network to "memorize". When this happens, it performs well during training but tests poorly [16]. The network architecture refers to the number of hidden layers and the number of nodes within each hidden layer [16]. The two guidelines discussed in the following section can be used in answering the last two questions [8].
to be useful. However, it is not always possible for Neural Connection V2.0 to train if it begins with a very small tolerance. In this study the tolerance is set by the program to (0.1).

2. Learning Rate
The learning rate specifies how large an adjustment Neural Connection will make to the connection strengths when it gets a fact wrong. Reducing the learning rate may make it possible to train the network to a smaller tolerance. The learning rate pattern is automatically set by the Neural Connection 2.0 software in a way that maximizes the performance of the program to achieve the best results.

iv. Training the Network
Training the network is a process that uses one of several learning methods to modify weights, or connection strengths. All trial models experimented with in this study were trained in a supervised mode by a back-propagation learning algorithm. A training data set is presented to the network as inputs, and the outputs are calculated. The differences between the calculated outputs and the actual target outputs are then evaluated and used to adjust the network's weights in order to reduce the differences. As the training proceeds, the network's weights are continuously adjusted until the error in the calculated outputs converges to an acceptable level. The back-propagation algorithm involves the gradual reduction of the error between model output and the target output. Hence, it develops the input-to-output mapping by minimizing a root mean square error (RMS) that is expressed in equation (1) [16]:

- Equation (1):  RMS = sqrt( (1/n) * Σ (Oi − Pi)² ),  i = 1...n

where n is the number of samples to be evaluated in the training phase, Oi is the actual output related to sample i (i = 1...n), and Pi is the predicted output. The training process should be stopped when the mean error remains unchanged. The training file has (90%) of the collected facts, i.e. 47 facts (projects). These facts are used to train and validate the network [11].

v. Testing the Network
Testing the network is essentially the same as training it, except that the network is shown facts it has never seen before, and no corrections are made when the network is wrong. It is important to evaluate the performance of the network after the training process. If the results are good, the network will be ready to use. If not, this means that it needs more or better data, or even a re-design of the network. A part of the collected facts (data), around (10%), i.e. 5 facts (projects), is set aside randomly from the set of training facts (projects) [11]. Then these facts are used to test the ability of the network to predict a new output, where the absolute difference is calculated for each test project outcome by equation (2) [16]:

- Equation (2):  Absolute Difference = |Oi − Pi|
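As a concrete sketch of the two error measures, assuming Equation (1) is the standard root-mean-square form and Equation (2) the per-project absolute difference described in the text, the calculation could look like this (the sample values are illustrative, not the paper's data):

```python
import math

def rms_error(actual, predicted):
    """Root mean square error over n samples (Equation 1)."""
    n = len(actual)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(actual, predicted)) / n)

def absolute_differences(actual, predicted):
    """Per-project absolute difference between actual and predicted outputs (Equation 2)."""
    return [abs(o - p) for o, p in zip(actual, predicted)]

# Illustrative actual vs. predicted site overhead cost percentages
actual = [9.5, 7.2, 11.0, 8.4, 10.1]
predicted = [9.1, 7.8, 10.6, 8.9, 9.7]
print(rms_error(actual, predicted))
print(absolute_differences(actual, predicted))
```

Training would stop once this RMS value no longer decreases between epochs, as described above.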
The program is generated through the following sequence of alterations, selecting the model structure that provides the minimum RMS value [11]:
1. One Hidden Layer with Sigmoid Transfer Function; (Table 2A)
2. One Hidden Layer with Tangent Transfer Function; (Table 2B)
3. Two Hidden Layers with Sigmoid Transfer Function in each; (Table 2C)
4. Two Hidden Layers with Tangent Transfer Function in each; (Table 2D)
Table 2A: Experiments for Determining the Best Model
Model No. | Input Nodes | Output Node | No. of Hidden Layers | Hidden Nodes in 1st Layer | Hidden Nodes in 2nd Layer | Absolute Difference % | RMS
1 10 1 1 3 0 7.589891 0.900969
2 10 1 1 4 0 5.491507 0.602400
3 10 1 1 5 0 8.939657 1.046902
4 10 1 1 6 0 7.766429 0.932707
5 10 1 1 7 0 4.979286 0.535812
6 10 1 1 8 0 5.818345 0.647476
7 10 1 1 9 0 4.947838 0.579932
8 10 1 1 10 0 8.887463 1.039825
9 10 1 1 11 0 4.858645 0.507183
10 10 1 1 12 0 5.352388 0.651948
11 10 1 1 13 0 2.476118 0.276479
12 10 1 1 14 0 2.857856 0.428663
13 10 1 1 15 0 4.074554 0.478028
14 10 1 1 20 0 8.065637 1.050137
i.e. Model trials from 1 to 14 have a Sigmoid transfer function.
The first fourteen model trials illustrated that the RMS and Absolute Difference values changed nonlinearly as the number of hidden nodes in the single hidden layer increased. The lowest RMS value of 0.276479 and a corresponding Absolute Difference value of 2.476118 were achieved in the eleventh trial, where there were thirteen hidden nodes in the single hidden layer with a sigmoid transfer function. Conversely, the highest RMS value of 1.050137 and the corresponding Absolute Difference value of 8.065637 were achieved in the fourteenth trial, when there were twenty hidden nodes in the single hidden layer with a sigmoid transfer function. For the remaining twelve model trials, the RMS and Absolute Difference values fell within the above-mentioned ranges.
Table 2B: Experiments for Determining the Best Model
Model No. | Input Nodes | Output Node | No. of Hidden Layers | Hidden Nodes in 1st Layer | Hidden Nodes in 2nd Layer | Absolute Difference % | RMS
15 10 1 1 3 0 3.809793 0.490956
16 10 1 1 4 0 5.666974 0.703804
17 10 1 1 5 0 3.813867 0.425128
18 10 1 1 6 0 5.709665 0.709344
19 10 1 1 7 0 5.792984 0.634338
20 10 1 1 8 0 2.952316 0.343715
21 10 1 1 9 0 5.629162 0.655106
22 10 1 1 10 0 3.544173 0.387283
23 10 1 1 11 0 5.578666 0.686378
24 10 1 1 12 0 5.772656 0.701365
25 10 1 1 13 0 3.582526 0.380564
26 10 1 1 14 0 4.614612 0.515275
27 10 1 1 15 0 4.806596 0.641098
28 10 1 1 20 0 7.005237 0.826699
i.e. Model trials from 15 to 28 have a Tangent transfer function.
The model trials from 15 to 28, with one hidden layer, illustrated that the RMS and Absolute Difference values changed nonlinearly as the number of hidden nodes per hidden layer changed. The lowest RMS value of 0.343715 and a corresponding Absolute Difference value of 2.952316 were achieved in the twentieth model trial, when there were eight (8) hidden nodes in the single hidden layer. Conversely, with a tangent transfer function, the highest RMS value of 0.826699 and the corresponding Absolute Difference value of 7.005237 were achieved in the twenty-eighth model trial, when there were twenty hidden nodes in the single hidden layer. The remaining values fell within the above-mentioned ranges for each model trial.

IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org
Table 2C: Experiments for Determining the Best Model
Model No. | Input Nodes | Output Node | No. of Hidden Layers | Hidden Nodes in 1st Layer | Hidden Nodes in 2nd Layer | Absolute Difference % | RMS
29 10 1 2 2 1 9.919941 1.519966
30 10 1 2 2 2 5.170748 0.581215
31 10 1 2 3 1 10.374248 1.413138
32 10 1 2 3 2 11.167767 1.687072
33 10 1 2 3 3 8.013460 1.140512
34 10 1 2 4 1 5.679721 0.643957
35 10 1 2 4 2 5.577789 0.617385
36 10 1 2 4 3 5.448696 0.598400
37 10 1 2 4 4 4.079718 0.492011
38 10 1 2 5 3 4.191063 0.574500
39 10 1 2 5 4 6.024062 0.723419
40 10 1 2 5 5 5.322466 0.654373
41 10 1 2 6 4 7.257790 0.804202
42 10 1 2 6 5 5.158298 0.567479
43 10 1 2 6 6 5.270355 0.545017
i.e. Model trials from 29 to 43 have a Sigmoid transfer function for both hidden layers.
The model trials from 29 to 43 illustrated that the RMS and Absolute Difference values changed nonlinearly as the number of hidden nodes per hidden layer increased. The lowest RMS value of 0.492011 and a corresponding Absolute Difference value of 4.079718 were achieved in model trial number (37), when there were two hidden layers with four hidden nodes in each layer and a sigmoid transfer function. Contrarily, the highest RMS value of 1.687072 and the corresponding Absolute Difference value of 11.167767 were achieved in model trial number (32), when there were two hidden layers with three hidden nodes in the first layer and two hidden nodes in the second hidden layer, with a sigmoid transfer function. For the remaining thirteen model trials, the RMS and Absolute Difference values fell within the above-mentioned ranges, with a sigmoid function in each layer.
Table 2D: Experiments for Determining the Best Model
Model No. | Input Nodes | Output Node | No. of Hidden Layers | Hidden Nodes in 1st Layer | Hidden Nodes in 2nd Layer | Absolute Difference % | RMS
44 10 1 2 2 1 4.364562 0.499933
45 10 1 2 2 2 3.551318 0.380629
46 10 1 2 3 1 4.787220 0.493240
47 10 1 2 3 2 6.267891 0.852399
48 10 1 2 3 3 6.515138 0.829739
49 10 1 2 4 1 3.458081 0.481580
50 10 1 2 4 2 9.249286 1.158613
51 10 1 2 4 3 4.735680 0.552350
52 10 1 2 4 4 7.445228 0.991062
53 10 1 2 5 3 7.729862 1.105441
54 10 1 2 5 4 9.807989 1.180131
55 10 1 2 5 5 6.060798 0.657344
56 10 1 2 6 4 3.213154 0.355932
57 10 1 2 6 5 4.381631 0.490479
58 10 1 2 6 6 4.731568 0.502131
i.e. Model trials from 44 to 58 have a Tangent transfer function for both hidden layers.
The model trials from 44 to 58 illustrated that the RMS and Absolute Difference values changed nonlinearly as the number of hidden nodes per hidden layer increased. The lowest RMS value of 0.355932 and a corresponding Absolute Difference value of 3.213154 were achieved in model trial number (56), when there were two hidden layers with six hidden nodes in the first hidden layer and four hidden nodes in the second hidden layer, with a tangent transfer function in each layer. Conversely, the highest RMS value of 1.180131 and the corresponding Absolute Difference value of 9.807989 were achieved in model trial number (54), when there were two hidden layers with five hidden nodes in the first hidden layer and four hidden nodes in the second hidden layer, with a tangent transfer function in each layer. For the remaining thirteen model trials, the RMS and Absolute Difference values fell within the above-mentioned ranges, with a tangent transfer function in each layer [11].
The recommended model for this prediction problem is the one with the least RMS value from all fifty-eight trial-and-error experiments [16]. This is trial number eleven [11]. As a result, the characteristics of the satisfactory Neural Network Model obtained from the training phase through the trial-and-error process are presented in (Table 3) and (Fig. 4).
Model Trial Number Eleven has the following eight design parameters [11]:
1. Input layer with 10 Neurons (nodes);
2. One hidden layer with 13 Neurons (nodes);
3. Output layer with 1 Neuron (node);
4. A Sigmoid Transfer Function;
5. Learning rate automatically adjusted by the program;
6. Training Tolerance = 0.10 (Adjusted by Program);
7. Root Mean Square Error = 0.276479;
8. Absolute Mean Difference % = 2.476118.
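The selection rule described above (keep the trial with the least RMS value) can be sketched in a few lines. The four entries below are the best rows of Tables 2A-2D, and `select_best_model` is a hypothetical helper, not part of the Neural Connection software:

```python
def select_best_model(trials):
    """Pick the trial with the minimum RMS value, as in the trial-and-error search."""
    return min(trials, key=lambda t: t["rms"])

# Best row from each of Tables 2A-2D (model no., architecture, RMS)
trials = [
    {"no": 11, "layers": 1, "nodes": (13,),  "transfer": "sigmoid", "rms": 0.276479},
    {"no": 20, "layers": 1, "nodes": (8,),   "transfer": "tangent", "rms": 0.343715},
    {"no": 37, "layers": 2, "nodes": (4, 4), "transfer": "sigmoid", "rms": 0.492011},
    {"no": 56, "layers": 2, "nodes": (6, 4), "transfer": "tangent", "rms": 0.355932},
]
print(select_best_model(trials)["no"])  # 11
```

Running the same rule over all fifty-eight trials reproduces the paper's choice of trial number eleven.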
Table 3: Characteristics of the Best Model
Model | No. of input nodes | No. of hidden layers | No. of nodes/hidden layer | No. of output nodes | LR | TF | RMS
11 | 10 | 1 | 13 | 1 | Back propagation | Sigmoid function | 0.276479
LR: Learning Rule; TF: Transfer Function; RMS: Root Mean Square Error.
As is clear, the predicted model outputs of the percentage of site overhead costs differ from the actual real-life project values by less than (2.476%), which is the designed model's Absolute Difference %, and which is assumed to be acceptable. This demonstrates a very high accuracy for the proposed model and the viability of the neural network as a powerful tool for modeling the assessment of the building construction site overhead cost percentage for projects constructed in Egypt [11].
Fig. 4. Structure of the Best Model [11]: an ANN with 10 input nodes (F1...F10) connected through weights Wij to 13 hidden nodes, connected through weights Wjo to 1 output node, the Building Construction Site Overhead Cost Percentage (Overhead %).
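Assuming the standard logistic sigmoid, the 10-13-1 architecture of the best model could be exercised with a plain forward pass like the following. The weights here are random placeholders, not the trained Neural Connection weights:

```python
import math
import random

def sigmoid(x):
    """Standard logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_ih, b_h, w_ho, b_o):
    """One forward pass: 10 inputs -> 13 sigmoid hidden nodes -> 1 sigmoid output."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_ih, b_h)]
    return sigmoid(sum(w * h for w, h in zip(w_ho, hidden)) + b_o)

random.seed(0)
n_in, n_hid = 10, 13
w_ih = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]  # Wij
b_h = [0.0] * n_hid
w_ho = [random.uniform(-1, 1) for _ in range(n_hid)]  # Wjo
b_o = 0.0

overhead_fraction = forward([0.5] * n_in, w_ih, b_h, w_ho, b_o)
print(overhead_fraction)  # a value in (0, 1), to be scaled to an overhead %
```

Back-propagation training would repeatedly adjust Wij and Wjo to reduce the RMS error between this output and the actual overhead percentage.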
7. Summary

Construction firms should carefully examine contract conditions and take all the necessary precautions to make sure that project site overhead cost factors are properly anticipated and covered within the total tender price. The study conducted a survey that investigated the factors affecting a project's site overhead cost for building construction projects in the first and second categories of construction companies. An ANN model was developed to predict the percentage of site overhead cost for building construction projects in Egypt during the tendering process. A sample of building projects was selected as a test sample for this study. The impacts of different factors on the site overhead costs were deeply investigated. The survey results illustrated that site overhead costs are greatly affected by many factors. Among these factors are project type, size, location, site conditions and the construction technology. All of these factors make the detailed estimation of such overhead costs a more difficult task. Hence, it is expected that a lump-sum assessment of such cost items will be a more convenient, easy, highly accurate, and quick approach. Such an approach should take into consideration the different factors that affect site overhead cost. It was found that an ANN-based model would be a suitable tool for site overhead cost assessment.

8. CONCLUSIONS

The following conclusions are drawn from this research:
1. Through the literature review, potential factors that influence the percentage of site overhead costs for building construction projects were identified. Ten factors were identified;
2. The analysis of the collected data gathered from fifty-two real-life building construction projects from Egypt illustrated that project duration, total contract value, project type, special site preparation needs and project location are the top five factors that affect the value of the percentage of site overhead costs for building construction projects in Egypt;
3. Nature of the client, type of the contract and contractor joint venture are the lowest affecting factors in the percentage of site overhead costs for building construction projects in Egypt;
4. A satisfactory Neural Network model was developed through fifty-eight experiments for predicting the percentage of site overhead costs for future building construction projects in Egypt. This model consists of one input layer with ten neurons (nodes), one hidden layer having thirteen hidden nodes with a sigmoid transfer function, and one output layer. The learning rate of this model is set automatically by N-Connection V2.0, while the training and testing tolerance are set to 0.1;
5. The results of testing for the best model indicated a testing root mean square error (RMS) value of 0.276479; and
6. Testing was carried out on five new facts (projects) that were still unseen by the network. The results of the testing indicated an accuracy of (80%), as the model wrongly predicted the percentage of site overhead costs for only one project (20%) of the testing sample.

9. References

[1] Ahuja and Campbell, Estimating from concept to completion. Prentice Hall, Englewood Cliffs, N.J., (1988).
[2] Akintoye A. (2000). Analysis of Factors Influencing Project Cost Estimating Practice. Construction Management and Economics. 18(1), 77-89.
[3] Alcabes, J. (AACE, 1988), Organizational concept for a coordinated estimating, cost control, and scheduling division.
[4] Assaf Sadi, Abdulaziz Bubshait, Solaiman Atiyah, and Mohammed AL-Shahri, The management of construction company overhead costs, International Journal of Project Management, Vol.19, No.5, 2001.
[5] Bannes Lorry T., Fee analysis: A contractor's approach, Transactions of the American Association of Cost Engineers, Morgantown, WV, USA, 1994.
[6] Becica Matt, Scott Eugene R. and Willett Andrew B., Evaluating responsibility for schedule delays on utility construction projects, Proceedings of the American Power Conference, Illinois Institute of Technology, Chicago, IL, USA, (1991).
[7] Clough, R., and Sears, G. (1991), Construction project management. Wiley, New York.
[8] Hatem A. A. (2009), "Developing a Neural Networks Model for Supporting Contractors In Bidding Decision In Egypt", A thesis submitted to Zagazig University in partial fulfillment of the requirements for the Master of Science Degree.
[9] Hegazy T. and Moselhi O. (1995). Elements of Cost Estimation: A Survey in Canada and the United States. Cost Engineering. 37(5), 27-31.
[10] Holland, N. and Hobson, D. (1999). Indirect cost categorization and allocation by construction contractors. Journal of Architectural Engineering, ASCE, 5(2), 49-56.
[11] Ismaail Y. El-Sawy (2010), "Assessment of Overhead Cost for Building Construction Projects", A thesis submitted to Arab Academy for Science, Technology and Maritime Transport in partial fulfillment of the requirements for the Master of Science Degree.
[12] Ismaail Y. El-Sawy, Mohammed Abdel Razek, and Hossam E. Hosny (2010). Factors Affecting Site Overhead Cost for Building Construction Projects. Journal of Al Azhar University Engineering Sector, JAUES, Issue 3/2010, May 2010, Cairo, Egypt.
[13] Jones Walter B., Spreadsheet Checklist to Analyze and Estimate Prime Contractor Overhead, Cost Engineering (Morgantown, W. Virginia), Vol.38, No.8, August 1996.
[14] Kim In Ho, A study on the methodology of rational planning and decision of military facility construction cost, Journal of Architectural Institute of Korea, Vol.10, No.6, in Korean, 1994.
[15] Neil, Construction cost estimating for project control. Prentice-Hall, Englewood Cliffs, N.J., (1981).
[16] N-Connection V2.0 Professional Software User Guide and Reference Manual (1997), California Scientific Software.
[17] Peurifoy and Oberlender, Estimating construction costs, 4th Ed., McGraw Hill, New York, (1989).
[18] Sadi Assaf, Abdulaziz Bubshait, Solaiman Atiyah, and Mohammed AL-Shahri, Project Overhead Costs in Saudi Arabia, Cost Engineering Journal, Vol. 41, No. 4, (1999).
[19] Yong-Woo Kim, Glenn Ballard, Case Study - Overhead Costs Analysis, Proceedings IGLC-10, Gramado, Brazil, (August, 2002).

Ismaail Yehia Aly ElSawy received his M.Sc. and B.Sc. degrees in Construction and Building Engineering, College of Engineering and Technology, from the Arab Academy for Science, Technology and Maritime Transport, Alexandria, Egypt, in 2010 and 2002 respectively. He joined the Egyptian Ministry of Electricity and Power in December 2004 as a Research Engineer in the Ministry's National Research Center. He then joined the academic field in September 2008, as a Demonstrator (B.Sc.) and then Assistant Lecturer (M.Sc.) at the Civil Engineering Department, Thebes Higher Institute of Engineering. He has published more than 6 research papers in international/national journals and refereed international conferences. He is interested in the implementation of Artificial Intelligence in Construction Project Management, and Construction Projects Financial Management.
² Computer Science Department, Faculty of Computers and Information,
Menofia University, Menofia, Egypt.
1. Introduction

The developments in science and technology have made it possible to use biometrics in applications where it is required to establish or confirm the identity of individuals. Applications such as passenger control in airports, access control in restricted areas, border control, database access and financial services are some of the examples where biometric technology has been applied for more reliable identification and verification. Biometrics is inherently a more reliable and capable technique for authenticating a human's identity from his or her own physiological or behavioral characteristics. The features used for personnel identification by current

Fig 1: The image (Img 141 1 1) from the UBIRIS database

All these advantages make iris recognition a promising topic of biometrics, and it receives more and more attention [7, 8, 26]. Even though the iris is seen as the most reliable biometric measure, it is still not in everyday use because of the complexity of the systems. In an iris recognition system, iris location is an essential step that spends nearly more than half of the entire processing time [36]. The correctness of iris location is required for the
latter processes such as normalization, feature extraction and pattern matching. For those reasons, improving the speed and accuracy of iris location becomes nontrivial. The algorithm proposed in this work is an improvement of the matching process in the algorithms proposed by Daugman [8, 9]. The United Arab Emirates Expellees Tracking and Border Control System [22] is an outstanding example of the technology.
In general, an iris recognition system consists of: (i) image acquisition, (ii) preprocessing the iris image, including iris localization, image normalization and polar transformation, (iii) iris feature extraction and (iv) iris matching.

1.1 Related Work

The research in the area of iris recognition has been receiving considerable attention, and a number of techniques and algorithms have been proposed over the last few years. Flom and Safir first proposed the concept of automated iris recognition in [18]. The approach presented by Wildes [26] combines the method of edge detection with the Hough transform for iris location. However, the parameters need to be precisely set and a lengthy location time is required. Daugman's method was developed first using the integro-differential operator [10] for localizing iris regions along with removing possible eyelid noise. In the past few years, some methods have made certain improvements based on Daugman's method [8, 9]. Bowyer et al. [17] recently presented an excellent review of these methods. However, at this time, essentially all of the large-scale implementations of iris recognition are based on the Daugman iris recognition algorithms [8]. The difference between a pair of iris codes was measured by their Hamming distance. Sanchez-Reillo and Sanchez-Avila [27] provided a partial implementation of the algorithm by Daugman. Boles and Boashash [34] calculated a zero-crossing representation of a one-dimensional wavelet transform at various resolution levels of a concentric circle on an iris image to characterize the texture of the iris. Iris matching was based on two dissimilarity functions. [29] decomposed an iris image into four levels using a 2-D Haar wavelet transform and quantized the fourth-level high-frequency information to form an 87-bit code. A modified competitive learning neural network was adopted for classification. Tisse et al. [5] analyzed the iris characteristics using the analytic image constructed from the original image and its Hilbert transform. Emergent frequency functions for feature extraction were in essence samples of the phase gradient fields of the analytic image's dominant components [17, 31]. Similar to the matching scheme of Daugman, they sampled binary emergent frequency functions to form a feature vector and used the Hamming distance for matching. Kumar et al. [3] utilized correlation filters to measure the consistency of iris images from the same eye. The correlation filter of each class was designed using the two-dimensional Fourier transforms of training images. If the correlation output (the inverse Fourier transform of the product of the input image's Fourier transform and the correlation filter) exhibited a sharp peak, the input image was determined to be from an authorized subject, otherwise an impostor. Bae et al. [16] projected the iris signals onto a bank of basis vectors derived by independent component analysis and quantized the resulting projection coefficients as features. In another approach by Ma et al. [19], Even Symmetry Gabor filters [10] are used to capture local texture information of the iris, which is used to construct a fixed-length feature vector.
In the last year alone, the iris has taken the attention of many researchers, and different ideas have been formulated and published. For example, in [1] a bi-orthogonal wavelet based iris recognition system is modified and demonstrated to perform off-angle iris recognition. An efficient and robust segmentation of noisy iris images for non-cooperative iris recognition is described in [32]. Iris image segmentation and sub-optimal images are discussed in [13]. Comparison and combination of iris matchers for reliable personal authentication are introduced in [2]. Noisy iris segmentation, with boundary regularization and reflections removal, is discussed in [28].

1.2 Outline

In this paper, we first present the active contour models for iris preprocessing (the segmentation step), which is a crucial step for the success of any iris recognition system, since data that is falsely represented as iris pattern data will corrupt the biometric templates generated, thus resulting in poor recognition rates. Once the iris region is successfully segmented from an eye image, the next stage is to transform the iris region so that it has fixed dimensions (normalization) in order to allow comparisons using Daugman's rubber sheet model. After that, the 1-D log-Gabor filter is used to extract a real-valued template for the normalized iris.

2. Iris Localization Techniques

This is the stage of locating the iris region in an eye image; as mentioned, the iris region is the annular part
In order to improve accuracy, Ritter et al. use the variance image rather than the edge image. A point interior to the pupil is located from a variance image, and then a discrete circular active contour (DCAC) is created with this point as its center. The DCAC is then moved under the influence of internal and external forces until it reaches equilibrium, and the pupil is localized.

2.3 Discrete Circular Active Contour

Ritter et al. (2003) [25] proposed a model which detects the pupil and limbus by activating and controlling the active contour using two defined forces: internal and external forces. The internal forces are responsible for expanding the contour, and the external forces push the vertices inward. The magnitude of the external forces is defined as:

F_ext,i = I(V_i) − I(V_i + F̂_ext,i)        (11)

where I(V_i) is the grey level value of the nearest neighbor to V_i, and F̂_ext,i is the direction of the external force for each vertex, defined as a unit vector given by:

F̂_ext,i = (C − V_i) / ||C − V_i||        (12)

where C is the center of the contour. Therefore, the external force over each vertex can be written as the magnitude times the unit direction:

F_ext,i = F_ext,i F̂_ext,i        (13)
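A minimal sketch of Equations (11)-(13) for a single contour vertex, assuming a `grey` function that samples the image grey level at the nearest pixel, might look like:

```python
import math

def external_force(vertex, center, grey):
    """DCAC external force on one vertex (sketch of Equations 11-13).

    vertex, center: (x, y) tuples; grey(x, y): image grey level sampler."""
    # Equation (12): unit direction from the vertex toward the contour center
    dx, dy = center[0] - vertex[0], center[1] - vertex[1]
    norm = math.hypot(dx, dy)
    f_hat = (dx / norm, dy / norm)
    # Equation (11): magnitude from the grey-level difference along that direction
    magnitude = grey(vertex[0], vertex[1]) - grey(vertex[0] + f_hat[0],
                                                  vertex[1] + f_hat[1])
    # Equation (13): force vector = magnitude times unit direction
    return (magnitude * f_hat[0], magnitude * f_hat[1])

# Toy image: grey level increases with distance from the origin, so the force
# on a vertex points inward, toward the darker center
grey = lambda x, y: math.hypot(x, y)
print(external_force((3.0, 4.0), (0.0, 0.0), grey))
```

In the full DCAC iteration, this external force would be summed with the internal expansion force at each vertex until the contour reaches equilibrium.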
using a 1-D Gabor filter, since the convolution of a separable eyelash with the Gaussian smoothing function results in a low output value. Thus, if a resultant point is smaller than a threshold, it is noted that this point belongs to an eyelash. Multiple eyelashes are detected using the variance of intensity. If the variance of intensity values in a small window is lower than a threshold, the center of the window is considered as a point in an eyelash. The two features combined with a connectivity criterion lead to the decision on the presence of eyelashes. In addition, an eyelash detection method is also proposed by Huang et al. that uses the edge information obtained by phase congruency of a bank of Log-Gabor filters. The edge information is also infused with the region information to localize the noise regions [15], as in Figure 4.

Figure 4: illustrates the perfect iris localization, where black regions denote detected eyelids and eyelashes regions.

3. Normalization

Once the iris region is successfully segmented from an eye image, the next stage is to transform the iris region so that it has fixed dimensions in order to eliminate dimensional inconsistencies between iris regions, and to allow comparisons. The dimensional inconsistencies between eye images are mainly due to the stretching of the iris caused by pupil dilation from varying levels of illumination. Other sources of inconsistency include varying imaging distance, rotation of the camera, head tilt, and rotation of the eye within the eye socket. The normalization process will produce iris regions which have the same constant dimensions, so that two images of the same iris under different conditions will have the same characteristic features at the same spatial location. A proper normalization technique is expected to transform the iris image to compensate for these variations.
Most normalization techniques are based on transforming the iris into polar coordinates, known as the unwrapping process. The pupil boundary and the limbus boundary are generally two non-concentric contours. The non-concentric condition leads to different choices of reference points for transforming an iris into polar coordinates. Proper choice of the reference point is very important, as the radial and angular information is defined with respect to this point. Unwrapping the iris using the pupil center is proposed by Boles and Boashash [34] and Lim et al. [14]. Another reference point is proposed by Arvacheh [6]: the virtual center of a pupil with radius equal to zero (the linearly-guessed center). The experiments demonstrate that the linearly-guessed center provides much better recognition accuracy. The linearly-guessed center is equivalent to the technique used by Joung et al. [4].
In addition, most normalization approaches based on Cartesian to polar transformation unwrap the iris texture into a fixed-size rectangular block. For example, in the Lim et al. method, after finding the center of the pupil and the inner and outer boundaries of the iris, the texture is transformed into polar coordinates with a fixed resolution. In the radial direction, the texture is normalized from the inner boundary to the outer boundary into 60 pixels. The angular resolution is also fixed, to 0.8° over the 360°, which produces 450 pixels in the angular direction. Other researchers such as Boles and Boashash, Tisse et al. [5], and Ma et al. [20] also use the fixed-size polar transformation model.
However, the circular shape of an iris implies that there are different numbers of pixels over each radius. Transforming information of different radii into the same resolution results in different amounts of interpolation, and sometimes loss of information, which may degrade the performance of the system.

3.1 Daugman's Rubber Sheet Model

It transforms a localized iris texture from Cartesian to polar coordinates. It is capable of compensating for the unwanted variations due to the distance of the eye from the camera (scale) and its position with respect to the camera (translation). The Cartesian to polar transformation is defined as:

I(x(r, θ), y(r, θ)) = I(r, θ)        (15)

where
x(r, θ) = (1 − r) x_p(θ) + r x_i(θ),
y(r, θ) = (1 − r) y_p(θ) + r y_i(θ),
and
x_p(θ) = x_p0(θ) + r_p cos(θ),
y_p(θ) = y_p0(θ) + r_p sin(θ),
x_i(θ) = x_i0(θ) + r_i cos(θ),
y_i(θ) = y_i0(θ) + r_i sin(θ),

where I(x, y) is the iris region image, (x, y) are the original Cartesian coordinates, (r, θ) are the corresponding normalized polar coordinates, and (x_p, y_p) and (x_i, y_i) are the coordinates of the pupil and iris boundaries along the θ direction. The process is inherently dimensionless in the angular direction. In the radial direction, the texture is
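Under the assumption that the pupil and iris boundaries are circles found during localization, the rubber sheet mapping of Equation (15) can be sketched as:

```python
import math

def rubber_sheet_point(r, theta, pupil_center, r_p, iris_center, r_i):
    """Map normalized polar coordinates (r in [0, 1], theta in radians) back to
    Cartesian image coordinates, following Daugman's rubber sheet model."""
    # Boundary points along direction theta for the pupil and iris circles
    xp = pupil_center[0] + r_p * math.cos(theta)
    yp = pupil_center[1] + r_p * math.sin(theta)
    xi = iris_center[0] + r_i * math.cos(theta)
    yi = iris_center[1] + r_i * math.sin(theta)
    # Linear interpolation between the two boundaries
    x = (1 - r) * xp + r * xi
    y = (1 - r) * yp + r * yi
    return x, y

# Concentric example: pupil radius 30, iris radius 100, both centered at (120, 120)
x, y = rubber_sheet_point(0.5, 0.0, (120, 120), 30, (120, 120), 100)
print(x, y)  # 185.0 120.0, halfway between the boundaries at theta = 0
```

Sampling this mapping over a fixed grid of (r, θ) values produces the fixed-size rectangular block described above, regardless of pupil dilation.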
1-D log-Gabor filter (i.e. multiplied with the 1-D log-Gabor filter in the frequency domain). The filtered row signal is transferred back to the spatial domain via the inverse fast Fourier transform (IFFT). The spatial domain signal is then transferred to a filtered image in the spatial domain, and hence the biometric code (template) is obtained from the filtered image. Figure 7 shows the step-by-step process of the 1-D log-Gabor filter feature extraction.

5. Matching

Once an iris image's relevant texture information is extracted, the resulting feature vector (iris template) is compared with enrolled iris templates. The template generated needs a corresponding matching metric, which gives a measure of similarity between two iris templates. This metric should give one range of values when comparing templates generated from the same eye, known as intra-class comparisons, and another range of values when comparing templates created from different irises, known as extra-class comparisons. These two cases should give distinct and separate values, so that a decision can be made with high confidence as to whether two templates are from the same iris, or from two different irises. The following subsections introduce some well-known matching metrics, and finally the scalar product (SP) method.

5.1 The Normalized Hamming Distance

The Hamming distance (HD) gives a measure of how many bits are the same between two bit patterns, and is especially suited to templates composed of binary values. Using the HD of two bit patterns, a decision can be made as to whether the two patterns were generated from different irises or from the same iris. For example, comparing the bit patterns P and Q, the HD is defined as the sum of disagreeing bits (the sum of the exclusive-OR between P and Q) over N, the total number of bits in each bit pattern. It is known as the normalized HD, and is defined as:

HD = (1/N) Σ_{i=1}^{N} (P_i ⊕ Q_i)    (17)

Since an individual iris region contains features with high degrees of freedom, each iris region will produce a bit pattern which is independent of that produced by another iris; on the other hand, two iris codes produced from the same iris will be highly correlated. In the case of two completely independent bit patterns, such as iris templates generated from different irises, the HD between the two patterns should equal 0.5. This occurs because independence implies that the two bit patterns will be totally random, so there is a 0.5 chance of setting any bit to 1, and likewise to 0. Therefore, half of the bits will agree and half will disagree between the two patterns. If two patterns are derived from the same iris, the HD between them will be close to 0.0, since they are highly correlated and the bits should agree between the two iris codes.
Daugman [8] uses this matching metric as follows: the simple Boolean exclusive-OR operator (XOR) is applied to the 2048-bit phase vectors that encode any two iris patterns, masked (AND'ed) by both of their corresponding mask bit vectors to prevent non-iris artifacts from influencing iris comparisons. The XOR operator detects disagreement between any corresponding pair of bits, while the AND operator ensures that the compared bits are both deemed to have been uncorrupted by eyelashes, eyelids, specular reflections, or other noise. The norms of the resultant bit vector and of the AND'ed mask vectors are then measured in order to compute the fractional HD, as the measure of dissimilarity between any two irises, whose two phase code bit vectors are denoted codeP, codeQ and whose mask bit vectors are denoted maskP, maskQ:

HD = ‖(codeP ⊕ codeQ) ∩ maskP ∩ maskQ‖ / ‖maskP ∩ maskQ‖    (18)

The denominator tallies the total number of phase bits that mattered in iris comparisons after artifacts such as eyelashes, eyelids, and specular reflections were discounted, so the resulting HD is a fractional measure of dissimilarity; 0.0 would represent a perfect match.

5.2 The Weighted Euclidean Distance

The weighted Euclidean distance (WED) can be used to compare two templates, and is especially suited to templates composed of integer values. It gives a measure of how similar a collection of values is between two templates. This metric is employed by Zhu et al. [37] and is defined as:

WED(k) = Σ_{i=1}^{N} (f_i − f_i^(k))² / (δ_i^(k))²    (19)

where f_i is the i-th feature of the unknown iris, f_i^(k) is the i-th feature of iris template k, and δ_i^(k) is the standard deviation of the i-th feature in iris template k. The unknown iris template is found to match iris template k when the WED is a minimum at k.
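The two metrics above can be sketched in a few lines. This is a minimal illustration under assumed array shapes (2048-bit codes, all-ones masks), not the authors' implementation:

```python
import numpy as np

def fractional_hd(codeP, codeQ, maskP, maskQ):
    """Masked fractional Hamming distance of equation (18)."""
    valid = maskP & maskQ                 # bits deemed uncorrupted in both codes (AND)
    disagree = (codeP ^ codeQ) & valid    # disagreeing bits among the valid ones (XOR)
    return disagree.sum() / valid.sum()

def wed(f_unknown, f_template, std_template):
    """Weighted Euclidean distance of equation (19) against one enrolled template k."""
    return np.sum((f_unknown - f_template) ** 2 / std_template ** 2)

rng = np.random.default_rng(0)
a = rng.integers(0, 2, 2048, dtype=np.uint8)     # two independent 2048-bit codes
b = rng.integers(0, 2, 2048, dtype=np.uint8)
mask = np.ones(2048, dtype=np.uint8)
print(fractional_hd(a, a, mask, mask))           # 0.0  (same iris: perfect match)
print(fractional_hd(a, b, mask, mask))           # close to 0.5 (independent irises)
```

An identification decision then reduces to thresholding the HD, or to taking the enrolled template k that minimizes the WED.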
... 10], template from the database of 10 elements, it will work as shown in Table 1.

... between the compared iris template and template number 80, hence the two are templates of the same iris image.

... database respectively, and was found to give good correct recognition rates compared to other matching methods, as shown in Table 2.
Table 2: The correct recognition rates achieved by three matching measures using the CASIA and UBIRIS databases.

Matching measure    Correct recognition rate (CRR) %
WED                 98.73
SP                  98.26
HD                  98.22
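The scalar product (SP) measure compared in Table 2 is the cosine of the angle between two templates, cos(θ) = (P · Q) / (‖P‖ ‖Q‖). A minimal sketch, with made-up template values for illustration:

```python
import numpy as np

def sp_match(p, q):
    """Scalar-product matching: cos(theta) between two real-valued templates."""
    return np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))

t = np.array([0.2, -1.3, 0.7, 2.1])
print(round(sp_match(t, t), 6))     # 1.0: identical templates, i.e. the same iris
print(round(sp_match(t, -t), 6))    # -1.0: maximally dissimilar templates
```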
Figure 11: The matching of the (Img 2 1 4) iris image from the UBIRIS database with template number 9 from 150 templates, where, as shown, cos(θ) = 1 between the compared iris template and template number 9; hence the two are templates of the same iris image.

Figure 8: The ROC curves obtained for three different matching measures using the CASIA database.
cos(θ) = 0.18 between the compared iris template and template number 145, hence the two templates are not very similar, and they are not templates of the same iris image.

7. Conclusion

Here we have presented an active contour model, in order to compensate for the iris detection error caused by two circular edge detection operations. After perfect iris localization, the segmented iris region is normalized (transformed into polar coordinates) to eliminate dimensional inconsistencies between iris regions. This was achieved by using Daugman's rubber sheet model, where the iris is modeled as a flexible rubber sheet, which is unwrapped into a rectangular block with constant polar dimensions of (20 × 240) elements.
The next stage is to extract the features of the iris from the normalized iris region. This was done by the convolution of the 1-D log-Gabor filters with the normalized iris region. After that, the convolved iris region is reshaped into a template of (1 × 4800) real-valued elements.
Finally, the scalar product matching scheme is used, which gives cos(θ) between two templates. If cos(θ) = 1 between two templates P and Q, this means that the two templates were deemed to have been generated from the same iris; otherwise they have been generated from different irises.

References
[1] A. Abhyankara, S. Schuckers, A novel biorthogonal wavelet network system for off-angle iris recognition, Pattern Recognition, 43 (2010), 987-1007.
[2] A. Kumar, A. Passi, Comparison and combination of iris matchers for reliable personal authentication, Pattern Recognition, 43 (2010), 1016-1026.
[3] B. Kumar, C. Xie, J. Thornton, A. Bovik, Iris verification using correlation filters, Proceedings of 4th International Conference on Audio- and Video-Based Biometric Person Authentication, (2003), 697-705.
[4] B. J. Joung, C. H. Chung, K. S. Lee, W. Y. Yim and S. H. Lee, On Improvement for Normalizing Iris Region for a Ubiquitous Computing, Proceedings of International Conference on Computational Science and Its Applications (ICCSA), Singapore, (2005), 1213-1219.
[5] C. Tisse, L. Martin, L. Torres, M. Robert, Person Identification Technique Using Human Iris Recognition, Proc. Vision Interface, (2002), 294-299.
[6] E. M. Arvacheh, A Study of Segmentation and Normalization for Iris Recognition Systems, University of Waterloo, Waterloo, Ontario, Canada, (2006).
[7] J. Daugman, Demodulation by Complex-valued Wavelets for Stochastic Pattern Recognition, International Journal of Wavelets, Multiresolution and Information Processing, 1 (1) (2003), 1-17.
[8] J. Daugman, How Iris Recognition Works, IEEE Transactions on Circuits and Systems for Video Technology, 14 (1) (2004), 21-30.
[9] J. Daugman, The Importance of Being Random: Statistical Principles of Iris Recognition, Pattern Recognition, 36 (2) (2003), 279-291.
[10] J. Daugman, High Confidence Visual Recognition of Persons by a Test of Statistical Independence, IEEE Transactions on Pattern Analysis and Machine Intelligence, 15 (11) (1993), 1148-1160.
[11] J. Daugman, Statistical Richness of Visual Phase Information: Update on Recognizing Persons by Iris Patterns, International Journal of Computer Vision, 45 (1) (2001), 25-38.
[12] J. Havlicek, D. Harding, A. Bovik, The multi-component AM-FM image representation, IEEE Trans. Image Process., 5 (1996), 1094-1100.
[13] J. R. Matey, R. Broussard, L. Kennell, Iris image segmentation and sub-optimal images, Image and Vision Computing, 28 (2010), 215-222.
[14] J. Huang, Y. Wang, T. Tan, J. Cui, A new iris segmentation method for recognition, Proceedings of the 17th International Conference on Pattern Recognition (ICPR), 3 (2004), 554-557.
[15] J. Huang, Y. Wang, T. Tan, J. Cui, A new iris segmentation method for recognition, Proceedings of the 17th International Conference on Pattern Recognition (ICPR), 3 (2004), 554-557.
[16] K. Bae, S. Noh, J. Kim, Iris feature extraction using independent component analysis, Proceedings of 4th International Conference on Audio- and Video-Based Biometric Person Authentication, (2003), 838-844.
[17] K. W. Bowyer, K. Hollingsworth, P. J. Flynn, Image understanding for iris biometrics: A survey, Computer Vision and Image Understanding, 110 (2008), 281-307.
[18] L. Flom, A. Safir, Iris recognition system, U.S. Patent 4,641,349, (1987).
[19] L. Ma, T. Tan, Y. Wang, D. Zhang, Efficient Iris Recognition by Characterizing Key Local Variations, IEEE Transactions on Image Processing, 13 (2004), 739-750.
[20] L. Ma, T. Tan, Y. Wang and D. Zhang, Personal identification based on iris texture analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25 (12) (2003), 1519-1533.
[21] L. Masek, P. Kovesi, MATLAB Source Code for a Biometric Identification System Based on Iris Patterns, The University of Western Australia, (2003).
[22] M. Almualla, The UAE Iris Expellees Tracking and Border Control System, in: Biometrics Consortium, September, Crystal City, VA, (2005).
[23] N. Duta, A survey of biometric technology based on hand shape, Pattern Recognition, 42 (2009), 2797-2806.
[24] N. Ritter, Location of The Pupil-iris Border in Slit-lamp Images of The Cornea, Proceedings of the International Conference on Image Analysis and Processing, (1999).
[25] N. Ritter, J. R. Cooper, Locating the iris: A first step to registration and identification, Proceedings of the 9th
of fitness estimation. Section 3 describes the proposed constructive multi-objective RNN. Section 4 presents the experimental results obtained on the classification of the TIMIT vowels.

2. Multi-objective optimization

In this section, we briefly present the formulation of a multi-objective optimization (MOO) problem, together with some required notions about Pareto-based multi-objective optimization and some concepts relating to Pareto optimality [2], [12].
The scenario considered in this paper involves an arbitrary optimization problem with k objectives, which are, without loss of generality, all to be minimized and all equally important, i.e., no additional knowledge about the problem is available. We assume that a solution to this problem can be described in terms of a decision vector denoted by:

x = (x1, x2, ..., xn)    (1)

where x1, x2, ..., xn are the variables of the problem.
Mathematically, the multi-objective optimization problem is stated as:

MOO: min F(x) = (f1(x), f2(x), ..., fk(x)), s.t. x ∈ C    (2)

where the fi are the decision criteria and k is the number of objective functions. An optimization problem searches for the solution x* for which the constraints C are satisfied and the objective function F(x) is optimized.
In practical applications, there is no single solution that can minimize all of the k objectives. As a result, MOO problems tend to be characterized by a family of alternative solutions.
The approach most used is to weight and sum the separate fitness values in order to produce just a single fitness value for every solution, thus allowing the GA to determine which solutions are fittest as usual. However, as noted by Goldberg [14], the separate objectives may be difficult or impossible to weight manually because of unknowns in the problem. Additionally, weighting and summing could have a detrimental effect upon the evolution of acceptable solutions by the GA (just a single incorrect weight can cause convergence to an unacceptable solution).
The concept of Pareto optimality helps to overcome this problem of comparing solutions with multiple fitness values. A solution is Pareto optimal if it is not dominated by any other solution, where dominance is defined as follows: a decision vector x is said to dominate a decision vector y if and only if f_i(x) ≤ f_i(y) for all i ∈ {1, ..., k} and f_j(x) < f_j(y) for at least one j ∈ {1, ..., k}. The decision vector x is Pareto optimal if and only if x is non-dominated [5].
The Pareto approach is based on two aspects: ranking and selection. The ranking methods are the following:
- NDS (Non-Dominated Sorting): in this method, the rank of an individual is the number of solutions dominating this individual plus one [12].
- WAR (Weighted Average Ranking): in this method, population members are ranked separately according to each objective function. A fitness equal to the sum of the ranks in each objective is assigned [2].
- NSGA (Non-dominated Sorting Genetic Algorithm) [21]: in this method, all non-dominated individuals of the population have rank 1. These individuals are then removed, and the next set of non-dominated individuals is identified and assigned the next rank [21].
Several selection methods based on the concept of dominance are:
- Tournament-based selection [2]: at each tournament, two individuals A and B compete against a set of t_dom individuals in the population. If competitor A dominates all of these individuals while competitor B is dominated by at least one individual, then individual A is selected.
- Pareto reservation strategy [5]: in this method, the non-dominated individuals are always saved to the next generation.
- Ranking method [2]: the cost associated with a new individual is determined by its relative distance in objective space with respect to the non-dominated individuals of the current population.

3. Recurrent neural networks design by means of multi-objective genetic algorithm

We shall now tackle the problem of finding an RNN having the smallest recognition error and the least number of hidden units. For this reason, we formulate the problem as an optimisation problem, more specifically as a MOO problem. In order to solve it we shall use an algorithm based on Pareto optimality that
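The dominance relation and the NDS ranking defined in Section 2 can be sketched as follows (minimization of every objective; the example objective vectors are made up):

```python
def dominates(x_obj, y_obj):
    """x dominates y: no worse in every objective and strictly better in at least one."""
    return (all(fx <= fy for fx, fy in zip(x_obj, y_obj))
            and any(fx < fy for fx, fy in zip(x_obj, y_obj)))

def nds_rank(population):
    """NDS ranking [12]: rank of an individual = number of solutions dominating it + 1."""
    return [1 + sum(dominates(q, p) for q in population) for p in population]

def pareto_front(population):
    """The Pareto optimal (non-dominated) solutions, i.e. the individuals of rank 1."""
    return [p for p, r in zip(population, nds_rank(population)) if r == 1]

pop = [(1, 5), (2, 2), (4, 1), (3, 3)]     # four solutions, two objectives each
print(nds_rank(pop))       # [1, 1, 1, 2]: only (3, 3) is dominated, by (2, 2)
print(pareto_front(pop))   # [(1, 5), (2, 2), (4, 1)]
```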
severity of mutation and the number of hidden units to add.

T(i) = 1 − Score(i) / Σ_{k=1}^{N} Score(k)    (3)

HU(i) = HU_min + T(i) (HU_max − HU_min)    (4)

where Score(i) represents the score of the i-th individual, HU_min and HU_max are respectively the minimum and maximum numbers of hidden units to be added, and a random value between 0 and 1 is used.
Once the number of units to be added is determined, we modify the network structure under the new constraints, and the connection weights of these units are randomly initialized.
- Remove hidden units: this type of mutation is used to remove the hidden units that do not contribute to improving the recognition of the network. The process of deleting a hidden unit occurs as follows. In the first step, we seek the inactive unit among the hidden units of the network. This is done by calculating the score of each hidden unit using equation (5), which computes the difference in score of the RNN with and without the hidden unit. The unit having the lowest fitness is eliminated.

S_u(i) = Score(i) − Score_u(i)    (5)

where Score(i) is the generalisation rate of the i-th RNN and Score_u(i) is the generalisation rate of the i-th RNN without the u-th unit.

3.4 Multi-objective optimization

A promising approach for solving optimization problems is the MOGA, which aims at producing Pareto optimal solutions [11]. The key concept here is dominance. However, the success of a Pareto optimal GA depends largely on its ability to maintain diversity. Usually, this is achieved by employing niching techniques such as fitness sharing [5] and the inclusion of some useful measures applied to other models, such as negative correlation or mutual information [17]. The MOGA employed in this work can be described as a niched Pareto GA with NSGA [21] and tournament selection [2]. The algorithm uses a specialised tournament selection approach based on the concept of dominance.
The proposed algorithm is based on the concept of Pareto optimality [19]. We consider a population of networks where the i-th individual is characterised by a vector of objective values. The population has N individuals and M objectives are considered. In our study, three objectives are considered. In this paper, we define the following objectives:
- Objective of performance: the performance of an RNN is given by its generalization rate.
- Mutual information: the mutual information between RNNs f_i and f_j is given by equation (6):

O_MI(f_i, f_j) = −(1/2) log(1 − ρ_ij²)    (6)

where ρ_ij is the correlation coefficient between the networks. The objective is the average of the mutual information between each pair of RNNs [18].
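Equations (3), (4) and (6) can be sketched as follows. The score values and unit bounds are made up, and the random factor mentioned in the text is left out for determinism:

```python
import math

def units_to_add(scores, i, hu_min, hu_max):
    """Eqs. (3)-(4): the worse an individual scores, the more hidden units it receives."""
    T = 1 - scores[i] / sum(scores)                 # Eq. (3)
    return round(hu_min + T * (hu_max - hu_min))    # Eq. (4), without the random factor

def mi_objective(rho):
    """Eq. (6): mutual information between two RNNs with correlation coefficient rho."""
    return -0.5 * math.log(1 - rho ** 2)

scores = [0.9, 0.6, 0.5]                     # generalization rates of three RNNs
print(units_to_add(scores, 2, 1, 5))         # 4: the weakest network grows the most
print(round(mi_objective(0.5), 4))           # 0.1438: weakly correlated networks
```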
4. Experimental results

In this section, we evaluate and compare the described models and the proposed evolutionary constructive RNN for continuous speech recognition on the macro-class of vowels of the TIMIT speech corpus [1].

4.1 Database description

The third component is a phoneme recognition module. The speech database used is the DARPA TIMIT acoustic-phonetic continuous speech corpus, which contains the vowels: /iy/, /ih/, /eh/, /ey/, /ae/, /aa/, /aw/, /ay/, /ah/, /ao/, /oy/, /ow/, /uh/, /uw/, /ux/, /er/, /ax/, /ix/, /axr/ and /ax-h/. The corpus contains 13,699 phonetic units for training and 4,041 phonemes for testing.
Speech utterances were sampled at a rate of 16 kHz using 16-bit quantization. Speech frames are filtered by a first-order filter. After the pre-emphasis, the speech data consist of a large number of samples that represent the original utterance. Windowing is introduced to process these samples effectively. This is done by regrouping the speech data into several frames. A 256-sample window that captures 16 ms of speech information is used. To prevent information loss during the process, an overlapping factor of 50% is introduced between adjacent frames. Thereafter, mel-frequency cepstral analysis was applied to extract 12 mel cepstrum coefficients (MFCC) [8].
Among all parameterization methods, the cepstrum has been shown to be favourable in speech recognition and is widely used in many automatic speech recognition systems [23]. The cepstrum is defined as the inverse Fourier transform of the logarithm of the short-term power spectrum of the signal. The use of a logarithmic function permits us to deconvolve the vocal tract transfer function and the voice source. Consequently, the pulse sequence originating from the periodic voice source reappears in the cepstrum as a strong peak in the quefrency domain. The derived cepstral coefficients are commonly used to describe the short-term spectral envelope of a speech signal. The advantage of using such coefficients is that they induce a data compression of each speech spectral vector while maintaining the pertinent information it contains. The mel scale is a mapping from a linear to a nonlinear frequency scale based on human auditory perception. Such a scale has been shown to increase significantly the performance of speech recognition systems in comparison with the traditional linear scale.
The computation of the MFCC requires the selection of M critical bandpass filters. To obtain the MFCC, a discrete cosine transform is applied to the output of the M filters. These filters are triangular and cover the 156-6844 Hz frequency range; they are spaced on the mel-frequency scale. This scale is logarithmic above 1 kHz and linear below this frequency. The filters are applied to the log of the magnitude spectrum of the signal, which is estimated on a short-time basis.

4.2 Discussion

In the experiments below, the number of hidden units for the networks of the initial population was selected uniformly between 1 and 5. Each network has 12 input units representing the 12 MFCC coefficients and 20 output units representing the TIMIT vowels. Table 1 presents the parameter settings.
In this section, results produced by the proposed model will be presented and compared with results produced by the Elman model using 30 hidden units, the GA, and the Elman model using 16 hidden units (the best topology given by the proposed model).

Table 1: Learning parameters of the proposed model

Parameter name                                          Value
Learning rate for the training of the Elman model       0.5
Number of epochs for the training of the Elman model    100
Mutation rate for the standard GA                       0.8
Crossover rate for the standard GA                      0.4
Structural mutation rate                                0.2
Parametric mutation rate                                0.3
Number of generations of the population of networks     20

The learning process of the GA used for comparison is the following. First, a population of chromosomes is created and initialised randomly. Then, roulette selection is used to select the individuals to be reproduced. Thereafter, a one-point crossover operator is used to produce new individuals. During the crossover process, pairs of genomes are mated by taking a randomly selected string of bits from one and inserting it into the corresponding place in the other, and vice versa. After that, a classic mutation operator is applied to these individuals. The classic mutation operator exchanges a randomly selected gene with a random value within the range of the gene's minimum value and the gene's maximum value. 40% of the best individuals are guaranteed a place in the new generation. This process is repeated for 100 generations.
The best structure of the RNN provided by the proposed model is composed of 16 hidden units. We use the back-propagation algorithm to train an RNN using this structure. We note that, using this network, recognition rates and run time are greatly improved compared with those given by the RNN using 30 hidden units (see Tables 2 and 3). We conclude that the proposed constructive evolutionary process achieves the objective of finding the best structure of an RNN.
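The pre-emphasis and framing steps of Section 4.1 can be sketched as follows. The pre-emphasis coefficient 0.97 is a typical value assumed here; the paper does not state its coefficient:

```python
import numpy as np

def preemphasize(x, alpha=0.97):
    """First-order pre-emphasis y[n] = x[n] - alpha * x[n-1] (alpha is an assumption)."""
    return np.append(x[0], x[1:] - alpha * x[:-1])

def frame_signal(x, frame_len=256, overlap=0.5):
    """Split into 256-sample frames (16 ms at 16 kHz) with 50% overlap between frames."""
    hop = int(frame_len * (1 - overlap))                   # 128-sample hop
    n = 1 + (len(x) - frame_len) // hop                    # number of full frames
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

x = preemphasize(np.random.default_rng(0).standard_normal(16000))   # 1 s at 16 kHz
frames = frame_signal(x)
print(frames.shape)     # (124, 256): 124 overlapping 16 ms frames
```

Each frame would then be windowed and passed through the mel filter bank and the DCT to yield the 12 MFCCs.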
Tables 2 and 3 present a comparison of the training rates, generalization rates and run times of the studied models. The Elman model using 30 hidden neurons provides the lowest recognition rate and the greatest runtime, of about 10 hours. The GA gives better recognition rates than those given by the Elman model using 30 hidden units, and it requires only 3 hours 30 minutes.
Furthermore, we note that the proposed model provides the best training rate, of about 58.79%, and the best generalisation rate, of about 58.38%. In addition, it improves the recognition rate of most of the phonemes, such as /ey/ (18% rather than 2%) and /ay/ (39% rather than 8%). We conclude, then, that the proposed multi-objective constructive model improves the training of the RNN. Furthermore, it should be noted that the proposed model takes 7 hours for training. This is justified by the fact that we use several objectives.
... recurrent neural networks. This algorithm is able to reach a wider set of possible RNN structures. We have shown that this model is able to achieve good performance in the recognition of the TIMIT vowels, outperforming the other studied methods.
The main results are as follows:
- The best RNN structure produced by the proposed model gives a better recognition rate at a lower runtime.
- The proposed model improves the recognition rate of the TIMIT vowel macro-class by about 15% compared with the Elman model.
We suggest extending the constructive method to determine the optimal number of hidden layers and the number of hidden units in each one.

Table 2: Training rates of the Elman model using 30 hidden units, the GA, the Elman model using 16 hidden units and the RNND-MOGA model

Vowels       Samples   Elman (30 hidden units)   GA       Elman (16 hidden units)   RNND-MOGA
iy           1552      77.83                     85.5     77.19                     84.99
ih           1103      11.6                      18.04    17.32                     41.52
eh           946       28.43                     25.58    28.65                     57.19
ey           572       2.27                      1.40     0.35                      17.83
ae           1038      77.84                     86.71    74.95                     84.49
aa           762       71.39                     72.57    66.01                     80.18
aw           180       0.00                      0.56     0.00                      5.00
ay           600       7.67                      17.33    1                         38.83
ah           580       7.07                      7.41     23.79                     12.41
ao           665       64.36                     72.03    62.86                     83.16
oy           192       0.00                      0.00     0.00                      0.00
ow           549       14.39                     29.87    41.71                     28.05
uh           141       0.00                      0.00     0.00                      0.00
uw           198       47.98                     20.71    50.51                     66.67
ux           400       2.25                      1.00     2.00                      11.25
er           392       8.42                      16.58    8.67                      37.24
ax           871       38.35                     47.19    38.12                     57.41
ix           2103      71.85                     70.28    66.14                     84.31
axr          739       52.23                     63.46    54.26                     64.68
axh          86        37.21                     34.88    62.79                     38.37
Global rate  13966     43.63                     47.68    44.29                     58.79
Runtime                10h20mn                   3h30mn   4h                        7h

Table 3: Generalization rates of the Elman model using 30 hidden units, the GA, the Elman model using 16 hidden units and the RNND-MOGA model

Vowels       Samples   Elman (30 hidden units)   GA       Elman (16 hidden units)   RNND-MOGA
iy           522       72.22                     83.33    72.41                     86.02
ih           327       8.26                      12.23    16.51                     34.86
eh           279       30.83                     22.94    24.01                     63.44
ey           162       1.24                      1.85     0.00                      20.99
ae           237       73.1                      86.92    73                        81.43
aa           237       62.87                     59.49    54.85                     74.68
aw           30        0.00                      0.00     0.00                      0.00
ay           168       2.38                      17.86    0.00                      41.07
ah           183       8.74                      9.84     21.86                     12.02
ao           222       59.91                     64.41    54.96                     82.88
oy           51        0.00                      0.00     0.00                      0.00
ow           171       9.94                      26.32    35.09                     22.81
uh           59        0.00                      0.00     0.00                      0.00
uw           51        31.37                     11.76    23.53                     39.22
ux           104       2                         2.88     2.88                      8.65
er           141       3.55                      16.31    4.96                      36.88
ax           249       50.2                      61.85    47.39                     67.07
ix           610       67.21                     69.18    60.98                     81.97
axr          210       55.72                     65.71    46.19                     69.05
axh          28        32.14                     39.29    39.29                     28.57
Global rate  4042      41.28                     46.57    40.68                     58.38
Acknowledgments

References
[1] http://www.ldc.upenn.edu/Catalog/readme_files/timit.readme.html
[2] P.J. Angeline, G.M. Saunders and J.B. Pollack, An evolutionary algorithm that constructs recurrent neural networks, IEEE Transactions on Neural Networks (1993).
[3] N. Arous, Hybridation des cartes de Kohonen par les algorithmes génétiques pour la classification phonémique, Ph.D. thesis, ENIT, 2003.
[4] H. Azzag, F. Picarougne, C. Guinot and G. Venturini, Un survol des algorithmes biomimétiques pour la classification, Revue des nouvelles technologies de l'information (RNTI) (2004), 13-24.
[5] D. Beasley and R. Martin, A sequential niche technique for multimodal function optimization, Conference on Evolutionary Computation 1 (1993), 101-125.
[6] P.A. Castillo, J.J. Merelo, M.G. Arenas and G. Romero, Comparing evolutionary hybrid systems for design and optimization of multilayer perceptron structure along training parameters, Information Sciences 177 (2007), 2884-2905.
[7] R. Chandra, M. Frean and M. Zhang, Building Subcomponents in the Cooperative Coevolution Framework for Training Recurrent Neural Networks, School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand, 2009.
[8] M. Chetouani, B. Gas and J.L. Zarader, Une architecture modulaire pour l'extraction de caractéristiques en reconnaissance de phonèmes, International Conference on Neural Information Processing (ICONIP'02) (2002).
[9] H. Chihi and N. Arous, Adapted evolutionary recurrent neural network, JTEA (2010).
[10] D. Dasgupta and D.R. McGregor, Designing application-specific neural networks using the structured genetic algorithm.
[11] S. Dehuri and S.-B. Cho, Multi-criterion Pareto based particle swarm optimized polynomial neural network for classification: A review and state-of-the-art, Computer Science Review 3 (2009), 19-40.
[12] M. Delgado and M.C. Pegalajar, A multiobjective genetic algorithm for obtaining the optimal size of a recurrent neural network for grammatical inference, Pattern Recognition 38 (September 2005), 1444-1456.
[13] N. Garcia and C.J. Hervas, Multi-objective cooperative coevolution of artificial neural networks, Neural Networks 15 (2002), 1259-1278.
[14] D.E. Goldberg, Algorithmes génétiques : exploration, optimisation et apprentissage automatique, Kluwer Academic Publishers, 1996.
[15] J.R. Koza and J.P. Rice, Genetic generation of both the weights and architecture for a neural network, Proceedings of the International Joint Conference on Neural Networks (1991), 397-404.
[16] L. Kuncheva and C.J. Whitaker, Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy, Machine Learning 51 (2003), 181-207.
[17] Y. Liu and X. Yao, Ensemble learning via negative correlation, Neural Networks 12 (1999), 1399-1404.
[18] Y. Liu, X. Yao, Q. Zhao and T. Higuchi, Evolving a cooperative population of neural networks by minimizing mutual information, Proceedings of the 2001 IEEE Congress on Evolutionary Computation (2001), 384-389.
[19] K. Maneeratana, K. Boonlong and N. Chaiyaratana, Multi-objective Optimisation by Co-operative Co-evolution, PPSN VIII: Parallel Problem Solving from Nature (2004), 772-781.
[20] R.T. Marler and J.S. Arora, Survey of multi-objective optimization methods for engineering, Struct. Multidisc. Optim. 26 (2004), 369-395.
[21] N. Srinivas and K. Deb, Multi-objective function optimization using non-dominated sorting genetic algorithms, Evolutionary Computation 2 (1994), 221-248.
[22] E.G. Talbi, Métaheuristiques pour l'optimisation combinatoire multi-objectif : état de l'art, PM2O'1999 (1999).
[23] L. Tcheeko, Un réseau de neurones pour la classification et la reconnaissance de la parole, École nationale supérieure polytechnique (1994), 277-280.
[24] R. Tlemsani, N.R. Tlemsani, N. Neggaz and A. Benyettou, Amélioration de l'apprentissage des réseaux neuronaux par les algorithmes évolutionnaires : application à la classification phonétique, SETIT (2005).
[25] S. Kazarlis, V. Petridis and A. Papaikonomou, A genetic algorithm for training recurrent networks, Proceedings of IJCNN'93 (1993), 2706-2709.
[26] M. Zhang and V. Ciesielski, Using back propagation algorithm and genetic algorithms to train and refine neural networks for object detection, Database and Expert Systems Applications, 10th International Conference, 1677 (1999), 626-635.
[27] A. Zinflou, Système interactif d'aide à la décision basé sur des algorithmes génétiques pour l'optimisation multi-objectifs, Master's thesis, Université du Québec, 2004.

Hanen Chihi received the computer science engineering degree from the Institut Supérieur d'Informatique (ISI), Tunis, Tunisia, and the MS degree in Software Engineering (Intelligent Imaging Systems and Artificial Vision) from ISI, Tunisia. She is currently working towards the Ph.D. degree in Tunisia. Her research interests include optimization, pattern classification and evolutionary neural networks.

Najet Arous received the computer science engineering degree from the École Nationale des Sciences d'Informatique, Tunis, Tunisia, the MS degree in electrical engineering (signal processing) from the École Nationale d'Ingénieurs de Tunis (ENIT), Tunisia, and the Ph.D. degree in electrical engineering (signal processing) from ENIT. She is currently an assistant master in the computer science department at FSM, Tunisia. Her research interests include scheduling optimization, speech recognition and evolutionary neural networks.
1 Department of Computer System and Communication, Faculty of Computer Science & Information Systems, Universiti Teknologi Malaysia, Johor Bahru, Malaysia; Suan Dusit Rajabhat University, Bangkok, Thailand
2 Department of Computer System and Communication, Faculty of Computer Science & Information Systems, Universiti Teknologi Malaysia, Johor Bahru, Malaysia
3 Department of Computer System and Communication, Faculty of Computer Science & Information Systems, Universiti Teknologi Malaysia, Johor Bahru, Malaysia; STMIK Indonesia Padang, Padang, Indonesia

2 Pondicherry Engineering College, Pondicherry-605014
3 Pondicherry Engineering College, Pondicherry-605014
is more relevant to the user query. The classification of multi-document summarization techniques is shown in Figure 1; a brief description of each technique is given below.

2.1 Generic Summary Extraction Techniques

The RANDOM-based technique [9] is the simplest one: it randomly selects lines from the input source documents, and depending upon the compression rate, i.e. the size of the summary, the randomly selected lines are included in the summary. In this technique a random value between 0 and 1 is assigned to each sentence of the document, and a threshold value for sentence length is generally provided; sentences that do not meet the length cut-off are scored 0. Finally, the highest-scored sentences are chosen to form the desired summary.

Fig. 1 Classification of summarization techniques

The LEAD-based technique [9] is one in which the first, or the first and last, sentences of each paragraph are chosen depending upon the compression rate (CR); it is well suited to news articles. Typically n% of the sentences are chosen from the beginning of the text, e.g. selecting the first sentence of every document, then the second sentence of each, and so on, until the desired summary is constructed. In this technique a score of 1/n is assigned to each sentence, where n is the sentence number in the corresponding document file; thus the first sentence of every document gets the same score, the second sentence of every document gets the same score, and so on. A length threshold is also provided, and sentences shorter than the specified threshold are thrown out.

MEAD is a commonly used technique which can perform many different summarization tasks, and it can summarize individual documents or clusters of related documents. MEAD includes two baseline summarizers: lead-based and random-based. Lead-based summaries are produced by selecting the first sentence of each document, then the second sentence of each, etc., until the desired summary size is met; a random summary consists of enough randomly selected sentences (from the cluster) to produce a summary of the desired size. MEAD itself is a centroid-based extractive summarizer that scores sentences on sentence-level and inter-sentence features indicating the quality of each sentence as a summary sentence, and then chooses the top-ranked sentences for inclusion in the output summary. MEAD extractive summaries score sentences according to the sentence features Centroid [9], Position [9], and Length [9].

Dragomir R. Radev et al. [1] proposed the multi-document text summarizer MEAD. The system creates the summary based on cluster centroids, where a centroid is the set of words that are most important to the cluster. In addition to the centroid, position and first-sentence overlap values are involved in the score calculation. Two new techniques, namely cluster-based relative utility and cross-sentence information subsumption, were applied to the evaluation of both single- and multiple-document summaries. Cluster-based relative utility refers to the degree of relevance of a particular sentence to the general topic of the cluster. Summarization evaluation methods can be divided into two categories: intrinsic and extrinsic. Intrinsic evaluation measures the quality of multi-document summaries directly, whereas extrinsic evaluation measures how successfully the summaries help in performing a particular task; extrinsic evaluation is therefore also called task-based evaluation. The new utility-based technique, CBSU, was used for the evaluation of MEAD and of summarizers in general. It was found that MEAD produces summaries similar in quality to those produced by humans. MEAD's performance was compared to an alternative method, multi-document lead, and it was shown how MEAD's sentence-scoring weights can be modified to produce summaries significantly better than the alternatives.

Afnan Ullah Khan et al. [3] proposed a new technique for information summarization that combines rhetorical structure theory with the MEAD summarizer. In general the MEAD summarizer is based entirely on mathematical calculation and lacks a knowledge base; rhetorical structure theory is used to overcome this weakness. The new summarizer system was evaluated against the original MEAD system, mainly in two areas of information: financial articles and PubMed abstracts. The experimental results show that MEAD produces successful summaries 75% of the time for both
short and long documents, whereas MRST produces successful summaries for short documents 70% of the time and for long documents 65% of the time; as the size of the document increases, the performance of MRST deteriorates.

The two-stage sentence selection approach proposed by Zhang Shu et al. [4] is based on deleting sentences from a candidate sentence set to generate the summary. The two stages are (1) acquisition of a candidate sentence set and (2) optimum selection of sentences. The candidate sentence set is obtained by a redundancy-based sentence selection approach in the first stage, whereas in the second stage an optimum sentence selection technique is used to delete sentences from the candidate set according to their contribution to the whole set, until the desired summary length is met. The approach thus differs from the traditional method of adding sentences to create a summary, in that it deletes sentences from a set of candidates instead. The influence of the chosen token in the two-stage approach on the quality of the generated summaries is also analysed. With the DUC 2004 test corpus, and compared to redundancy-based sentence selection, the experiments show that the two-stage approach increases the ROUGE scores of the summaries, which demonstrates the validity of the proposed approach.

Dingding Wang et al. [7] proposed a summarization system based mainly on sentence-level semantic analysis and non-negative matrix factorization. Sentence-sentence similarity is calculated using semantic analysis and a similarity matrix is constructed; a symmetric matrix factorization process is then used to group similar sentences into clusters. Experimental results on the DUC2005 and DUC2006 datasets show higher performance.

Ben Hachey [8] proposed a summarization system based on generic relation extraction (GRE). A GRE system builds models for relation identification and characterization that can be transferred across domains and tasks without any modification of model parameters. Relation identification is the extraction of relation-forming entity mention pairs, whereas relation characterization is the assignment of types to relation mentions. Experimental results show that the proposed system's performance is slightly superior to that of the existing system.

Md. Mohsin Ali et al. [9] proposed two techniques for both single- and multi-document text summarization. The first adds a new feature called SimWithFirst (Similarity with First Sentence) to MEAD (a combination of the Centroid, Position, and Length features) and is called CPSL; the second is the combination of LEAD and CPSL, called LESM. In general, LEAD is the summarization technique in which the first, or the first and last, sentences of a paragraph are chosen depending upon the compression rate (CR). The results of the proposed techniques are compared with the conventional MEAD method with respect to several evaluation measures. The results demonstrate that CPSL performs better than MEAD for short summaries and is otherwise almost similar to MEAD; LESM also performs better than MEAD for short summaries but does not outperform MEAD in the remaining cases.

Shu Gong et al. [11] proposed a Subtopic-based Multi-document Summarization (SubTMS) method. The method adopts a probabilistic topic model to discover the subtopic information inside each sentence and uses a hierarchical subtopic structure to represent both the whole document collection and all the sentences inside it. With sentences represented as subtopic vectors, it assesses the semantic distance of each sentence from the document collection's main subtopics and selects the sentences with the shortest distances as the final summary. By training each topic's document collection with other topics' document collections as background knowledge, the approach achieves fairly better ROUGE scores than other peer systems in experiments on the DUC2007 dataset.

A. Kogilavani et al. [12] proposed an approach that clusters multiple documents using a document clustering algorithm and produces a cluster-wise summary based on a feature-profile-oriented sentence extraction strategy. The most similar documents are grouped into the same cluster. A feature profile is generated which mainly includes word weight, sentence position, sentence length, sentence centrality, proper nouns in the sentence, and numerical data in the sentence. Based on this feature profile, a sentence score is calculated for every sentence in the cluster of similar documents. According to the compression ratio, sentences are extracted from each cluster, ranked, and included in the summary. The extracted sentences are arranged in the chronological order of the input documents, and with its help the cluster-wise summary is generated. Experimental results show that the proposed clustering algorithm is efficient and that the feature profile extracts the most important sentences from multiple documents. The summary generated by the proposed method was compared with a manually created human summary, and the evaluation shows that the machine-generated summary coincides with human intuition for the selected dataset of documents.
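Several of the surveyed systems (MEAD, CPSL, the feature-profile approach of [12]) rank sentences by combining per-sentence features such as centroid overlap, position, and similarity with the first sentence. The sketch below illustrates only this general scoring pattern, not any published implementation: the weights, the linear position feature, and the crude word-overlap similarity are all assumptions made for the example.

```python
from collections import Counter

def sim(a, b):
    """Crude word-overlap similarity between two sentences
    (a stand-in for the real SimWithFirst feature)."""
    wa, wb = Counter(a.lower().split()), Counter(b.lower().split())
    return sum((wa & wb).values()) / max(1, len(a.split()))

def rank_sentences(doc, centroid_words, w=(1.0, 1.0, 0.5)):
    """Score sentences by centroid overlap, position, and similarity
    with the first sentence; return them best-first.
    The weights are illustrative, not tuned or published values."""
    wc, wp, wf = w
    n, first = len(doc), doc[0]
    scored = []
    for i, sent in enumerate(doc):
        centroid = sum(1 for t in sent.lower().split() if t in centroid_words)
        position = (n - i) / n  # earlier sentences score higher
        score = wc * centroid + wp * position + wf * sim(sent, first)
        scored.append((score, sent))
    return [s for _, s in sorted(scored, key=lambda x: -x[0])]
```

A real system would normalize each feature and tune (or learn) the weights; CPSL/LESM additionally combine such scoring with LEAD-style ordering.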
[Table: comparison of the surveyed summarization systems (technique, features, application domain, evaluation metrics, results); largely unrecoverable from the source. One legible row: automatic summarization of search-engine hit lists — Centroid, Position and First-sentence overlap features — e-commerce framework — time and reliability metrics — better speedup in reading time and better reliability.]
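Most of the systems surveyed above are evaluated with ROUGE scores. As a minimal illustration of what such a score measures (a simplified sketch, not the official ROUGE toolkit), ROUGE-1 recall is the clipped unigram overlap between a candidate and a reference summary, divided by the reference length:

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: clipped unigram overlap divided by the
    number of unigrams in the reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # per-word counts clipped by both sides
    return overlap / max(1, sum(ref.values()))
```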
an ongoing research process. Redundancy elimination in the generated summary is also an attractive area of research.

6. References

[1] Dragomir R. Radev, Hongyan Jing, Malgorzata Stys and Daniel Tam, "Centroid-based summarization of multiple documents", Information Processing and Management, 2004.
[2] D. R. Radev and Weiguo Fan, "Automatic summarization of search engine hit lists", University of Michigan Business School.
[3] Afnan Ullah Khan, Shahzad Khan and Waqar Mahmood, "MRST: A New Technique for Information Summarization", World Academy of Science, Engineering and Technology, 2005.
[4] Zhang Shu, Zhao Tiejun, Zheng Dequan and Zhao Hua, "Two-stage sentence selection approach for multi-document summarization", Journal of Electronics, Vol. 2, No. 4, July 2008.
[5] Furu Wei, Yanxiang He, Wenjie Li and Qin Lu, "A Query-Sensitive Graph-Based Sentence Ranking Algorithm for Query-Oriented Multi-Document Summarization", International Symposiums on Information Processing, 2008.
[6] Xiao-Peng Yang and Xiao-Rong Liu, "Personalized Multi-Document Summarization in Information Retrieval", Seventh International Conference on Machine Learning and Cybernetics, Kunming, 12-15 July 2008.
[7] Dingding Wang, Tao Li, Shenghuo Zhu and Chris Ding, "Multi-Document Summarization via Sentence-Level Semantic Analysis and Symmetric Matrix Factorization", SIGIR, Singapore, July 20-24, 2008.
[8] Ben Hachey, "Multi-Document Summarization Using Generic Relation Extraction", Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 420-429, 2009.
[9] Md. Mohsin Ali, Monotosh Kumar Ghosh and Abdullah-Al-Mamun, "Multi-document Text Summarization: SimWithFirst Based Features and Sentence Co-selection Based Evaluation", International Conference on Future Computer and Communication, 2009.
[10] Lei Huang, Yanxiang He, Furu Wei and Wenjie Li, "Modeling Document Summarization as Multi-objective Optimization", Third International Symposium on Intelligent Information Technology and Security Informatics, 2010.
[11] Shu Gong, Youli Qu and Shengfeng Tian, "Subtopic-based Multi-documents Summarization", Third International Joint Conference on Computational Science and Optimization, 2010.
[12] A. Kogilavani and P. Balasubramani, "Clustering and Feature Specific Sentence Extraction Based Summarization of Multiple Documents", International Journal of Computer Science & Information Technology (IJCSIT), Vol. 2, No. 4, August 2010.
[13] W. B. Frakes and C. J. Fox, "Strength and Similarity of Affix Removal Stemming Algorithms", ACM SIGIR Forum, 2003.
[14] D. Harman, "How Effective is Suffixing?", Journal of the American Society for Information Science, 42(1), 1991, 7-15.
[15] J. B. Lovins, "Development of a Stemming Algorithm", Mechanical Translation and Computational Linguistics, 11, 1968, 22-31.
[16] Chris D. Paice, "Another Stemmer", SIGIR Forum, 24(3), 1990, 56-61.
[17] M. F. Porter, "An Algorithm for Suffix Stripping", Program, 14, 1980, 130-137.
[18] B. Fung, K. Wang and M. Ester, "Hierarchical Document Clustering using Frequent Itemsets", SIAM International Conference on Data Mining, SDM'03, 2003, pp. 59-70.

J. Jayabharathy received her M.Tech in 1999 from the Department of Computer Science and Engineering, Pondicherry University, Puducherry. She has been working as an Assistant Professor in the Department of Computer Science and Engineering, Pondicherry Engineering College, Puducherry, and is currently working towards the Ph.D. degree in document clustering. Her areas of interest are Distributed Computing, Grid Computing, Data Mining and Document Clustering.

Dr. S. Kanmani received her B.E. and M.E. in Computer Science and Engineering from Bharathiyar University and her Ph.D. from Anna University, Chennai. She has been a faculty member of the Department of Computer Science and Engineering, Pondicherry Engineering College, since 1992. Presently she is working as a Professor in the Department of Information Technology, Pondicherry Engineering College. Her research interests are Software Engineering, Software Testing, Object-Oriented Systems, and Data Mining. She is a member of the Computer Society of India, ISTE, and the Institute of Engineers, India. She has published about 65 papers in various international conferences and journals.

Miss Buvana received her B.Tech (2005) in Computer Science and Engineering from Pondicherry University and is currently doing her M.Tech at Pondicherry Engineering College.
3 Department of Information Technology, Balochistan University of I.T., Engineering and Management Sciences, Quetta, Pakistan
with a different level of English language fluency in Pakistan.

The work is organized as follows. Section 2 gives an overview and background of Internet users in Pakistan. Section 3 presents some related studies. Section 4 describes the approach of our study. Sections 5 and 6 present our interpretations. Finally, the study is concluded, leaving some open issues.

2. Background and Motivation

Plain language is a writing approach whose aim is that readers understand information easily the first time they read it [14].

Pakistan is a multilingual country; it has two official languages, English and Urdu, of which Urdu is also the national language. Additionally, Pakistan has four major provincial languages, Punjabi, Pashto, Sindhi, and Balochi, as well as two major regional languages, Saraiki and Kashmiri [10].

Table 1: Pakistani languages

  Language   Percentage of speakers
  Punjabi    44.15
  Pashto     15.42
  Sindhi     14.10
  Siraiki    10.53
  Urdu        7.57
  Balochi     3.57
  Other       4.66

Internet access has been available in Pakistan since the 1990s. The country has been following an aggressive IT policy, aimed at enhancing Pakistan's drive for economic modernization and creating an exportable software industry, and there is no doubt that this has helped increase the popularity of the Internet. Table 2 shows the number of users within the country that access the Internet [11].

Table 2: Internet users in Pakistan

  Year   Internet users   Rank   Percent change   Date of information
  2003    1,200,000        47        --               2000
  2004    1,500,000        48       25.00 %           2002
  2005    1,500,000        49        0.00 %           2002
  2006   10,500,000        23      600.00 %           2005
  2007   10,500,000        24        0.00 %           2005
  2008   17,500,000        17       66.67 %           2007
  2009   17,500,000        17        0.00 %           2007
  2010   18,500,000        20        5.71 %           2008

English is also an official language, but it is not the most spoken language of Pakistan. Because English is so widely spoken worldwide, it has often been referred to as a world language [12]. That is why English is taught as a foreign language in Pakistan, yet the level of English fluency among the people of Pakistan remains low. The literacy rate of Pakistan is 56%; Sindh (58%) and Punjab (58%) are equally more literate compared to the NWFP (50%) and Balochistan (49%) provinces. The percentage of English speakers in the country is only 10.9% [13]. So when users try to read information on the web in English, they suffer from poor web readability. We try to address this problem by using the local language (Urdu) written in the English alphabet. Although Google translation of web pages from English to Urdu (the national language of Pakistan) is available, its main problem is that it translates sentences word by word, which does not make the resulting Urdu sentences understandable.

3. Related Work

The web becomes more complex with the fast growth of information distributed through web pages, especially those that use a fashion-driven graphical design in which the readability of web pages is not taken into consideration. Readability is an important criterion for measuring web accessibility, and non-native readers in particular encounter even more problems.

Readability is among the crucial presentation attributes that web summarization algorithms consider while generating a query-based web summary. Text on the web should be of a suitable level of difficulty for rapid retrieval, but appropriate techniques for locating such text still need to be worked out. Readability measurement is widely used in the educational field to assist instructors in preparing appropriate materials for students; however, traditional readability formulas have not attracted much attention from either the educational or the commercial field [1][2][5][6][7][8][9].

C.-H. Yu and R. C. Miller [1] propose a new transformation method, Jenga Format, to enhance web page readability. A user study on 30 Asian users with moderate English fluency was conducted, and the results show that the proposed transformation method improved reading comprehension without negatively affecting reading speed. The authors address the problems of distraction elimination and content transformation, and identify two important factors affecting reading: sentence separation and sentence spacing.

T. P. Lau and I. King [2] propose a bilingual readability assessment scheme for web sites in English and Chinese. The experimental results show that, for page readability, apart from just indicating difficulty, the estimated score acts as a good heuristic for identifying pages with low textual content, such as index and multimedia pages.
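Traditional readability formulas of the kind mentioned above estimate difficulty from surface statistics such as sentence length and syllable counts. As one well-known example (the Flesch Reading Ease formula; the cited systems do not necessarily use this particular one, and the vowel-group syllable counter here is a deliberately crude assumption):

```python
import re

def syllables(word):
    """Very rough syllable estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Flesch Reading Ease:
    206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    Higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    asl = len(words) / len(sentences)                     # avg sentence length
    asw = sum(syllables(w) for w in words) / len(words)   # avg syllables/word
    return 206.835 - 1.015 * asl - 84.6 * asw
```

Short sentences of short words score near the top of the scale, while long, polysyllabic academic prose scores far lower.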
In order to understand the effect of content transformation, and to analyze and compare the readability of English text against the local language written in the English alphabet (plain language), we developed two websites with four web pages each and conducted a formal user study to investigate the effectiveness of both kinds of content from the end users' point of view [Table 3].

At the end of each passage there are nine questions related to that passage. The result for each group is shown in Table 5 and Figure 1. The time taken in reading both passages, English and plain language (the local language, Urdu, written in the English alphabet), is shown in Figure 2.

Table 5: Correct answers attempted by users, with time taken

5. Findings

Based on the results of the study, we can say that:
1. The transformation of the text content enhances web readability for non-native users, i.e. those whose first language is not English.
2. The translated version of an English text gives better results, and the percentage of correct answers is higher than for the English passage text.
3. The ratio of correct answers for the translated version is very high in the worker, lower-literate user, and undergraduate student categories, because they have only basic or moderate knowledge of English. In the student category there is a slight difference, while for professionals who have good

References
[1] C.-H. Yu and R. C. Miller, "Enhancing web page readability for non-native readers", Proc. CHI 2010, Atlanta, GA, USA, April 10-15, pp. 2523-2531.
[2] T. P. Lau and I. King, "Bilingual Web page and site readability assessment", Proc. WWW, 2006, pp. 993-994.
[3] Y. Miyazaki and K. Norizuki, "Developing a computerized readability estimation program with a web-searching function to match text difficulty with individual learners' reading ability", in Proceedings of WorldCALL 2008, Fukuoka, Japan, CALICO, 2008, d111.
[4] G. R. Klare, "A second look at the validity of readability formulas", Journal of Reading Behavior, 1976, pp. 129-152.
[5] M. Gradisar, I. Humar and T. Turk, "Factors Affecting the Readability of Colored Text in Computer Displays", Proc. 28th International Conference on Information Technology Interfaces, 2006, pp. 245-250.
2 Dept. of CSE, National Institute of Technology, Warangal, A.P., India, 506004
code. The genetic code is a set of sequences which define what proteins to build within the organism. Since organisms must replicate and reproduce tissue for continued life, there must be some means of encoding the unique genetic code for the proteins used in making that tissue. The genetic code is information which is needed for biological growth and reproductive inheritance.

DNA is the basic blueprint of life and can be viewed as a long sequence over the four letters A, C, G, and T. DNA contains the genetic instructions of an organism; it is mainly composed of nucleotides of four types: Adenine (A), Cytosine (C), Guanine (G), and Thymine (T). The amount of DNA extracted from organisms is increasing exponentially, so pattern matching techniques play a vital role in various applications in computational biology, for data analysis related to proteins and genes in structural as well as functional analysis. The task focuses on finding a particular pattern in a given DNA sequence. Biologists often query new discoveries against a collection of sequence databases such as GenBank, EMBL, and DDBJ to find similar sequences. As the size of the data grows, it becomes more difficult for users to retrieve the necessary information from the sequences; hence more efficient and robust methods are needed for fast pattern matching, which is one of the most important areas studied in computer science. The string matching problem can be described as: given a specific string P, generally called the pattern, search a large sequence/text T to locate P in T. If P is in T, a match is found and the position of P in T is indicated; otherwise the pattern does not occur in the given text. Pattern matching techniques are generally divided into two categories:

- Single pattern matching
- Multiple pattern matching techniques

In the standard problem, we are required to find all occurrences of one pattern in the given input text; this is known as single pattern matching. If more than one pattern is matched against the given input text simultaneously, it is known as multiple pattern matching. Single pattern matching algorithms are widely used in network security environments, where the pattern is a string indicating a network intrusion, attack, virus, spam, or other dirty network information. Multiple pattern matching can search for multiple patterns in a text at the same time; it has high performance and good practicability, and is more useful than single pattern matching algorithms. To determine the function of specific genes, scientists have learned to read the sequence of nucleotides comprising a DNA sequence in a process called DNA sequencing. Comparison, pattern recognition, detection of similarity, and construction of phylogenetic trees in genome sequences are the most popular tasks. The process of sequence alignment allows the insertion, deletion, and replacement of the symbols representing nucleotide or amino acid sequences. From the biological point of view, pattern comparison is motivated by the fact that all living organisms are related by evolution. This implies that the genes of species that are closer to each other should show signs of similarity at the DNA level; moreover, those similarities also extend to gene function. Normally, when a new DNA or protein sequence is determined, it is compared to all known sequences in annotated databases such as GenBank, SwissProt, and EMBL.

Let P = {p1, p2, p3, ..., pm} be a pattern of m characters and T = {t1, t2, t3, ..., tn} a text of n characters, both strings of nucleotide sequence characters over a fixed alphabet Σ = {A, C, G, T}. Let T be a large text consisting of characters in Σ; in other words, T is an element of Σ*. The problem is to find all occurrences of pattern P in text T. This is an important operation widely used in data filtering to find selected patterns, in security applications, and in DNA searching. Many existing pattern matching algorithms have been reviewed and classified into two categories:

- Exact string matching algorithms
- Inexact/approximate string matching algorithms

An exact pattern matching algorithm determines whether the search is successful or unsuccessful. The problem can be stated as: given a pattern p of length m and a string/text T of length n (m ≤ n), find all occurrences of p in T. The matching needs to be exact, which means that the exact word or pattern is found. Some exact matching algorithms are the naive brute-force algorithm, the Boyer-Moore algorithm [3], and the KMP algorithm [7].

Inexact/approximate pattern matching is sometimes referred to as matching with k mismatches/differences. This problem can in general be stated as: given a pattern P of length m and a string/text T of length n (m ≤ n), find all occurrences of substrings X in T that are similar to P, allowing a limited number, say k, of differing characters in similar matches. The edit/transformation operations are insertion, deletion, and substitution. Inexact/approximate string matching algorithms are classified into dynamic programming, automata, bit-parallelism, and filtering approaches. Inexact sequence data arise in various fields and applications such as computational biology, signal processing, and text processing. Pattern matching algorithms have two main objectives:

- Reduce the number of character comparisons required in the worst and average case.
- Reduce the time requirement in the worst and average case.
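The two problem statements above can be made concrete with naive reference sketches (illustrative only; the surveyed algorithms such as Boyer-Moore [3] and KMP [7] solve the exact problem with far fewer comparisons, and the k-mismatch sketch below allows substitutions only, not insertions or deletions):

```python
def exact_matches(text, pattern):
    """Find all start positions where pattern occurs exactly in text."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]

def k_mismatch_matches(text, pattern, k):
    """Find all start positions where pattern matches a substring of text
    with at most k character mismatches (substitutions only)."""
    m = len(pattern)
    hits = []
    for i in range(len(text) - m + 1):
        mismatches = sum(1 for a, b in zip(text[i:i + m], pattern) if a != b)
        if mismatches <= k:
            hits.append(i)
    return hits
```

Both run in O(mn) time in the worst case, which is exactly the cost that the pre-processing phases of the more sophisticated algorithms aim to avoid.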
Reducing the time requirement in the worst and fields such as diagnostic or forensic research. Because
average case analysis. DNA is key to all living organisms, knowledge of the DNA
sequence may be useful in almost any biological subject
In many cases most of the algorithm operates in two stages. area. For example, in medicine it can be used to identify,
Depending upon the algorithm some of the algorithm uses diagnose and potentially develop treatments for genetic
pre-processing phase and some algorithm will search diseases. Similarly, genetic research into plant or animal
without it. Many Pattern matching algorithms are available pathogens may lead to treatments of various diseases
with their own merits and demerits based upon the pattern caused by these pathogens.
length and the technique they use. Some pattern matching
algorithm concentrates on pattern itself. Other algorithm When we know a particular sequence is the cause for a
compare the corresponding characters of the patterns and disease, the trace of the sequence in the DNA and the
text from the left to right and some other perform the number of occurrences of the sequence defines the intensity
character from the right to left. The performance of the of the disease. As the DNA is a large database we need to
algorithm can be measured based upon the specific order go for efficient algorithms to find out a particular sequence
they are compared. Pattern matching algorithms has two in the given DNA. We have to find the number of
different phases. repetitions and the start index and end index of the
sequence, which can be used for the diagnosis of the
Pre-processing phase or study of the pattern. disease and also the intensity of the disease by counting the
Processing phase or searching phase. number of pattern matching strings, occurred in a gene
database.
The pre-processing phase collects the full information and
is used to optimize the number of comparisons. Whereas Since children inherit their genes from their parents, they
searching phase finds the pattern by the information can also inherit any genetic defects. Children and siblings
collected in pre-processing. of a patient generally have a 50% chance of also being
affected with the same disease. Genetic testing can identify
Bioinformatics has found its applications in many areas. It those family members who carry the familial unusual
helps in providing practical tools to explore proteins and mutation and should undergo annual tumor screening from
DNA in a number of other ways. Bio-computing is useful in recognition techniques to detect similarity between sequences and hence to interrelate structures and functions. Another important application of bioinformatics is the direct prediction of protein three-dimensional structure from the linear amino acid sequence. It also simplifies the problem of understanding complex genomes by analyzing simple organisms and then applying the same principles to more complicated ones. This results in identifying potential drug targets by checking homologies of essential microbial proteins, so bioinformatics is useful in designing drugs.

Pattern matching in biology differs from its counterpart in computer science. DNA strings contain millions of symbols, and the pattern itself may not be exactly known, because it may involve insertion, deletion, or replacement of symbols. Regular expressions are useful for specifying a multitude of patterns and are ubiquitous in bioinformatics. However, what biologists really need is to be able to infer these regular expressions from typical sequences and to establish the likelihood of the patterns being detected in new sequences.

The sequence of DNA constitutes the heritable genetic information in nuclei, plasmids, mitochondria, and chloroplasts that forms the basis for the developmental programs of all living organisms. Determining the DNA sequence is therefore useful in basic research studying fundamental biological processes, as well as in applied fields.

Genetic testing is useful at an early age. It can also identify family members who do not carry the familial unusual mutation and do not need to undergo the increased tumor surveillance recommended for patients with unusual mutations. The unusual pattern in the strand is reflected in the split strand, and hence the unusual mutations increase in the cells. All familial cancer syndromes are caused by a defect in a gene that is important for preventing the development of certain tumors. Everybody carries two copies of this gene in each cell, and tumor development only occurs if both gene copies become defective in certain susceptible cells. Genetic testing can help to diagnose by detecting defects in the unusual mutated gene.

The rest of the paper is organized as follows. We briefly present the background and related work in Section 2. Section 3 deals with the proposed model, i.e., the 2-JUMP DNA search multiple pattern matching algorithm. Experimental results and discussion are presented in Section 4, and we make some concluding remarks in Section 5.

2. Background and Related Work

This section reviews work related to DNA sequences. An alphabet set Σ = {A, C, G, T} is the set of characters for the DNA sequences used in this algorithm.

The following notations are used in this paper:
Σ      DNA sequence characters = {A, C, G, T}.
ε      Denotes the empty string.
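Since the discussion above appeals to regular expressions for specifying inexact DNA patterns, here is a minimal, hedged illustration; the motif and sequence below are invented for demonstration and are not taken from the paper. A pattern with one ambiguous position can be written as a character class:

```python
import re

# Hypothetical degenerate motif: G, C, any one base, then A, T.
motif = re.compile(r"GC[ACGT]AT")

sequence = "TTGCAATCCGCTATGG"
matches = [(m.start(), m.group()) for m in motif.finditer(sequence)]
print(matches)  # -> [(2, 'GCAAT'), (9, 'GCTAT')]
```

Inferring such expressions automatically from typical sequences, rather than writing them by hand, is the harder problem the text refers to.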
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 323
|P|    Denotes the length of the string P.
S[n]   Denotes a text which is a string of length n.
P[m]   Denotes a pattern of length m.
CPC    Character-per-comparison ratio.

String matching mainly deals with the problem of finding all occurrences of a string in a given text. In most DNA applications it is necessary for the user and the developer to be able to locate the occurrences of a specific pattern in a sequence. In the Brute-force algorithm the first character of the pattern P is compared with the first character of the string T. If it matches, then pattern P and string T are matched character by character until a mismatch is found or the end of the pattern P is reached. If a mismatch is found, the pattern P is shifted one character to the right and the process continues. The complexity of this algorithm is O(mn). The Boyer-Moore algorithm [3] applies a larger shift increment for each mismatch detection. The main difference from the Naïve algorithm is that the matching of pattern P in string T is done from right to left, i.e., after aligning P and string T, the last character of P is matched first. If a mismatch is detected, say a character C of T is not in P, then P is shifted right so that C is aligned with the rightmost occurrence of C in P. The worst-case complexity of this algorithm is O(m+n) and the average-case complexity is O(n/m).

In IFBMPMA [12] the elements in the given patterns are matched one by one in the forward and backward directions until a mismatch occurs or a complete pattern matches. The KMP algorithm [7] is based on the finite state machine automaton. The pattern P is pre-processed to create a finite state machine M that accepts the transitions. The finite state machine is usually represented as a transition table. The complexity of the algorithm for both the average and the worst case is O(m+n).

In the IBKPMPM [13] algorithm we first choose the value of k (a fixed value) and divide both the string and the pattern into a number of substrings of length k; each substring is called a partition. If the k value is 3 we call it a 3-partition algorithm, and if it is 4 then it is a 4-partition algorithm. We compare the first characters of all the partitions; if all the characters match during the search then we go for the second character match, and the process continues till a mismatch occurs or the total pattern is matched with the sequence. If all the characters match then the pattern occurs in the sequence and the starting index of the pattern is printed; if any character mismatches then we stop searching and go to the next index stored in the index table in the row which corresponds to the first character of the pattern P.

In approximate pattern matching the oldest and most commonly used approach is dynamic programming. In 1996 Kurtz [8] proposed another way to reduce the space requirements of almost O(mn). The idea was to build only the states and transitions which are actually reached in the processing of the text. The automaton starts at just one state and transitions are built as they are needed; transitions that are not necessary are never built.

The Devaki-Paul algorithm [5] for multiple pattern matching requires a preprocessing of the given input text to prepare a table of the occurrences of the 256-member ASCII character set. This table is used to find the probability of having a match of the pattern in the given input text, which reduces the number of comparisons, improving the performance of the pattern matching algorithm. The probability of having a match of the pattern in the given text is mathematically proved.

In the MSMPMA [18] technique the algorithm scans the input file to find all occurrences of the pattern based upon the skip technique. Using this index as the starting point of matching, it compares the file contents from the defined point with the pattern contents, and finds the skip value depending upon the match numbers (ranging from 1 to m-1). Horspool [6] does not use the good-suffix function; instead it uses the bad-character shift with the rightmost character. The time complexity of the algorithm is O(mn).

Berry-Ravindran [2] calculates the shift value based on the bad-character shift for the two consecutive text characters immediately to the right of the window. This reduces the number of comparisons in the searching phase. The time complexity of the algorithm is O(nm). Sunday [4] designed the quick search algorithm, which scans the characters of the window in any order and computes its shift with the occurrence shift of the character of T immediately after the right end of the window. The FC-RJ [11] algorithm searches the whole text string for the first character of the pattern and maintains an occurrence list by storing the index of the corresponding character. The time and space complexity of preprocessing is O(n). FC-RJ uses an array equal to the size of the text string for maintaining the occurrence list.

Ukkonen [15] proposed an automaton method for finding approximate patterns in strings, using a DFA for solving the inexact matching problem, though the automata approach does not offer a time advantage over the Boyer-Moore algorithm [3] for exact pattern matching. The complexity of this algorithm in the worst and average case is O(m+n). In this automaton every row denotes a number of errors and every column represents matching a pattern prefix. The deterministic automata approach exhibits O(n) worst-case time complexity. The main difficulty with this approach is the construction of the DFA from the NFA, which takes exponential time and space. Wu and Manber [16] proposed an algorithm for fast text searching allowing errors. The first bit-parallel method is known as shift-or, which
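As a concrete, hedged illustration of the Brute-force scheme described above (the function below is our own sketch, not code from any of the cited papers), the character comparisons can be counted explicitly to exhibit the O(mn) behaviour:

```python
def naive_search(text, pattern):
    """Slide the pattern one position at a time, comparing left to right."""
    n, m = len(text), len(pattern)
    occurrences, comparisons = [], 0
    for s in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if text[s + j] != pattern[j]:
                break
        else:
            occurrences.append(s)  # full match at shift s
    return occurrences, comparisons

occ, comp = naive_search("AGAATGCAG", "GCA")
print(occ, comp)  # -> [5] 10
```

On highly repetitive texts the count approaches m comparisons per shift, which is exactly what the indexed methods surveyed here try to avoid.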
searches a pattern in a text by parallelizing the operation of a non-deterministic finite automaton. This automaton has m+1 states and can be simulated in its non-deterministic form in O(mn) time. The filtering approach was started in 1990. This approach is based upon the fact that it may be much easier to tell that a text position does not match; it is used to discard large areas of text that cannot contain a match. The advantage of this approach is the potential for algorithms that do not inspect all text characters.

Using the dynamic programming approach, especially in DNA sequencing, the Needleman-Wunsch [9] and Smith-Waterman [14] algorithms are more complex for finding an exact pattern match. With this method the worst-case complexity is O(mn). The major advantage of this method is flexibility in adapting to different edit distance functions. The Raita algorithm [10] utilizes the same approach as the Horspool algorithm [6] for obtaining the shift value after an attempt. Instead of comparing each character of the pattern with the sliding window from right to left, the order of comparison in the Raita algorithm [10] is carried out by first comparing the rightmost and leftmost characters of the pattern with the sliding window. If they both match, the remaining characters are compared from right to left. Intuitively, an initial resemblance can be established by comparing the last and the first characters of the pattern and the sliding window; it is therefore anticipated to further decrease unnecessary comparisons.

The Aho-Corasick algorithm [1], developed at Bell Labs in 1975 by Aho and Corasick, is an extension of the KMP algorithm [7]. The AC algorithm consists of constructing a finite state pattern matching machine from the keywords and then using the machine to process the text in a single pass. It can find occurrences of several patterns in O(n) time, where n is the length of the text, with pre-processing of the patterns in linear time.

Two-dimensional pattern matching methods are commonly used in computer graphics. Takaoka and Zhu proposed using a combination of the KMP [7] and RK methods in an algorithm developed for two-dimensional cases. Their second approach runs faster when the row length of the pattern increases and is significantly faster than previously proposed methods. Three-dimensional pattern matching is useful in solving protein structures, retinal scans, fingerprinting, music, OCR and continuous speech. Multi-dimensional matching algorithms are a natural progression of string matching algorithms toward multi-dimensional matching patterns including tree structures, graphs, pictures, and protein structures.

3. 2-JUMP DNA Search Multiple Pattern Matching Algorithm

In this method we use a combination of both of the techniques:
- Index Based Search
- ASCII Sum

The index based search is well established. Here we create an index table of the input data and our search skips primarily along the index row of the first character of the pattern. However, in our proposed work we go one step further: rather than using the primitive method of comparing a single character at a time, we compare the sum of two characters of both the input sequence data and the pattern. This reduces our comparisons by one-third (we count one comparison for each sum). After a complete match we go for order checking in the subgroups sequentially until there is a mismatch or the pattern completely matches.

3.1. Algorithm

Input[n]         : input character array of length n.
Patt[m]          : pattern character array of length m.
IndexTable[4][n] : index table of the input, of size 4*n (A, C, G, T).
Let i, j, startIndex, flag, compare, counter be integer variables;
i = j = startIndex = compare = counter = 0; flag = 1.

1. Create the index table.
2. Fetch startIndex as per the first letter of the pattern:
       startIndex = IndexTable[firstLet][i];
3. while (n - startIndex > m)
       while (j < m)
           if (m - j == 1)                    // odd number of characters in pattern
               if (input[startIndex+j] != pat[j])
                   compare++;
                   flag = 0;
                   break;
           inp2 = input[startIndex+j] + input[startIndex+j+1];
           pat2 = pat[j] + pat[j+1];
           compare++;
           if (inp2 != pat2)
               flag = 0;
               break;
           else
               compare++;                     // order check of the matched pair
               if (input[startIndex+j] != pat[j] || input[startIndex+j+1] != pat[j+1])
                   flag = 0;
                   break;
               j = j + 2;                     // advance to the next pair
       if (flag == 1)
           counter++;
       else
           flag = 1;
       j = 0;
       startIndex = IndexTable[firstLet][++i];

3.2. Index Based Search

This method has been invented and used to reduce the search time drastically. In this method we make an index table of the given input on the basis of the characters involved, which in our case are A, C, G, T. So we have a (4 x size-of-input) table. Now we concentrate only on the index row of the first character of our pattern and continue our comparison
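The index table of Section 3.2 can be sketched as follows; this is our own Python sketch, not the authors' implementation. It records the 1-based positions of each base and reproduces the rows of Table 2 for the example sequence used later in the paper:

```python
def build_index_table(seq):
    """One row per base A, C, G, T; values are 1-based positions."""
    table = {base: [] for base in "ACGT"}
    for pos, base in enumerate(seq, start=1):
        table[base].append(pos)
    return table

S = "AGAATGCAGCTACAAGGTTCCATTCTGTCTCGCACTA"
table = build_index_table(S)
print(table["A"])  # -> [1, 3, 4, 8, 12, 14, 15, 22, 34, 37]
```

The search then iterates only over the row of the pattern's first character instead of over every text position.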
technique from the first index onwards. Based upon the results of our comparisons, success or failure, we can directly jump to the next potential occurrence of the pattern by moving to the next index in the chosen row. We continue the above operations till we finish all indexes of that row. In this way we need not move serially through the input; rather, we concentrate only on the potential strings.

3.3. ASCII Sum (or 2-Jump)

Our unique comparison method adds further benefits to our Index Based Search. Here we use a unique property of the characters involved in our search patterns and input. As we are dealing with only genetic data, our domain is confined to the following four characters: A, C, G, T. We further reduce these characters to single digits by a mod formula.

Table 1. Subscript values of DNA sequence characters

S.No  DNA  ASCII Value  ASCII Value - 64  (ASCII Value - 64) % 5  Array Subscript
1     A    65           1                 1                       1
2     C    67           3                 3                       3
3     G    71           7                 2                       2
4     T    84           20                0                       0

Now we can use a unique property of the above integers: any sum of the above in combinations of two gives a unique number in return.

A+A ~ 1+1 = 2
A+T ~ 1+0 = 1
A+G ~ 1+2 = 3
A+C ~ 1+3 = 4

And so on for the other integers too. Now we can use this to reduce both our input size and our patterns to half the length they actually are, i.e., we combine two neighboring alphabets (or their reduced integers) to give single integers.

E.g. Sequence = ATTGCCATA
Equivalent integers: 1 0 0 2 3 3 1 0 1
Pattern = GCCA
Equivalent integers: 2 3 3 1

Here the first character of the pattern is G. From our sequence we find that the first index of character G is 4, so we start forming groups from the 4th index onwards.
2-Sum groups starting at the G of the sequence: (2+3), (3+1) = 5, 4.
2-Sum groups of the pattern: (2+3), (3+1) = 5, 4.

So, rather than comparing each character/integer separately, we can compare two of them in one go. If in one go we find that our pattern string matches a substring of the input, then we go further and compare the two characters. This is necessary because the two characters may exist in reverse order compared to that of the pattern.

E.g. input: AT
Pattern: TA

But such a comparison is required only if the pattern matches. Thus, overall we find the following result: say the comparisons over the pattern lengths in general are n. By our method we reduce them to half, i.e., n/2. Further adding the single comparisons if our pattern matched: n/2 + p, where p is the length of the pattern, which is generally quite small. Thus, taking p -> 0, we get that the total number of comparisons is n/2. The conversion of the input can be done on the fly or during creation of the index table.

3.4. Trivial Cases in Comparisons

Case i: If S = ε, i.e., |S| = 0, and P = ε, i.e., |P| = 0, then the number of occurrences of P in S is 0.
Case ii: If S = ε, i.e., |S| = 0, and |P| ≠ 0, then the number of occurrences of P in S is 0.
Case iii: If S ≠ ε, i.e., |S| ≠ 0, and |P| = 0, then the number of occurrences of P in S is 0.
Case iv: If S ≠ ε, i.e., |S| ≠ 0, P ≠ ε, i.e., |P| ≠ 0, and |S| < |P|, then the number of occurrences of P in S is 0.

3.5. To understand the algorithm, assume a string S = AGAATGCAGCTACAAGGTTCCATTCTGTCTCGCACTA of 37 characters and a pattern P = ATGCAG. The string can be viewed as follows in an indexing table.

Table 2. Index values of the A, C, G and T sequence characters

T (0)  5 11 18 19 23 24 26 28 30 36
A (1)  1 3 4 8 12 14 15 22 34 37
G (2)  2 6 9 16 17 27 32
C (3)  7 10 13 20 21 25 29 31 33 35

As A is the first character of the pattern, the target indexes are 1, 3, 4, 8, 12, 14, 15, 22, 34 and 37. Here S2 and P2 refer to combinations of two characters of the input string and the pattern respectively; S and P refer to the whole input and pattern.

1. First we begin at index 1, because an A starts at index 1. We then form 2-groups of both input and pattern, i.e.,
S2 = A+G
P2 = A+T
Clearly S2 != P2, therefore S != P. So we skip and go to the next index.

2. At index 3 we get another probable match. We form 2-groups of both input and pattern, i.e.,
S2 = A+A
P2 = A+T
Again we find S2 != P2, so we can move directly to the next index.
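The reduction and pairing steps above can be written out numerically; the sketch below is ours, and the helper names encode and pair_sums are our own, hypothetical choices. It applies the Table 1 formula to the ATTGCCATA / GCCA example:

```python
def encode(seq):
    """Table 1 mapping: (ASCII - 64) % 5 gives A->1, C->3, G->2, T->0."""
    return [(ord(c) - 64) % 5 for c in seq]

def pair_sums(values):
    """Sum consecutive, non-overlapping pairs: the '2-sum' groups."""
    return [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]

seq = encode("ATTGCCATA")     # -> [1, 0, 0, 2, 3, 3, 1, 0, 1]
pat = encode("GCCA")          # -> [2, 3, 3, 1]
window = seq[3:3 + len(pat)]  # alignment at the first G (index 4, 1-based)
print(pair_sums(window), pair_sums(pat))  # -> [5, 4] [5, 4]
```

Equal pair sums still require the final character-order check, since, for example, AT and TA both sum to 1.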
3. Next we move to index 4. Here,
S2 = A+T
P2 = A+T
Since S2 = P2, we move on to the next subgroup:
S2 = G+C
P2 = G+C
As S2 = P2 we proceed further:
S2 = A+G
P2 = A+G
As all subgroups have matched, we go on to check the order within our subgroups. In the case of the first subgroup, we find the characters in the same order as the pattern, so we go to the next subgroup. Here also the characters are in the same order as in the pattern, and the same follows up to the last subgroup. So we do three more comparisons, and overall in 6 comparisons we get our pattern matched. Thus S = P. We now proceed to the next index.

4. Next we move to index 8. Here,
S2 = A+G
P2 = A+T
Clearly S2 != P2. Thus we conclude S != P and move to the next index.

5. At index 12, we find
S2 = A+C
P2 = A+T
Here too we find S2 != P2, giving us S != P. We now check the next index.

6. At index 14,
S2 = A+A
P2 = A+T
So S2 != P2. Without further checking we skip to the next index.

Proof: Let N be the input string, say ATTTGACCTTGAAA...

We convert the string to an equivalent numerical sequence using the formula

N[i] = (N[i] - 64) % 5, i = 1 ... length of the input.

Now we apply the same to the pattern P:

P[i] = (P[i] - 64) % 5, i = 1 ... length of the pattern.

First we prepare P':

P'[j] = P[i] + P[i+1], with j++ and i += 2,

where P' is another array of length half that of P.

Now we process N:

2Sum = N[i] + N[i+1], where i < length of P

Compare(P'[j], 2Sum)

where the Compare function compares the two quantities and breaks the whole operation if it finds a mismatch.

Thus we see that effectively the maximum number of comparisons required is

Max(length of(P), (length of(N))/2)

in the case of even comparisons, and
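Putting the pieces together, the whole procedure can be sketched as below. This is our own hedged reconstruction, not the authors' code: jump along the index row of the pattern's first character, compare pair sums, then apply the final order check. It recovers the worked example's single match of P = ATGCAG at index 4:

```python
def encode(seq):
    """Table 1 mapping: (ASCII - 64) % 5."""
    return [(ord(c) - 64) % 5 for c in seq]

def two_jump_search(text, pattern):
    enc_t, enc_p = encode(text), encode(pattern)
    m = len(pattern)
    pat_sums = [enc_p[k] + enc_p[k + 1] for k in range(0, m - 1, 2)]
    # Index row: 1-based positions of the pattern's first character.
    row = [i + 1 for i, c in enumerate(text) if c == pattern[0]]
    hits = []
    for start in row:
        i = start - 1
        if i + m > len(text):
            continue  # pattern would run past the end of the text
        win = enc_t[i:i + m]
        win_sums = [win[k] + win[k + 1] for k in range(0, m - 1, 2)]
        # Order check resolves pairs that sum equally but are reversed.
        if win_sums == pat_sums and text[i:i + m] == pattern:
            hits.append(start)
    return hits

S = "AGAATGCAGCTACAAGGTTCCATTCTGTCTCGCACTA"
print(two_jump_search(S, "ATGCAG"))  # -> [4]
```

Only the ten alignments in the A row are examined, rather than all 37 text positions.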
4.1. The following DNA sequence dataset has been taken for the testing of the 2-jump algorithm: a DNA biological sequence S of size n = 1024 and a pattern P. Let S be the following DNA sequence.

AGAACGCAGAGACAAGGTTCTCATTGTGTCTCGC
AATAGTGTTACCAACTCGGGTGCCTATTGGCCTCC
AAAAAAGGCTGTTCAACGCTCCAAGCTCGTGACCT
CGTCACTACGACGGCGAGTAAGAACGCCGAGAAG
GTAAGGGAACTAATGACGCGTGGTGAATCCTATG
GGTTAGGATCGTGTCTACCCCAAATTCTTAATAAA
AAACCTAGGACCCCCTTCGACCTAGACTATCGTAT
TATGGACAAGCTTTAACTGTCGTACTGTGGAGGCT
TCAAAACGGAGGGACCAAAAAATTTGCTTCTAGC
GTCAATGAAAAGAAGTCGGGTGTATGCCCCAATTC
CTTGCTGCCCGGACGGCCAGTTCATAATGGGACAC
AACGAATCGCGGCCGGATATCACATCTGCTCCTGT
GATGGAATTGCTGAATGCGCAGGTGTGCTTATGTA
CAATCCACGCGGTACTACATCTTGTCTCTTATGTA
GGGTTCAGTTCTTCGCGCAATCATAGCGGTACGAA
TACTGCGGCTCCATTCGTTTTGCCGTGTTGATCGG
GAATGCACCTCGGGGACTGTTCGATACGACCTGGG
ATTTGGCTATACTCCATTCCTCGCGAGTTTTCGATT
GCTCATTAGGCTTTGCGGTAAGTAAGTTCTGGCCA
CCCACTTCGAGAAGTGAATGGCTGGCTCCTGAGCG
CGTCCTCCGTACAATGAAGACCGGTCTCGCGCTAA
ATTTCCCCCAGCTTGTACAATAGTCCAGTTTATTAT
CAAAGATGCGACAAATAAATTGATCAGCATAATC
GAAGATTGCGGAGCATAAGTTTGGAAAACTGGGA
GGTTGCCAGAAAACTCCGCGCCTACTTTCGTCAGG
ATGATTAAGAGTATCGAGGCCCCGCCGTCAATACC
GATGTTCTTCGAGCGAATAAGTACTGCTATTTTGC
AGACCCTTTGCCAGGCCTTGTCTAAAGGTATGTTA
CTTAATATTGACAATACATGCGTATGGCCTTTTCC
GGTTAACTCCCTG.

The index table (indexTab[4][1024]) for sequence S is very large in the number of DNA sequence characters. For different pattern sizes, chosen randomly from the above DNA sequence, the number of occurrences and the number of comparisons are shown in Table 3. To check whether a given pattern is present in the sequence or not, we need an efficient algorithm with low comparison time and complexity. With the current technique different patterns are analyzed, the graph is plotted using these results, and the results are analyzed accordingly. From the experimental results below, it can be seen that the 2-JUMP algorithm gives good performance compared to some of the popular methods shown in Table 4. We have taken five fields in Table 3: the pattern text, the number of characters in the pattern, the number of occurrences of the pattern, the number of comparisons of the proposed method, and the comparisons per character. The number of comparisons per character (the CPC ratio) is a measurement factor; this factor affects the complexity time, and when it decreases, the complexity also decreases.

Table 3. Experimental results analysis of the 2-jump algorithm

S.No  Pattern               Pattern Length  No. of Occur  2-jump  CPC
1     A                     1               259           259     0.2
2     AG                    2               53            312     0.3
3     CAT                   3               11            335     0.3
4     AACG                  4               5             434     0.4
5     AAGAA                 5               2             441     0.4
6     AAAAAA                6               3             456     0.4
7     AGAACGC               7               2             379     0.3
8     AAAAAAGG              8               1             460     0.4
9     GCTCATTAG             9               1             390     0.3
10    CCTTTTCCGG            10              1             377     0.3
11    TTTTGCCGTGT           11              1             431     0.4
12    TTCTTAATAAAA          12              1             435     0.4
13    GGGACCAAAAAAT         13              1             392     0.3
14    TTTTGCCGTGTTGA        14              1             432     0.4
15    CCTCCAAAAAAGGCT       15              1             382     0.3
16    GGCTGTTCAACGCTCC      16              1             392     0.3
17    TTTTCGATTGCTCATTA     17              1             432     0.4
18    GGGATTTGGCTATACTCC    18              1             395     0.3
19    GGCCTTGTCTAAAGGTATG   19              1             393     0.3
20    CCTGAGCGCGTCCTCCGTCA  20              1             382     0.3

From the Table 4 results analysis, the following has been observed in terms of the relative performance of our algorithm against some existing algorithms. To measure the performance of the proposed algorithm against the existing popular algorithms we have used two parameters, the CPC (character-per-comparison ratio) and the number of comparisons, which are shown in Table 4. The proposed algorithm gives good performance compared with algorithms like MSMPMA, Brute-force, Tri-Match, IBKPMPM and Naïve string matching. In Table 4 we have taken different pattern sizes from 1 to 16 and analyzed them accordingly. In all the different cases the proposed technique gives better performance than the existing algorithms.

Table 4. Comparisons of different algorithms with 2-jump (No. of Com, CPC)

Pattern            2-JUMP      IBKPMPM     MSMPMA      Brute-Force  Tri-Match   Naïve String
A                  259   0.2   259   0.2   1024  1.0   1024  1.0    1025  1.0   1024  1.0
AG                 312   0.3   518   0.5   1230  1.2   1282  1.2    1284  1.2   1281  1.2
CAT                335   0.3   542   0.5   1298  1.2   1318  1.2    1321  1.2   1310  1.2
AACG               434   0.4   614   0.6   1359  1.3   1376  1.3    1380  1.3   1376  1.3
AAGAA              441   0.4   607   0.5   1375  1.3   1388  1.3    1393  1.3   1387  1.3
AAAAAAGG           460   0.4   623   0.6   1394  1.3   1409  1.3    1417  1.3   1407  1.3
TTCTTAATAAAA       435   0.4   634   0.6   1390  1.3   1390  1.3    1402  1.3   1399  1.3
GGCTGTTCAACGCTCC   392   0.3   580   0.5   1349  1.3   1349  1.3    1365  1.3   1349  1.3
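The CPC column of Table 3 appears to be the comparison count divided by the text length n = 1024, truncated to one decimal place; this reading is our own inference, not stated explicitly by the paper:

```python
import math

def cpc(comparisons, n=1024):
    """Character-per-comparison ratio, truncated to one decimal."""
    return math.floor(comparisons / n * 10) / 10

# First rows of Table 3: (pattern, 2-jump comparisons)
for pattern, comp in [("A", 259), ("AG", 312), ("CAT", 335), ("AACG", 434)]:
    print(pattern, cpc(comp))  # 0.2, 0.3, 0.3, 0.4 as in the table
```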
Fig. 1 shows the comparison of the different algorithms with 2-JUMP. The proposed algorithm outperforms the popular algorithms it is compared with. The current technique gives good performance in reducing the number of comparisons compared with the other algorithms. The dotted line shows the proposed 2-jump model, whereas MSMPMA, Brute-Force, Tri-Match, IBKPMPM and Naïve string searching are shown by solid lines. In the graph, the X-axis shows the pattern size whereas the Y-axis shows the number of comparisons. In the experimental analysis, all the other algorithms give more than 1000 comparisons, whereas the proposed technique gives fewer than 500 comparisons due to the indexed method.

[Figure: line chart of the number of comparisons (Y-axis, 0 to 1600) versus pattern size (X-axis, 1 to 16) for 2-JUMP, MSMPMA, TRI-MATCH, BRUTEFORCE, NAÏVE STRING and IBKPMPM.]

Fig. 1. Comparison of different algorithms with 2-JUMP.

The following are observed from the experimental results:
- Reduction in the number of comparisons.
- The ratio of comparisons per character is gradually reduced and is less than 1.
- Suitable for an unlimited size of the input file.
- Once the indexes are created for an input sequence, we need not create them again.
- For each pattern we start our algorithm from the matching character of the pattern, which avoids unnecessary comparisons of other characters.
- It gives good performance for DNA-related sequence applications.

Applications in Bioinformatics

Different biological problems of bioinformatics involve the study of genes, proteins, nucleic acid structure prediction, and molecular design:
- Alignment and comparison of DNA, RNA, and protein sequences.
- Gene mapping on chromosomes.
- Gene finding and promoter identification from DNA sequences.
- Interpretation of gene expression and micro-array data.
- Gene regulatory network identification.
- Construction of phylogenetic trees for studying evolutionary relationships.
- DNA and RNA structure prediction.
- Protein structure prediction and classification.
- Molecular design.
- Organizing data and allowing researchers to access existing information and submit new entries.
- Developing tools and resources for the analysis and management of biological data.
- Using sequence data to analyze and interpret results in a biologically meaningful manner.
- Helping researchers in the pharmaceutical industry in the drug design process.
- Finding similarities among strings, such as proteins of different organisms.
- Finding similarities among parts of spatial structures.
- Constructing phylogenetic trees describing the evolution of organisms.
- Classifying new data according to previously clustered sets of annotated data.

5. Conclusion

In this paper we have proposed a new algorithm for DNA pattern matching called 2-jump index based search. The proposed technique improves the comparison time and the CPC ratio when compared with some of the popular techniques. The proposed algorithm is implemented, analyzed, tested and compared. The experimental results show a large performance improvement, due to which the overall performance increases.

References

[1] Aho, A. V., and M. J. Corasick, "Efficient string matching: an aid to bibliographic search," Communications of the ACM 18 (June 1975), pp. 333-340.
[2] Berry, T., and S. Ravindran, 1999. "A fast string matching algorithm and experimental results," in: Proceedings of the Prague Stringology Club Workshop '99, Liverpool John Moores University, pp. 16-28.
[3] Boyer, R. S., and J. S. Moore, "A fast string searching algorithm," Communications of the ACM 20, 762-772, 1977.
[4] D. M. Sunday, "A very fast substring search algorithm," Comm. ACM 33 (8) (1990) 132-142.
[5] Devaki-Paul, "Novel Devaki-Paul Algorithm for Multiple Pattern Matching," International Journal of Computer Applications (0975-8887), Vol. 13, No. 3, January 2011.
[6] Horspool, R. N., 1980. "Practical fast searching in strings," Software - Practice and Experience, 10:501-506.
[7] Knuth, D., Morris, J., and Pratt, V., "Fast pattern matching in strings," SIAM Journal on Computing, Vol. 6(1), 323-350, 1977.
[8] Kurtz, S., "Approximate string searching under weighted edit distance," in Proceedings of the 3rd South American Workshop on String Processing, Carleton University Press, pp. 156-170, 1996.
[9] Needleman, S. B., and Wunsch, C. D. (1970). "A general method applicable to the search for similarities in the amino acid sequence of two proteins," J. Mol. Biol. 48, 443-453.
[10] Raita, T., "Tuning the Boyer-Moore-Horspool string-searching algorithm," Software - Practice and Experience, 1992, 22(10), 879-884.
[11] Rami H. Mansi, and Jehad Q. Odeh, "On Improving the Naive String Matching Algorithm," Asian Journal of Information Technology, Vol. 8, No. 1, ISSN 1682-3915, 2009, pp. 14-23.
[12] Raju Bhukya, and DVLN Somayajulu, "An Index Based Forward Backward Multiple Pattern Matching Algorithm," World Academy of Science and Technology, June 2010, pp. 347-355.
[13] Raju Bhukya, and DVLN Somayajulu, "An Index Based K-Partition Multiple Pattern Matching Algorithm," Proc. of International Conference on Advances in Computer Science 2010, pp. 83-87.
[14] Smith, T. F., and Waterman, M. (1981). "Identification of common molecular subsequences," J. Mol. Biol. 147, 195-197.
[15] Ukkonen, E., "Finding approximate patterns in strings," J. Algor. 6, 1985, 132-137.
[16] Wu, S., and U. Manber, "Agrep - A Fast Approximate Pattern-Matching Tool," Usenix Winter 1992 Technical Conference, San Francisco (January 1992), pp. 153-162.
[17] Wu, S., Manber, U., and Myers, E., 1996. "A sub-quadratic algorithm for approximate limited expression matching," Algorithmica 15, 1, 50-67.
[18] Ziad A. A. Alqadi, Musbah Aqel, and Ibrahiem M. M. El Emary, "Multiple Skip Multiple Pattern Matching Algorithms," IAENG International Journal of Computer Science, Vol. 34(2), 2007.
2 Department of Computer Applications, Integral University, Lucknow, Uttar Pradesh, 226026, India
3 Department of Computer Science and Engineering, Integral University, Lucknow, Uttar Pradesh, 226026, India
The root of the tree is the special node having no data, but it has pointers to its children, and its parent field is set to NULL. The auxiliary function to create the root node is called with nC as Max and i as 0.
values, and we can imitate the user's actions and choices if we follow a particular path in the combination tree. It gives a complete listing of the actions that users can perform. The testers can follow a particular path, decide what the software should be doing in a given situation, and decide whether the software module should pass or fail on that particular path.

Now we formalize the above method into an algorithm and give its supporting data structures. First of all we need a list to hold the generated nodes, whose number equals Σ nC(n-i), which is 2^n - 1. For this we define the node structure PPNode and the auxiliary functions addParentPointer, to add nodes to the list, and removeNodeFromHead(), to delete the added nodes from the beginning in FIFO order. PPHead and PPTail are pointers to handle the list. These are as follows.

struct PPNode {
    struct node * N;
    struct PPNode * Next;
};
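Before the C listings, the breadth-first construction can be sketched at a higher level; this Python sketch is ours, using a deque in place of the hand-rolled PPNode list. Each node's children extend its combination with elements that appear later in the array, which yields all 2^n - 1 non-empty, non-repeating combinations:

```python
from collections import deque

def combinations_bfs(elements):
    """Generate all non-empty combinations level by level, mirroring
    the combination tree: children extend a node with later elements."""
    results = []
    queue = deque([((), 0)])  # (combination so far, next array index)
    while queue:
        combo, start = queue.popleft()
        for i in range(start, len(elements)):
            child = combo + (elements[i],)
            results.append(child)
            queue.append((child, i + 1))
    return results

combos = combinations_bfs(["a", "b", "c"])
print(len(combos))  # -> 7, i.e. 2**3 - 1
```

Restricting children to later elements is what guarantees lexicographic order and prevents repeated combinations.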
struct PPNode * PPHead = NULL;
struct PPNode * PPTail = NULL;

void addParentPointer(node * n)
{
    PPNode * temp = (PPNode*) malloc(sizeof(PPNode));
    temp->N = n;
    temp->Next = NULL;
    if (PPHead == NULL && PPTail == NULL)
    {
        PPHead = temp;
        PPTail = PPHead;
        Root = n;
    }
    else
    {
        PPTail->Next = temp;
        PPTail = temp;
    }
}

void removeNodeFromHead()
{
    PPNode * temp = PPHead;   /* no allocation needed; we only free the old head */
    if (PPHead != NULL && PPHead->Next != NULL)
    {
        PPHead = PPHead->Next;
    }
    else
    {
        PPHead = NULL;
    }
    free(temp);
}

Another auxiliary function is used to set the index value such that the element in the Array is greater than its parent in terms of lexicographical order; it is given as follows.

int setIndex(PPNode * T)
{
    int j = 0;
    char x = T->N->value;
    for (int i = 0; i < Max; ++i)
    {
        if (x == Array[i])
        {
            j = i;
            i = Max;   /* stop the scan once the element is found */
        }
    }
    return (j + 1);
}

Last, we need an array to store the distinct elements; Max is the number of elements in the array. To start creating the tree we set the head and tail of the linked list to NULL and the root of the tree to NULL. Finally, the createCombinationTree function creates the combination tree and is given as follows.

void cCTree(int _Max)
{
1.  addParentPointer(makeNode(NULL, _Max, 0));
2.  i = 0;
3.  while (PPHead != NULL)
4.  {   j = 0;
5.      while (i < _Max)
6.      {   node * n = makeNode(Array[i], _Max-i-1, 1);
7.          n->Parent = PPHead->N;
8.          PPHead->N->Child[j] = n;
9.          addParentPointer(n);
10.         i = i + 1;
11.         j = j + 1;
        }
12.     j = 0;
13.     removeNodeFromHead();
14.     i = setIndex(PPHead);
    }
}

3. Proof and Analysis

For a set S containing n elements, a combination tree can be generated, where the elements are distinct and repetition in the generated combinations is not allowed. In order to prove that the combination tree algorithm generates all the combinations successfully, and that the loops terminate and the algorithm halts, we use the loop invariance method [8], which is given as follows.

3.1. Proof

Initialization: Prior to the beginning of the loop, the linked list ParentPointerNode is empty.

Maintenance: To see that each iteration maintains the loop invariant, we start with the root, that is, the first node that is added; i is initialized to zero and the immediate children of the root get inserted into the tree as well as into the list. Once the insertion is complete, we remove the first node (the root) from the list, and this time i gets the new value 1; the list is again not empty but contains the new roots at the next level. Once the value of i exceeds the maximum number of elements, new nodes are no longer added to the list; instead they are removed from the head.

Termination: At termination we see that nodes are removed one by one, as i always gets a value higher than the
maximum; therefore nodes are removed one by one and finally the list becomes empty.

3.2. Complexity Analysis

To establish an upper bound for the proposed algorithm, i.e. to represent the worst-case run time, we have to approximate at various places in order to simplify the analysis. We start by measuring the upper bounds of the various auxiliary procedures used, and then use them in the proposed algorithm for a final rough estimate. The functions makeNode(data), makeRootNode() and setIndex(struct PPNode * T) have a complexity of O(n). The complexity of setIndex(struct PPNode * T) is an approximate value, since its cost decreases as the nodes take their places in the tree: the first time it is called it takes n units of time, the second time n - 1 units, and eventually it takes O(1) time. The functions void addParentPointer(struct node *) and void removeNodeFromHead() take O(1) time. For the algorithm createCombinationTree, step 1 takes O(n) time; step 2 takes O(1); step 3 has a loop which executes nC1 + nC2 + ... + nCn-1 + nCn = 2^n - 1 times, i.e. O(2^n) time; step 4 takes O(2^n) time; step 5 is a loop taking a maximum time of O(n*2^n); step 6 takes O(n); steps 7-12 take O(1) individually and lie within two loops, therefore taking a total time of O(n*2^n); step 13 takes O(1); and finally step 14 takes a total time of O(n*2^n). Summing up the total time of each step we get

= O(n) + O(1) + O(1) + O(2^n) + O(2^n) + O(n*2^n) + O(1) + O(1) + O(1) + O(1) + O(1) + O(1) + O(1) + O(1) + O(n*2^n)
= O(n) + 10 O(1) + 2 O(2^n) + 2 O(n*2^n)
Ignoring constants we have
= O(n*2^n) + O(2^n) + O(n)
= O(2^n (n + 1)) + O(n)
Ignoring lower-order terms we have
= O(n*2^n)

So the approximate worst-case complexity of creating the combination tree is O(n*2^n).

elements in the set S having n elements. We have generated non-repeating combinations with an overall complexity of O(n*2^n). As future work, we should try to establish a more accurate upper bound for the algorithm and also reduce the fixed space taken by each node: the number of children of a node in the combination tree varies, being maximal for the roots and decreasing as we descend the tree, so the memory requirement drops and the number of sub-paths also decreases.

References
[1] K. Kaschner, N. Lohmann, "Automatic Test Case Generation for Interacting Services", in Proc. of ICSOC 2008 Workshops, Lecture Notes in Computer Science, Vol. 5472, 2009.
[2] T. Hoare, "Towards the Verifying Compiler", in The United Nations University / International Institute for Software Technology 10th Anniversary Colloquium: Formal Methods at the Crossroads, from Panacea to Foundational Support, Lisbon, March 18-21, 2002, Springer Verlag, 2002.
[3] R. V. Binder, Testing Object-Oriented Systems: Models, Patterns, and Tools, Addison Wesley Longman, Inc., 2000.
[4] S. S. Riaz Ahamed, "Studying the feasibility and importance of software testing: An Analysis", International Journal of Engineering Science and Technology, Vol. 1(3), 2009, pp. 119-128.
[5] G. J. Myers, The Art of Software Testing, Second Edition, John Wiley & Sons, Inc.
[6] B. Beizer, Software Testing Techniques, 2nd edition, Van Nostrand Reinhold, 1990.
[7] J. Nesetril, "Aspects of Structural Combinatorics (Graph Homomorphisms and Their Use)", Taiwanese Journal of Mathematics, Vol. 3, No. 4, pp. 381-423, December 1999.
[8] T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein, Introduction to Algorithms, McGraw-Hill, 2001.
2 Department of Computer Applications, Integral University, Lucknow, Uttar Pradesh, 226026, India
3 Department of Computer Science and Engineering, Integral University, Lucknow, Uttar Pradesh, 226026, India
1. Introduction

Software testing is one of the most important activities in the SDLC [4]. It authenticates whether the software being developed solves the intended purpose or not [2]. Software systems continuously grow in scale and functionality [1]. Software testing confirms that the software is being developed as per the requirements [5]. At present it is mostly done manually and the test cases are written by the tester; it is an ad-hoc activity [3][6]. This is the most error-prone area, as an important path or case may be missed out by the tester [3]. The testers develop test cases on the basis of the combinations of values of input parameters taken one at a time; these test cases are

2. Proposed work

For the sake of understanding we take one example of a requirement and demonstrate how test cases are to be generated from software requirements using combination trees. As we know, many of the software systems being developed are GUI based. We pick one common requirement which is part of in fact every GUI-based software system: the user should be able to log in to the system. From here onwards we formalize our approach, which is as follows.
2.1 Identification of classes of input

As we see, there are six controls on the Login Form, namely two textboxes, two labels and two buttons. This login form is shown in the figure below (figure 1).

Figure 1. Sample login form

Let us establish which control receives which type of input from the user: the UserID and Password textboxes receive the user ID and password respectively, while the labels have fixed captions for the same. The buttons Submit and Cancel receive click events. On the basis of the classes of input controls used in the form we can separate the distinct classes; in this case we have textboxes and buttons.

The text input to the textbox control can be any value from the superset
AN = {alphanumeric characters like a-z, A-Z}
SC = {special characters like '$', '#', '!', '~', '*', ...}
NC = {numeric characters like 0-9}
Text = {AN, SC, NC}

Any input can be classified into a valid and an invalid class, and in the case of text it is constrained by length, possibly c1 <= k <= c2, where c1 and c2 are finite and c1 <= c2. Now we define the input as valid or invalid and show the desired length. Let us give each cell a number so that it can be differentiated from the others and handling becomes easy; from now onwards we will use these numbers, and to understand what they indicate we refer to the following tables.

Table 1. Classification of inputs of textboxes
SN | Input | Length | Valid | Invalid
1 | TextUID | > 6 (1) | alphanumeric characters {a-z, A-Z} (2) | special characters like {'$', '#', '!', '~', '*', ...}, numeric characters like {0-9}, i.e. Text - AN (3)
2 | TextP | > 6 (4) | Text (5) | -

Table 2. Classification of inputs of buttons
SN | Object/Control | Event | Embedded procedure/function | Action
1 | Submit Button | ClickSB (6) | Calls Match, which matches user name & password (7) | If match successful: go to Home Page (8); if match unsuccessful: display message (9)
2 | Cancel Button | ClickCB (10) | Calls Clear All Textboxes (11) | All textboxes are cleared (12)

The condition or statement represented by any number can be complemented. For example, (1) in table 1 represents that the textbox accepting the user id should allow a user id greater than length six, so the notation (1') means that the user id is less than length six. We see that the input accepted by this form under the above requirement should have (1)(2), and another statement can be generated by taking the complement of (1)(2), which is (1')(2), meaning the input is any combination from the set AN but the length is less than six. The '.' implies that both statements are to be imposed simultaneously. Now we individually take one row from the table and put it into arrays. For table 1, row 1 the array elements are 1.2 & 1.3 and the complements are 1'.2 & 1'.3. For table 1, row 2 the array element is 4.5 and its complement is 4'.5. Similarly, for table 2, row 1 the array elements are 6.7, 8, 9, and for table 2, row 2 the array elements are 10.11 & 10.12. For each array we generate a combination tree with the following algorithm, creating an orchid with one tree representing each array. We will need the following data structure:

struct node { char [ ] value;
  struct node *Parent;
  struct node *Child[Max];
}

Roots is an array of nodes used to store the different roots of the trees and is defined as follows:

struct Roots {
  struct node * N;
  struct node * next;
} Roots[MaxNumberOfArrays];
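Table 1's validity classes for the TextUID field, conditions (1) and (2), can be checked mechanically. The helper below is a hypothetical sketch, not the paper's code; the function names and the letters-only reading of the set AN are assumptions of this illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Condition (1) of Table 1: user id longer than six characters. */
static bool satisfiesLength(const char *uid) {
    return strlen(uid) > 6;
}

/* Condition (2) of Table 1: every character belongs to the set AN,
 * read here as the letters a-z and A-Z. */
static bool satisfiesCharset(const char *uid) {
    for (const char *p = uid; *p != '\0'; ++p) {
        bool alpha = (*p >= 'a' && *p <= 'z') || (*p >= 'A' && *p <= 'Z');
        if (!alpha)
            return false;
    }
    return true;
}

/* A user id is valid when (1)(2) holds; statements such as (1')(2)
 * are obtained by negating one of the two predicates. */
bool isValidUID(const char *uid) {
    return satisfiesLength(uid) && satisfiesCharset(uid);
}
```

For instance, "username" satisfies (1)(2), while "abcdef" violates (1) and "user123" violates (2); the complemented classes drive the invalid-input test cases.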
struct Roots * RootsHead = NULL;
struct Roots * RootsTail = NULL;

void addRoot(struct node * node)
{ if (RootsHead == NULL && RootsTail == NULL)
  { RootsHead = (struct Roots *) malloc(sizeof(struct Roots));
    RootsHead->N = node;
    RootsHead->next = NULL;
    RootsTail = RootsHead;
  }
  else
  { struct Roots * temp = (struct Roots *) malloc(sizeof(struct Roots));
    temp->N = node;
    temp->next = NULL;
    RootsTail->next = temp;
    RootsTail = temp;
  }
}

void removeNodeFromHead()
{ if (RootsHead != NULL)
  { struct Roots * temp = RootsHead;
    RootsHead = RootsHead->next;
    free(temp);
  }
}

int countRoots(struct Roots * RootsHead)
{ if (RootsHead != NULL)
  { int i = 1;
    struct Roots * temp = RootsHead;
    while (temp != RootsTail)
    { temp = temp->next;
      i = i + 1;
    }
    return (i);
  }
  else
  { return (0); }
}

struct node * makeRootNode(char NameOfArray[])
{ struct node * temp = (struct node *) malloc(sizeof(struct node));
  temp->value = NameOfArray;
  temp->Parent = NULL;
  for (int i = 0; i < Max + 1; ++i)
  { temp->Child[i] = NULL; }
  return (temp);
}

bool match(char NameOfArray[])
{ struct Roots * temp = RootsHead;
  while (temp != NULL)
  { if (strcmp(temp->N->value, NameOfArray) == 0)
    { return (true); }
    temp = temp->next;
  }
  return (false);
}

The linked-list representation of pointers to nodes is used to store intermediate results. One of the advantages provided by this storage is that it avoids backtracking and traversal. The size of this pointer list first increases, then starts to reduce and finally reduces to zero length; this happens because the number of stored pointers follows the sum of the combinations nCi, whose maximum is 2^(n-1) - 1.

struct ParentPointerNode { struct node * N;
  struct ParentPointerNode * next;
};

struct ParentPointerNode * ParentPointerHead = NULL;
struct ParentPointerNode * ParentPointerTail = NULL;

void addParentPointer(struct node * node)
{ if (ParentPointerHead == NULL && ParentPointerTail == NULL)
  { ParentPointerHead = (struct ParentPointerNode *) malloc(sizeof(struct ParentPointerNode));
    ParentPointerHead->N = node;
    ParentPointerHead->next = NULL;
    ParentPointerTail = ParentPointerHead;
  }
  else
  { struct ParentPointerNode * temp = (struct ParentPointerNode *) malloc(sizeof(struct ParentPointerNode));
    temp->N = node;
    temp->next = NULL;
    ParentPointerTail->next = temp;
    ParentPointerTail = temp;
  }
}

void removeNodeFromHead()
{ if (ParentPointerHead != NULL)
  { struct ParentPointerNode * temp = ParentPointerHead;
    ParentPointerHead = ParentPointerHead->next;
    free(temp);
  }
}

struct node * makeNode(char data[])
{ struct node * temp = (struct node *) malloc(sizeof(struct node));
  temp->value = data;
  temp->Parent = NULL;
  for (int i = 0; i < Max + 1; ++i)
  { temp->Child[i] = NULL; }
  return (temp);
}

      NewNode->Parent = N;
      N->Child[k] = NewNode;
      addParentPointer(NewNode);
      k = k + 1;
    }
  }
  temp->next = node;
  removeNodeFromHead();
 }
}

This will create an orchid of as many trees as the number of arrays, since we have one array for every single row. In the orchid, shown in figure 2, the black dots represent the roots of the trees.
Now we can produce all the applicable rules with the production system; these are as follows.

Rule 1
S → 6.7 S
S → 6.7 8

Rule 2

S → 6.7 S | S → 8 | S → 9

Figure 5. After elimination of roots
2.6 Combining trees

We can see that if we do not reduce the combination trees then we would have a huge number of possibilities and the number of test cases generated would be very large. As we have developed a control flow graph for the object under test, using it lets us limit the number of possibilities by which a user can interact with the form; with its help we fix the merger of the trees as follows (figure 6).
using Genetic Algorithm. Analysis of the experimental results is discussed in Section 5. Section 6 concludes the paper with features for future enhancements.

the data matrix. In web mining, there is no related work that has applied specific biclustering algorithms to discover coherent browsing patterns.
r_col(k, l) is the correlation between column k and column l. A high ACV suggests high similarities among the users or pages. ACV can tolerate translation as well as scaling, and it also works well for biclusters in which there is a linear correlation among the users or pages.

3.4 Greedy Search Procedure

A greedy algorithm repeatedly executes a search procedure which tries to maximize the bicluster based on examining local conditions, with the hope that the outcome will lead to a desired outcome for the global problem. This approach employs simple strategies that are easy to implement and most of the time quite efficient.

Structure of Greedy Search Procedure
Step 1: Start with an initial bicluster.
Step 2: For every iteration
  Add/remove the element (user/page) to/from the bicluster which maximizes the objective function.
End for

In this paper, the objective function is to maximize the ACV of a bicluster.

3.5 Encoding of Biclusters

Each enlarged and refined bicluster is encoded as a binary string. The length of the string is the number of rows plus the number of columns of the user access matrix A(U, P). A bit is set to one when the corresponding user or page is included in the bicluster. These binary-encoded biclusters are used as the initial population for the genetic algorithm.

3.6 Volume of Bicluster

The number of elements in bicluster B(I, J) is called the volume of bicluster B(I, J) and is denoted VOL(B(I, J)).

VOL(B(I, J)) = |I| × |J|   (3)

where |I| is the number of users in B and |J| is the number of pages in B.

4. Coherent Biclustering Approach Using Evolutionary Algorithm

The proposed algorithm is used to identify the optimal coherent biclusters in terms of volume and quality in three subsequent steps. The first step is to identify the initial biclusters, called seeds, by using the K-Means clustering algorithm. The second step is to enlarge and refine these seeds using a greedy search procedure, which results in a local optimum. The third step is to obtain the global optimum of biclusters using an evolutionary technique called the genetic algorithm. These overlapped coherent biclusters have a high degree of correlation among a subset of users and a subset of related pages of a web site.

This algorithm identifies the coherent browsing patterns from the web usage data, which play a vital role in direct marketing and target marketing. A one-to-one relation between web users and pages of a web site is not appropriate because web users are not strictly interested in one category of web pages. Therefore, the proposed algorithm is tuned to discover the overlapping coherent biclusters from clickstream data patterns.

4.1 Bicluster Formation using K-Means Algorithm

In this paper, the K-Means clustering method is applied on the web user access matrix A(U, P) along both dimensions separately to generate ku user clusters and kp page clusters, and the results are then combined to obtain small co-regulated submatrices (ku × kp) called biclusters. These correlated biclusters are called seeds.

4.2 Enlargement and Refinement of Bicluster Using Greedy Search Procedure

In this step, seeds are enlarged and refined by adding/removing rows and columns to enlarge their volume and improve their quality, respectively. The main goal of the greedy search procedure is to maximize the volume of the bicluster seed without degrading the quality measure.

Here, ACV is used as the merit function to grow the seeds. A user/page is inserted into or removed from the bicluster if doing so increases the ACV of the bicluster.

Algorithm 1: Seed Enlargement and Refinement using Greedy Search Procedure
Input: User Access Matrix A
Output: Set of enlarged and refined biclusters
Step 1. Compute ku user clusters and kp page clusters from preprocessed clickstream data.
Step 2. Combine the ku and kp clusters to form ku × kp biclusters called seeds.
Step 3. For each seed do
  Call Seed Enlargement(Seed(U, P))
Step 4. Return the optimal bicluster

Selection: The most commonly used form of GA selection, Roulette Wheel Selection (RWS), is used for the selection operator. When using RWS, a certain number of biclusters of the next generation are selected probabilistically, where the probability of selecting a

Each bicluster seed underwent four stages of the seed enlargement and refinement step. During each stage its ACV is incremented, which is shown in Fig 2. Since the quality of the bicluster is more important than the volume, the volume is adjusted in order to achieve a high ACV in the various stages of the second phase, which is portrayed in Fig 1.

Fig 1. (bar chart; y-axis values 1000-5000)

To avoid random interference, very tightly correlated biclusters obtained using the greedy search procedure are used as the initial population for the GA. Moreover, this results in quick convergence and provides a number of potential biclusters. These biclusters have high ACV and high volume, which is obvious from Table 4. This approach shows excellent performance at finding highly overlapped coherent biclusters from web data.

The metric index R is used to evaluate the overlapping degree between biclusters; it quantifies the amount of overlapping among biclusters. The degree of overlapping [7] is used as a quantitative index to evaluate the quality of the generated biclusters. The degree of overlapping among all biclusters is defined as

R = (1 / (|U| |P|)) Σ ...

Fig 2. ACV of Biclusters in Various Stages (series: PI ACV, UI ACV, PD ACV, UD ACV; Cocluster Index 1-10)

Table 4: Performance of Biclustering using GA
Method | Mean Volume | Mean ACV | Row Percentage | Column Percentage | Overlapping Degree
Two-Way K-Means | 494.9 | 0.4711 | - | - | 0
Greedy Search Procedure | 1599.8 | 0.9413 | - | - | 0.0192
Genetic Algorithm | 12715 | 0.9609 | 99.9 | 82.35 | 0.2152
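Roulette wheel selection, as used in the selection operator above, weights each bicluster's chance of entering the next generation by its fitness. The sketch below is a generic RWS picker, not code from the paper; the fitness values in the usage note are illustrative, and passing the random draw in as a parameter keeps the function deterministic and testable.

```c
#include <stddef.h>

/* Roulette wheel selection: return the smallest index i whose
 * cumulative fitness exceeds r * total, where r is a uniform
 * draw in [0, 1). Fitter individuals occupy wider wheel slices
 * and are therefore chosen proportionally more often. */
size_t rouletteSelect(const double *fitness, size_t n, double r) {
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += fitness[i];

    double threshold = r * total;
    double cumulative = 0.0;
    for (size_t i = 0; i < n; ++i) {
        cumulative += fitness[i];
        if (cumulative > threshold)
            return i;
    }
    return n - 1;   /* guard against floating-point round-off */
}
```

With fitness values {1, 2, 3, 4}, a draw of r = 0.05 lands on index 0 and r = 0.95 on index 3; in a real GA, r would come from a pseudo-random generator such as rand() / (double)RAND_MAX, with ACV-based fitness for each encoded bicluster.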
not be close, but the patterns they exhibit can be very similar. Our proposed biclustering framework is interested in finding such coherent patterns of biclusters of users, together with a general understanding of users' browsing interests. This method makes a significant contribution to the fields of web mining, E-Commerce applications, etc.

From the results, it is obvious that it correlates the relevant users and pages of a web site with a high degree of homogeneity. Analyzing these overlapping coherent biclusters could be very beneficial for direct marketing and target marketing, and also useful for recommender systems, web personalization systems, web usage categorization and user profiling. The interpretation of the biclustering results can also be used by a company for focalized marketing campaigns to improve its business performance.

CONCLUSION

The main contribution of this paper is twofold: the development of a coherent biclustering framework using GA to identify overlapped coherent biclusters from clickstream data patterns, and the use of a coherence quality measure called ACV to obtain coherent biclusters in the last two phases of the biclustering framework. The interpretation of the biclustering results can also be used towards improving a website's design, information availability and the quality of the provided services. The overlapping nature of the proposed framework can significantly contribute in this direction. This method has the potential to identify coherent patterns automatically from clickstream data. Future work aims at extending this framework by enriching the clustering process, which would result in enhanced cluster quality and a more accurate definition of the relation coefficients.

REFERENCES
[1] Bagyamani, J., Thangavel, K., "SIMBIC: Similarity Based Biclustering of Expression Data", Communications in Information Processing and Management, book chapter, Springer, Vol. 70, pp. 437-441, 2010.
[2] Bleuler, S., Prelic, A., Zitzler, E., "An EA Framework for Biclustering of Gene Expression Data", in Congress on Evolutionary Computation CEC2004, Vol. 1, pp. 166-173, 2004.
[3] Busygin, S., Jacobsen, G., Kramer, E., "Double Conjugated Clustering Applied to Leukemia Microarray Data", SIAM Data Mining Workshop on Clustering High Dimensional Data and its Applications, 2002.
[4] Cho, H., Dhillon, I.S., Guan, Y., Sra, S., "Minimum Sum-Squared Residue Co-clustering of Gene Expression Data", in Proceedings of the Fourth SIAM International Conference on Data Mining, 2004.
[5] Chakraborty, A., Maka, H., "Biclustering of Gene Expression Data Using Genetic Algorithm", IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, pp. 1-8, 2005.
[6] Chu-Hui Lee, Yu-Hsiang Fu, "Web Usage Mining Based on Clustering of Browsing Features", Eighth International Conference on Intelligent Systems Design and Applications, Vol. 1, pp. 281-286, 2008.
[7] Das, C., Maji, P., Chattopadhyay, S., "A Novel Biclustering Algorithm for Discovering Value-Coherent Overlapping Biclusters", Advanced Computing and Communications, pp. 148-156, 2008.
[8] Dhillon, I.S., Mallela, S., Modha, D.S., "Information-Theoretic Co-clustering", in Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 89-98, 2003.
[9] Getz, G., Levine, E., Domany, E., "Coupled Two-Way Clustering Analysis of Gene Microarray Data", PNAS, p. 12079, 2000.
[10] Guandong Xu, Yu Zong, Peter Dolog, Yanchun Zhang, "Co-clustering Analysis of Weblogs Using Bipartite Spectral Projection Approach", Knowledge-Based and Intelligent Information and Engineering Systems, Lecture Notes in Computer Science, Vol. 6278, pp. 398-407, 2010.
[11] Hartigan, J.A., "Direct Clustering of a Data Matrix", Journal of the American Statistical Association, pp. 123-129, 1972.
[12] Kluger, Y., Basri, R., Chang, J.T., Gerstein, M., "Spectral Biclustering of Microarray Data: Biclustering Genes and Conditions", Genome Research, pp. 703-716, 2003.
[13] Koutsonikola, V.A., Vakali, A., "A Fuzzy Bi-clustering Approach to Correlate Web Users and Pages", Int. J. Knowledge and Web Intelligence, Vol. 1, No. 1/2, pp. 3-23, 2009.
[14] Sandro Araya, Mariano Silva, Richard Weber, "A Methodology for Web Usage Mining and its Application to Target Group Identification", Fuzzy Sets and Systems, pp. 139-152, 2004.
[15] Srivastava, J., Cooley, R., Deshpande, M., Tan, P.N., "Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data", SIGKDD Explorations, Vol. 1, No. 2, pp. 12-23, 2000.
[16] Sujatha, N., Iyakutty, K., "Refinement of Web Usage Data Clustering from K-means with Genetic Algorithm", European Journal of Scientific Research, Vol. 42, No. 3, pp. 478-490, 2010.
[17] Tang, C., Zhang, A., "Interrelated Two-Way Clustering: An Unsupervised Approach for Gene Expression Data Analysis", Proc. Second IEEE Int'l Symp. Bioinformatics and Bioengineering, Vol. 14, pp. 41-48, 2001.
The remainder of this paper is organized as follows: Section 2 briefly reviews the processes defined in the HiLOW protocol, highlights the issues pertaining to the protocol, and surveys other works undertaken to improve HiLOW. Section 3 explains the proposed power selection method in detail. Section 4 presents the conclusion.

2. HiLOW Protocol and Existing Issues

A hierarchical routing protocol (HiLow) for 6LoWPAN was introduced by K. Kim in 2007 [11]. HiLOW exploits the dynamic 16-bit short address assignment capabilities of 6LoWPAN. HiLOW makes the assumption that multi-hop routing occurs in the adaptation layer by using the 6LoWPAN message format. The operations in HiLOW, ranging from the routing tree setup operation up to the route maintenance operation, and the issues revolving around each operation level, are discussed in the rest of this section.

2.1 HiLOW Routing Tree Setup, Issues and Other Works Done

The process of setting up the routing tree in HiLOW consists of a sequence of activities. The process is started by a node which tries to locate an existing 6LoWPAN network to join. The new node will use either an active or a passive scanning technique to identify an existing 6LoWPAN network in its Personal Operation Space (POS).

If the new node identifies an existing 6LoWPAN, it will then find a parent which takes it in as a child node. Three potential issues have been identified in this process. The first issue involving this protocol is that the nodes are assumed to communicate using maximum power transmission. Using maximum power transmission to communicate with its parent node is not advantageous; this method could lead to enhanced power drainage of the child node. For example, in a scenario where a child communicates with a parent using maximum power transmission (power level 10) even though it could communicate via a lower transmission level (power level 5), its power drainage is heightened by nearly 50%. So in this paper we propose a power selection method applied during the routing tree setup, implementing a binary search algorithm with the LQI value as qualifier. The proposed method is expected to reduce power wastage and heighten the network lifetime.

The second issue arises when the child node gets a response from more than one potential parent. There is no clear mechanism rolled out for selecting the suitable parent to attach to. If the new node chooses to join the first responding parent node, the result could be biased: some parents might be burdened with many children while other parents at the same level have fewer children or none at all. Selecting the parent based on the first responding potential parent could also lead to fast energy depletion at certain parents, causing the life span of the network to be shorter and its stability to be jeopardized. Selection of a parent without considering the link quality could cause a high retransmission rate, which will consume energy at the child node as well as the parent node.

In [15] a mechanism to overcome this issue was suggested. Their mechanism suggests that the potential parent node provide the new child with its existing child node count
(child_number). By issuing the child_number, the node can select a suitable parent which has fewer child nodes. The suggested mechanism performs well only when the potential parent nodes have the same depth, the same energy level and different numbers of existing children. Their mechanism also does not take into consideration the quality of the link established between the parent node and the child node. Therefore the suggested mechanism does not solve the arising issue completely. In order to overcome the weakness in the selection method, a comprehensive parent selection method that takes into consideration the link quality, the existing energy of the potential parent and the depth of the parent has been proposed in [16]. That paper theoretically discusses how the proposed method can overcome biased child attachment in different scenarios.

The third issue revolves around the MC value, which is fixed for all nodes. The current scenario works well in a homogeneously powered sensor environment where all the sensors' power source is the same; for example, all are battery powered with the same type of battery, or all sensors are non-battery powered and have the same power source. Meanwhile, in a heterogeneous power source sensor environment this method is not advantageous, as sensors which are mains powered and affluent in energy should be assigned more children compared to battery powered sensors. This is an open issue to be addressed in HiLOW, and the assumption that all nodes have the same energy conservation has to be made. The activity of disseminating the MC value to joining nodes is also left in gray. This issue is not addressed in this paper.

2.2 Routing Operation in HiLOW

Sensor nodes in 6LoWPAN can distinguish each other and exchange packets after being assigned the 16-bit short address. HiLOW assumes that all the nodes know their own depth in the routing tree. The receiving intermediate nodes can identify the parent node's address through the defined formula (2), where the [ ] symbol represents the floor operation.

AC : Address of Current Node
MC : Maximum Allowed Child

AP = [(AC - 1) / MC]   (2)

By using the above formula, the receiving intermediate nodes can also identify whether they are an ascendant node or a descendant node of the destination. When a node receives a packet, the node determines the next hop node to forward the packet by following the three cases (3) as shown in [10]. So far no issues have been identified in this process.

SA : Set of ascendant nodes of the destination node
SD : Set of descendant nodes of the destination node
AA(D, k) : The address of the ascendant node of depth D of the node k
DC : The depth of the current node
C : The current node

Case 1: C is a member of SA   (3)
  The next hop node is AA(DC + 1, D)
Case 2: C is a member of SD
  The next hop node is AA(DC - 1, C)
Case 3: Otherwise
  The next hop node is AA(DC - 1, C)

2.3 Route Maintenance in HiLOW

Each node in HiLOW maintains a neighbor table which contains the information of the parent and the children nodes. When a node loses the association with its parent, it should try to re-associate with its previous parent by utilizing the information in its neighbor table. In case the association with the parent node cannot be recovered due to situations such as the parent node's battery being drained, node mobility, malfunction and so on, the node should try to associate with a new parent in its POS [11]. Meanwhile, if the current node realizes that the next-hop node, regardless of whether it is a child or a parent node, is not accessible for some reason, the node shall try to recover the path or report this forwarding error to the source of the packet.

Even though a route maintenance mechanism has been defined in HiLOW, the mechanism is seen as not sufficient to maintain the routing tree. An Extended Hierarchical Routing Over 6LoWPAN, which extends HiLOW, was presented in [16] in order to achieve a better maintained routing tree. The authors suggested two additional fields to be added to the existing routing table of HiLOW, namely Neighbour_Replace_Parent (NRP) and Neighbour_Added_Child (NAC). The NRP doesn't point to the current parent node but to another node which can become the parent if the association to the current parent fails. Meanwhile, NAC refers to the newly added child node. More work needs to be done on this mechanism regarding how many nodes are allowed to be adopted by a parent node in
Whether this mechanism will have any impact on the routing operation also remains to be studied; however, this topic is beyond the scope of this paper.

3. Transmission Power Level Selection Method for HiLOW

A transmission power level selection method implementing a binary search algorithm coupled with a maximum search round and an LQI value as qualifier is presented in this paper. The suggested method is able to reduce the number of nodes communicating with their parent node using maximum transmission power; by doing so, the energy used in transmission is reduced and the network lifetime is extended.

A binary search method coupled with a maximum search round is selected over an incremental search or a pure binary search in order to reduce the number of rounds a node undergoes to search for a parent. Table 1 displays the maximum number of search rounds possible in the worst-case scenario, based on different numbers of power levels, for the three different searches. From the table it can easily be deduced that the binary search algorithm is more efficient in worst-case scenarios, and the mechanism suggested in this paper limits the number of searches even further, as energy is very crucial for sensor nodes.

Table 1: Maximum Search Rounds in worst case scenario for three different types of search method

Power Levels (N) | Incremental/Linear Search | Binary Search (log2 N) | Suggested Search (MR = 4)
5                | 5                         | 3                      | 3
10               | 10                        | 4                      | 4
20               | 20                        | 5                      | 4
30               | 30                        | 5                      | 4
40               | 40                        | 6                      | 4

An assumption is made that all the nodes have mapped their Tx Power Setting to an output power and that the Tx Power Setting increments by 1 from one level to the next, for example as set by default in Atmel Raven nodes, shown in Table 2. An assumption that different power level settings draw different battery currents is also made; for example, in the Atmel Raven, when the output power is 0 dBm the battery current is quoted to be less than 13 mA, and in the case of full output power (-17 dBm) the consumed battery current is 16-17 mA.

Table 2: Default Power Mapping in Atmel Raven Sensor Nodes [19]

TX Power Setting | Output Power [dBm]
0  | 3
1  | 2.6
2  | 2.1
3  | 1.6
4  | 1.1
5  | 0.5
6  | -0.2
7  | -1.2
8  | -2.2
9  | -3.2
10 | -4.2
11 | -5.2
12 | -7.2
13 | -9.2
14 | -12.2
15 | -17.2

The transmission power selection process starts when a node is looking to join a network in its POS, as shown in Fig. 1. Before starting a scan, the node needs to determine the Lowest Transmit Power (LP) and the Highest Transmit Power (HP). The node then sets the Optimum Transmit Power (OP) value to be equivalent to HP. Then the Search Transmit Power (SP) value has to be determined, following the mathematical equation in (4). The Current Search (CR) value is also set to 0.

Two values are to be set at compile time. One is the Maximum Search Round (MR); the MR value has to be set for all nodes and should be the same for all nodes. MR is basically the maximum number of search rounds a node can go through before it terminates the search. The second value is the accepted LQI value.

SP : Search Transmit Power Level
HP : Highest Transmit Power Level
LP : Lowest Transmit Power Level

SP = [((HP - LP) - 1) / 2]   (4)

The node then uses the SP to search for a potential parent.

In case 1, where the node does not find any potential parent, it will set the LP value to the SP value.
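The selection procedure of Fig. 1 (case 1, case 2, the CR/MR bound, and the SP > LP stopping test described around the figure) can be summarised in a short sketch. This is our reading rather than the authors' code: power levels are treated as an abstract index in which a larger index means a higher transmit power, equation (4) is interpreted as an offset added to LP, and probe() is a hypothetical primitive returning the LQI of a discovered potential parent, or None when no parent answers.

```python
def select_tx_power(lp, hp, mr, accepted_lqi, probe):
    """Binary-search transmit power selection, a sketch of Fig. 1."""
    op = hp                        # start pessimistically at highest power
    cr = 0                         # current search round
    sp = lp + (hp - lp - 1) // 2   # equation (4), read as an offset from LP
    while True:
        lqi = probe(sp)
        if lqi is not None and lqi > accepted_lqi:
            hp = sp                # case 2: parent found with acceptable LQI
            op = sp
        else:
            lp = sp                # case 1 (or poor LQI): need more power
        cr += 1
        if cr > mr:                # maximum search rounds exhausted
            return op
        sp = lp + (hp - lp - 1) // 2
        if sp <= lp:               # search interval has closed
            return op
```

With 16 levels, as in Table 2, and MR = 4, the node probes only a bounded number of levels and falls back to maximum power when no parent is ever heard, matching the bounded rounds claimed in Table 1.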
Fig. 1 Proposed power level selection method for HiLOW

In case 2, where the node does find a potential parent, it will compare the measured LQI value with the accepted LQI value. Where the LQI value is more than the accepted LQI, HP will be set equal to SP and OP will also be set equal to SP. In the other case, the LP value will be set to the SP value and the OP value remains unchanged.

Regardless of which case the node encounters, it then continues with the same process: increment CR by 1, then determine whether CR is more than MR. If this condition is true, the node terminates the search process and sets the transmission power level to be equivalent to OP. If the condition is not true, SP is determined again, and the new SP is compared with LP to ensure that it is higher than LP; if it is not, the process is also terminated and the transmission power level is set to be equivalent to OP. If it is, the process loops back to searching for a parent using the SP power level.

4. Conclusions

In this paper, a review of HiLOW, the issues revolving around each process in HiLOW, and other works done in this area are presented. A new transmission power level selection method implementing a binary search algorithm coupled with a maximum search round and an LQI value as qualifier is also presented. The presented power level selection method is believed to be able to overcome the problem of maximum power usage for every transmission, by which the network lifetime could be increased. The presented power selection method is also better than the linear search method and the pure binary search method discussed in this paper, as it exits the search in a fixed number of rounds. Even though the method is suggested for HiLOW, it could easily be adapted to other types of hierarchical routing. Our future research will be focused on validating the suggested mechanism as well as adapting it to other routing protocols such as LEACH.

Acknowledgments

References
[1] Lee, S.hyun. & Kim Mi Na., "Wireless Sensors for Home Monitoring - A Review", Recent Patents on Electrical Engineering, 2008, Vol. 1, No. 1, pp. 32-39.
[2] Ian F. Akyildiz, Weilian Su, Yogesh Sankarasubramaniam and Erdal Cayirci, "A Survey on Sensor Networks", Communications Magazine, IEEE, Volume 40, 2002.
[3] A. Dunkels, F. Osterlind, N. Tsiftes, and Z. He, "Software-based online energy estimation for sensor nodes", Proceedings of the Fourth IEEE Workshop on Embedded Networked Sensors (Emnets IV), Cork, Ireland, 2007.
[4] L. Martin, B. D. Mads, B. Philippe, "Bluetooth and sensor networks: a reality check", Proceedings of the 1st International Conference on Embedded Networked Sensor Systems, 2003.
[5] S. Jianping, et al., "WirelessHART: Applying Wireless Technology in Real-Time Industrial Process Control", in Proceedings of IEEE Real-Time and Embedded Technology and Applications Symposium, 2008.
[6] B. Chiara, C. Andrea, D. Davide, V. Roberto, "An Overview on Wireless Sensor Networks Technology and Evolution", Sensors 2009, 9, 6869-6896; doi:10.3390/s90906869
[7] N. Kushalnagar, et al., "Transmission of IPv6 Packets over IEEE 802.15.4 Networks", RFC 4944, September 2007.
[8] IEEE Computer Society, 802.15.4-2006 IEEE Standard for Information Technology - Telecommunications and Information Exchange Between Systems - Local and Metropolitan Area Networks - Specific Requirements Part 15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks (WPANs)
[9] K. Kim, S. Yoo, S. Daniel, J. Lee, G. Mulligan, "Problem Statement and Requirements for 6LoWPAN Routing", draft-ietf-6lowpan-routing-requirements-02, March 2009
[10] K. Kim, S. Yoo, S. Daniel, J. Lee, G. Mulligan, "Commissioning in 6LoWPAN", draft-6lowpan-commissioning-02, July 2008
[11] K. Kim, et al., "Hierarchical Routing over 6LoWPAN (HiLOW)", draft-daniel-6lowpan-hilow-hierarchical-routing-01, June 2007.
[12] K. Kim, et al., "Hierarchical Routing over 6LoWPAN (HiLOW)", draft-daniel-6lowpan-hilow-hierarchical-routing-00, June 2005.
[13] K. Kim, G. Montenegro, S. Park, I. Chakeres, C. Perkins, "Dynamic MANET On-demand for 6LoWPAN (DYMO-low) Routing", draft-montenegro-6lowpan-dymo-low-routing-03, June 2007.
[14] K. Kim, S. Daniel, G. Montenegro, S. Yoo, N. Kushalnagar, "6LoWPAN Ad Hoc On-Demand Distance Vector Routing (LOAD)", draft-daniel-6lowpan-load-adhoc-routing-02, March 2006
[15] K. Kim, S. Daniel, G. Montenegro, S. Yoo, N. Kushalnagar, "6LoWPAN Ad Hoc On-Demand Distance Vector Routing (LOAD)", draft-daniel-6lowpan-load-adhoc-routing-02, March 2006
[16] Hun-Jung Lim, Tai-Myoung Chung, "The Bias Routing Tree Avoiding Technique for Hierarchical Routing Protocol over 6LoWPAN", 2009 Fifth International Joint Conference on INC, IMS and IDC.
[17] C. Nam, H. Jeong, D. Shin, "Extended Hierarchical Routing Protocol over 6LoWPAN", MCM2008, September 2008.
[18] C. Lingeswari et al., "Bias Child Node Association Avoidance Mechanism for Hierarchical Routing Protocol in 6LoWPAN", Proceedings of the Third IEEE International Conference on Computer Science and Information Technology
[19] AVR2002: Raven Radio Evaluation Software
[20] Zhu Jian, Zhao Lai, "A Link Quality Evaluation Model in Wireless Sensor Networks", Proceedings of the 2009 Third International Conference on Sensor Technologies and Applications

Mr. Selvakumar Manickam obtained his Bachelor of Computer Science and Master of Computer Science from Universiti Sains Malaysia in 1999 and 2003 respectively. He is a lecturer and domain head of industrial & community linkages of the National Advanced IPv6 Centre of Excellence (NAV6) in Universiti Sains Malaysia. His research areas are information architecture, network technology and management as well as IPv6 in Bioinformatics.

Kok-Soon Chai is a certified Project Management Professional by the Project Management Institute, USA. He received his MSc and Ph.D. (2003) degrees from the University of Warwick, UK. He worked for more than seven years as a senior R&D software engineer, embedded software manager, and CTO at Motorola, Agilent, Plexus Corp., Wind River in Singapore (now a division of Intel Corp.), and NeoMeridian. He holds one US patent, with two US patents pending. His main interests are wired and wireless sensor networks, green technology, embedded systems, consumer electronics, and real-time operating systems. Dr. Chai is a senior lecturer at the National Advanced IPv6 Centre of Excellence (NAV6) in Universiti Sains Malaysia.

Sureswaran Ramadass obtained his BsEE/CE (Magna Cum Laude) and Masters in Electrical and Computer Engineering from the University of Miami in 1987 and 1990, respectively. He obtained his Ph.D. from Universiti Sains Malaysia (USM) in 2000 while serving as a full-time faculty in the School of Computer Sciences. Dr. Sureswaran Ramadass is a Professor and the Director of the National Advanced IPv6 Centre of Excellence (NAV6) in Universiti Sains Malaysia.
ME, Software Engineering,
Department of Computer Science and Engineering,
PSG College of Technology, Coimbatore-641 004, India
Abstract
Cloud computing is an attractive concept in the IT field, since it allows resources to be provisioned according to user needs [11]. It provides services on virtual machines whereby the user can share resources, software and other devices on demand. Cloud services are supported by both proprietary and open source systems. As proprietary products are very expensive, customers are not allowed to experiment on them, and security is a major issue; open source systems help in solving these problems. Cloud computing has motivated many academic and non-academic members to develop open source cloud setups, where users are allowed to study the source code and experiment with it. This paper describes the configuration of a private cloud using Eucalyptus. Eucalyptus, an open source system, has been used to implement a private cloud using the existing hardware and software without making any modification to it, and to provide various types of services to the cloud computing environment.
Keywords: Cloud Computing, Open Source, Private Cloud.

1. Introduction

Cloud computing is a computing environment where resources such as computing power, storage, network and software are abstracted and provided as services on the internet in a remotely accessible fashion. Billing models for these services are generally similar to the ones adopted for public utilities. On-demand availability, ease of provisioning, and dynamic and virtually infinite scalability are some of the key attributes of cloud computing [6].

The main concept behind cloud computing is providing services. It provides various types of services; some of the important ones are SaaS, PaaS and IaaS. Software as a Service is a model of software deployment whereby a provider licenses an application to customers for use as a service on demand. Platform as a Service generates all facilities required to support the complete cycle of construction and delivery of web-based applications wholly available on the Internet without the need for downloading software or special installations. Finally, Infrastructure as a Service provides informatics resources, such as servers, connections, storage and other necessary tools to construct an application design prepared to meet the different needs of multiple organizations, making it quick, easy and economically viable [4].

Cloud computing is mainly classified into three types based on the deployment model: public cloud, private cloud and hybrid cloud. If the services are provided over the internet then it is a public cloud or external cloud; if they are provided within an organization through an intranet then it is named a private cloud or internal cloud; and a hybrid cloud is an internal/external cloud which allows a public cloud to interact with the clients but keeps their data secured within a private cloud [7].

This paper explains EUCALYPTUS: an open-source system that enables an organization to establish its own cloud computing environment. Eucalyptus is structured from various components which interact with each other through well-defined interfaces. It is used for implementing on-premise private and hybrid clouds using the hardware and software infrastructure that is in place, without modification.

2. Eucalyptus

Eucalyptus (Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems) was released in May 2008 and is the creator of the leading open-source private cloud platform. The company was incorporated as an organization in January 2009, headquartered in Santa Barbara, California.

Eucalyptus software is available under the GPL (General Public License) and helps in creating and managing a private or even a publicly accessible cloud. It provides an EC2 (Elastic Compute Cloud)-compatible cloud computing platform and an S3 (Simple Storage Service)-compatible cloud storage platform.
Eucalyptus is one of the key open source cloud platforms, which makes it very popular. The client tools used for Eucalyptus are the same as those for AWS, because Eucalyptus services are available through EC2/S3-compatible APIs [6].

3. UEC Architecture

Ubuntu Enterprise Cloud (UEC) is a private cloud set up for developing our own IT infrastructure. UEC comes with many open source software packages; Eucalyptus is one among them, and it makes the installation and configuration of the cloud easier. Canonical also provides commercial technical support for UEC. The basic architecture of UEC consists of a front end which runs one or more Cloud Controllers (CLC), Cluster Controllers (CC), Walrus (WS3), Storage Controllers (SC), and one or more nodes [6]. The architecture of UEC is shown in Fig. 2. A CLC manages the whole cloud and includes multiple CCs. There will be a WS3 attached to a CLC. A CC can contain multiple NCs and SCs. Ultimately the VMs will be running in the NC, making use of its physical resources [5].

Fig. 3. UEC basic setup with Three Machines [6].
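The containment relationships just described (a CLC with Walrus attached, CCs under it, and NCs and SCs under each CC) can be mirrored in a small data model. The class and field names below are ours, for illustration only; the addresses echo the setup used later in Section 5.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class NodeController:
    host: str                    # the NC runs the VMs on its physical resources

@dataclass
class StorageController:
    host: str

@dataclass
class ClusterController:
    name: str
    nodes: List[NodeController] = field(default_factory=list)
    storage: List[StorageController] = field(default_factory=list)

@dataclass
class CloudController:
    walrus_host: str             # a WS3 is attached to the CLC
    clusters: List[ClusterController] = field(default_factory=list)

    def total_nodes(self):
        return sum(len(cc.nodes) for cc in self.clusters)

# Mirroring the basic setup of Fig. 3: one front end, one cluster, one node.
cloud = CloudController(walrus_host="192.168.4.145")
cloud.clusters.append(
    ClusterController("cluster1", nodes=[NodeController("192.168.4.146")]))
```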
5. Steps in Configuring an Open Source Private Cloud

Table 1 lists the steps together with their descriptions/commands.

Installation Procedure for Server 1
Install Ubuntu Server 10.04 in Server 1: Boot the server off the CD for installation.
Setup the IP address details: 192.168.4.145 (please do that for eth0).
Cloud Controller Address: Leave this blank, as Server1 is the Cloud Controller in this setup.
Cloud Installation Mode: Select "Cloud controller", "Walrus storage service", "Cluster controller" and "Storage controller".
Network interface for communication: Select eth1.
Eucalyptus cluster name: Cluster 1.
Eucalyptus IP range: 192.168.4.155-192.168.4.165.

Installation Procedure for Server 2
Install Ubuntu Server 10.04 in Server 2: Boot the server off the CD for installation.
Setup the IP address for one interface: Please do that for eth0 by setting up the private IP - 192.168.4.146.
Cloud Controller Address: 192.168.4.145.
Cloud Installation Mode: Select "Node Controller".
Gateway: 192.168.4.145 (IP of the CC).

Installation Procedure for Client 1
Install Ubuntu Desktop 10.04 in Client: Boot the Desktop off the CD for installation.
IP Address: The Desktop will be on the enterprise network and will obtain an IP address through DHCP.
Install KVM: To help us install images on the KVM platform and bundle them.

Invoke the Web Interface
Login to the web interface of CLC: https://192.168.4.145:8443/ - the default username is "admin" and the default password is "admin".
Download the credentials: From https://192.168.4.145:8443/#credentials and save it in the ~/.euca directory.
Extract the credentials archive: $ cd .euca ; $ unzip mycreds.zip
Source eucarc: $ . ~/.euca/eucarc
Verify euca2ools communication with UEC: $ euca-describe-availability-zones verbose

Running Instances
Installing Images: From Canonical over the internet (no proxy); check the Store tab.
Checking the available Images: $ euca-describe-images
Installing a Keypair: $ euca-add-keypair mykey > ~/.euca/mykey.priv ; $ chmod 0600 ~/.euca/mykey.priv
Running an Instance (Ubuntu 9.10, using terminals): $ euca-run-instances -k mykey -t c1.medium emi-E088107E
Hybridfox: Used to run the instances using a GUI.
Life cycle of an Instance: Pending - Running - Shutting down - Terminated / Reboot ($ euca-run-instances, $ euca-terminate-instances, $ euca-reboot-instances).

Table.1. Configuration Steps

6. ALGORITHM

6.1 Installing server1

1. Boot the server off the Ubuntu Server 10.04 CD. At the graphical boot menu, select "Install Ubuntu Enterprise Cloud" and proceed with the basic installation steps.
2. Installation only lets you set up the IP address details for one interface. Please do that for eth0.
3. We need to choose certain configuration options for UEC during the course of the install.
4. Cloud Controller Address - Leave this blank as Server1 is the Cloud Controller in this setup.
5. Cloud Installation Mode - Select "Cloud controller", "Walrus storage service", "Cluster controller" and "Storage controller".
6. Network interface for communication with nodes - eth1
7. Eucalyptus cluster name - cluster1
8. Eucalyptus IP range - 192.168.4.155-192.168.4.165 [6].

6.2 Installing server 2

1. Boot the server off the Ubuntu Server 10.04 CD. At the graphical boot menu, select "Install Ubuntu Enterprise Cloud" and proceed with the basic installation steps.
2. Installation only lets us set up the IP address for one interface. Please do that for eth0 by setting up the private IP - 192.168.4.146.
3. Then choose certain configuration options for UEC during the course of the install. Ignore all the settings, except the following:
4. Cloud Controller Address - 192.168.4.145
5. Cloud Installation Mode - Select "Node Controller"
6. Gateway - 192.168.4.145 (IP of the CC) [6].

6.3 Installing Client 1

The purpose of the Client1 machine is to interact with the cloud setup, for bundling and registering new Eucalyptus Machine Images (EMI).
1. Boot the Desktop off the Ubuntu Desktop 10.04 CD and install. The Desktop will be on the enterprise network and will obtain an IP address through DHCP.
2. Install KVM to help us install images on the KVM platform and bundle them:
$ apt-get install qemu-kvm [6].

6.4 Algorithm for Invoking the Web Interface

1. Log in to the web interface of the CLC by using the following link: https://192.168.4.145:8443. The default username is "admin" and the default password is "admin".
2. Note that the installation of UEC installs a self-signed certificate for the web server. The browser will warn us about the certificate not having been signed by a trusted certifying authority. Authorize the browser to access the server with the self-signed certificate.
3. When you log in for the first time, the web interface prompts you to change the password and provide the email ID of the admin. After completing this mandatory step, download the credentials archive from https://192.168.4.145:8443/#credentials and save it in the ~/.euca directory.
4. Extract the credentials archive:
$ cd .euca
$ unzip mycreds.zip
5. Source the eucarc script to make sure that the environment variables used by euca2ools are set properly.
$ . ~/.euca/eucarc
6. To verify that euca2ools are able to communicate with the UEC, try fetching the local cluster availability details as shown in Fig. 4.
$ euca-describe-availability-zones verbose

Fig. 4 Snapshot for list of Available Resources

7. If the free/max VCPUs are set as 0 in the above list, it means that the node did not get registered automatically. Use the following on Server1 and approve when prompted to add 192.168.4.146 as the Node Controller:
$ sudo euca_conf --discover-nodes [6].

7. Running Instances

7.1 Installing Cloud Images

No images exist by default in the Store (web interface). Running an instance or VM in the cloud is only possible based on an image. Images can be installed directly from the Canonical online cloud image store, or we can also build a custom image, bundle it, and upload and register it with the cloud. The Store tab in the web interface shows the list of images that are available from Canonical over the internet [6].

Fig. 5. List of Images from Store
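Step 5 of Section 6.4 sources eucarc so that euca2ools pick up their environment variables. For scripting around the same file, the export lines can also be read from Python. The snippet is a simplified sketch: variable names such as EC2_URL and S3_URL are the ones an eucarc typically exports, and real eucarc files may use ${VAR} substitutions, which this parser does not expand.

```python
def parse_eucarc(text):
    """Collect KEY=value pairs from 'export KEY=value' lines."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("export ") and "=" in line:
            key, _, value = line[len("export "):].partition("=")
            env[key.strip()] = value.strip().strip('"').strip("'")
    return env

# Example content modelled on the CLC used in this setup.
sample = '''
export EC2_URL="http://192.168.4.145:8773/services/Eucalyptus"
export S3_URL="http://192.168.4.145:8773/services/Walrus"
'''
```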
7.2 Checking Images

euca-describe-images is the command-line equivalent of clicking the Images tab in the Eucalyptus administrative web interface. It shows the emi-xxxxxx identifier for each image/bundle that will be used to run an instance.

$ euca-describe-images
IMAGE emi-E088107E image-store-1276733586/image.manifest.xml admin available public x86_64 machine eki-F6DD1103 eri-0B3E1166
IMAGE eri-0B3E1166 image-store-1276733586/ramdisk.manifest.xml admin available public x86_64 ramdisk
IMAGE eki-F6DD1103 image-store-1276733586/kernel.manifest.xml admin available public x86_64 kernel

7.5 Hybridfox

Hybridfox provides compatibility between the Amazon public cloud and the Eucalyptus private cloud [9]. The Hybridfox tool is a modified or extended Elasticfox that enables us to switch seamlessly between different cloud clusters in order to manage the overall cloud computing environment. Hybridfox can perform all the functions that can be done by Elasticfox on the Eucalyptus computing environment, like Manage Images, Raise and Stop Instances, Manage Instances, Manage Elastic IPs, Manage Security Groups, Manage Keypairs and Manage Elastic Block Storage [3]. Running a different instance by using Hybridfox is shown below in Fig. 6.
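The IMAGE lines shown in Section 7.2 are whitespace-separated, so the identifiers can be picked out mechanically when scripting against the cloud. A sketch, with field positions taken from the sample output above (not a complete parser for every euca-describe-images variant):

```python
def parse_images(output):
    """Return a dict per IMAGE line: identifier, manifest path and type."""
    images = []
    for line in output.splitlines():
        parts = line.split()
        if parts and parts[0] == "IMAGE":
            images.append({"id": parts[1],
                           "manifest": parts[2],
                           "type": parts[7]})   # machine / ramdisk / kernel
    return images

sample = """IMAGE emi-E088107E image-store-1276733586/image.manifest.xml admin available public x86_64 machine eki-F6DD1103 eri-0B3E1166
IMAGE eri-0B3E1166 image-store-1276733586/ramdisk.manifest.xml admin available public x86_64 ramdisk
IMAGE eki-F6DD1103 image-store-1276733586/kernel.manifest.xml admin available public x86_64 kernel"""
```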
Fig. 7. Life Cycle of an Instance [6].

8. Future Scope

Types of Services
A cloud can provide services either as a private or a public cloud. In a public cloud, the services are provided to clients based on demand, and in a private cloud the service is provided to a single client [10]. The combination of both public and private cloud is called a hybrid private cloud, where the private cloud is hosted in a public cloud. Services that are included in the cloud setup are listed in Table 2.

Table 2. List of Services

8.1 Web Service
A user can access a web page from any computer connected to the cloud by using the Apache web server. Install the Apache web server in the instance and access the service:
$ sudo apt-get install apache2

8.2 Compiler as a Service
This service is provided to compile the C++ file. Even if the client does not have a compiler, the file can be compiled with the compiler available from the cloud. The user is sshed into the instance with certain privileges and allowed to compile and see the result.
$ ssh cloud@<IP Address>
Installing the GNU C++ compiler in an instance:
$ sudo apt-get install build-essential

9. Conclusion
Cloud computing is an everlasting computing environment where data are delivered on demand to authenticated devices in a secure manner and users utilize a shared and elastic infrastructure. This paper briefly explains the setup of a private cloud in a cluster-based environment using open source technologies like Eucalyptus, KVM, and euca2ools. The virtual machine images are available in the cloud and, upon user request, instances of them are created and run. Services were included successfully and made available to the user. The current implementation of this paper provides Infrastructure as a Service (IaaS) and Software as a Service (SaaS).

References
[1] Cloud Computing (2010), Wikipedia; en.wikipedia.org/wiki/
[2] Dr. Rich Wolski, (2010) Enterprise Cloud Control.
[3] Ezhil Arasan Babaraj, (2009), Driving Technology Direction on Cloud Computing Platform, Blog post; Hybridfox: Cross of Elasticfox and Imagination, ezhil.sys-con.com/.
[4] Glossary, (2010), MasterBase, www.en.masterbase.com/support/glossary.asp.
[5] Installing the Eucalyptus Cloud/Cluster/Storage Node on Ubuntu Karmic 9.10, dustinkirkland, www.YouTube.com
[6] Johnson D, Kiran Murari, Murthy Raju, Suseendran RB, Yogesh Girikumar (2010), Eucalyptus Beginner's Guide - UEC Edition, CSS Open Source Services, UEC Guide.v1.0. (Ubuntu Server 10.04 - Lucid Lynx).
[7] Judith H, Robin B, Marcia K, and Dr. Fern H, Dummies.com, Comparing-Public-Private-and-Hybrid-cloud-computing. Wiley Publishing, Inc. 2009.
[8] Kefa Rabah, (2010) Build Your Own Private Cloud Using Ubuntu 10.04 Eucalyptus Enterprise Cloud Computing Platform v1.2.
[9] Mitchell pronsc, (2009) Hybridfox: Elasticfox for Eucalyptus.
[10] Partha Saradhi K S (2010), Types of Cloud Computing Services, Information Security.
[11] Patrícia T Endo, Glauco E Gonçalves, Judith K, Djamel S (2010), A Survey on Open-source Cloud Computing Solutions, VIII Workshop em Clouds, Grids e Aplicações, pp. 3-16.
[12] Private cloud, (2008) SearchCloudComputing.com, Definitions; WhatIs.com
[13] Simon Wardley, Etienne Goyer & Nick Barcet, (2009), CANONICAL, Technical White Paper, Ubuntu Enterprise Cloud Architecture.
Real-Time Strategy Experience Exchanger Model [Real-See]
Mostafa Aref 1, Magdy Zakaria 2 and Shahenda Sarhan 3

1 Faculty of Computers and Information, Ain-Shams University, Cairo, Egypt
2 Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
3 Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
The better the balance you achieve among economy, technology, and army, the more chances you have to win.

Although many studies exist on learning to win games with comparatively small search spaces, few studies exist on learning to win complex strategy games. Some researchers argued that agents require sophisticated representations and reasoning abilities to perform well in these environments, so they are challenging to construct. Fortunately, Ponsen and Spronck (2004) [14] developed a good representation for WARGUS, a moderately complex RTS game. They also employed a high-level language for game agent actions to reduce the decision space. Together, these constrain the search space of useful plans and state-specific sub-plans, allowing them to focus on the performance task of winning RTS games.

Marthi, Russell, and Latham (2005) [11] applied hierarchical reinforcement learning (RL) in a limited RTS domain. This approach used reinforcement learning augmented with prior knowledge about the high-level structure of behavior, constraining the possibilities of the learning agent and thus greatly reducing the search space.

Ponsen, Muñoz-Avila, Spronck and Aha (2006) [12] introduced the Evolutionary State-based Tactics Generator (ESTG), which focuses on the highly complex learning task of winning complete RTS games and not only specific restrained scenarios.

2.2 Case-based Reasoning

Case-based Reasoning (CBR) is a plausible generic model of intelligence and a cognitive science-based method, in that it solves problems by making use of previous, similar situations and reusing information and knowledge about such situations. CBR [13] combines a cognitive model describing how people use and reason from past experience with a technology for finding and presenting such experience. The processes involved in CBR can be represented by a schematic cycle as shown in figure (1).

1. Retrieval is the process of finding the cases in the case-base that most closely match the current information known (the new case) [1][8].
2. Reuse is the step where [1] matching cases are compared to the new case to form a suggested solution.
3. Revision is the testing of the suggested [8] solution to make sure it is suitable and accurate.
4. Retention is the storage of new cases for future reuse.

Fig. 1 Aamodt Case-based reasoning cycle [1]

2.2.1 Case-based Reasoning related to RTS

In this section we will try to summarize some case-based reasoning research on real-time and/or strategy games. Some CBR research has targeted real-time individual games, such as Goodman's (1994) [7] projective visualization for selecting combat actions and predicting the next action of a human playing Space Invaders. MAYOR (1996) [6] used a causal model to learn how to reduce the frequency of failed plan executions in SimCity, a real-time city management game, whereas Ulam et al.'s (2004) [17] meta-cognitive approach performs failure-driven plan adaptation for the Freeciv game. They employed substantial domain knowledge, and addressed a gaming sub-task (i.e., defend a city).

Molineaux and Ponsen (2005) [2] relax the assumption of a fixed adversary, and develop a case-based approach that learns to select which tactic to use at each state. They implemented this approach in the Case-based Tactician (CAT). They reported learning curves that demonstrate its performance quickly improves with training, even though the adversary is randomly chosen for each WARGUS game. CAT is the first case-based system designed to win against random opponents in an RTS game.

Santiago et al. (2007) proposed Darmok [15] as the base reasoning system, which is a case-based planning system designed to play real-time strategy (RTS) games. In order to play WARGUS, Darmok learns plans from expert demonstrations, and then uses case-based planning to play the game, reusing the learnt plans.

In this section, different concepts and topics related to RTS games were explained. All challenges that face RTS games were concerned with increasing game intelligence through improving tactics, reinforcement learning, player satisfaction and modeling opponents. But our concern
was different; we tried to increase game intelligence not through learning but through exchanging experiences between game engines, as we will try to explain in the next section.

3. Real-Time Strategy Experience Exchanger Model [Real-See]

Usually, if you want to update an application, you just download its update from its web site. But what would you do if your engine of the application is more up to date than the source itself? This cannot happen in ordinary applications, but here we are talking about RTS games, which depend on agents trained by recent RL techniques. This means that they can update themselves according to any changes in their environment.

In this paper we introduce our model, which allows an RTS game engine to update all other engines with the game's reaction against new, surprising, un-programmed opponent scenarios that face the computer player. We believe this will relieve game players from downloading a new engine of the game and losing their saved episodes. But we first needed to discuss the existing case representations and whether we can use them or will need one of our own.

3.1 Proposed Case Representation

Many case representations depend on the game or on the researcher's point of view. We here tried to make use of the former representations to get a case representation that suits our model and could be applied in different RTS games. For example, Aha et al. (2005) [2] defined a case C as a four-tuple:

C = [BuildingState, Description, Tactic, Performance]

where we can consider the BuildingState as a part of the Description. We can also notice that they did not mention the goal of the case, while it is an important factor in case retrieval. From all of this we proposed a case representation of our own to use through our model. In our representation, Problems to avoid is a list of problems to avoid, and Performance is a value in [0, 1] that reflects the utility of choosing that tactic for that state.

Our case representation concentrates on making case retrieval more accurate and easier, depending first on the case state features, then on the goal and performance. We here used the famous Missionaries and Cannibals problem as an example of our proposed case representation, as follows:

State = <M, C, B, P>
State = <3, 3, 1, 2>

where M is the no. of missionaries, C the no. of cannibals, B the no. of boats, and P the no. of people a boat can accommodate at a time.

Actions:
Move (D1, D2)
Return (D1, 0)
Move (S1, S2)
Return (S1, D1)
Move (S1, S3)
Return (D2, 0)
Move (D2, D1)
Return (D2, 0)
Move (D2, D3)

Goals: Cross the river
Problems to avoid: Cannibals eat Missionaries
Performance: Less time to solve the problem equals higher performance.

3.2 Real-See Model

We supposed that n sets of cases from N engines were sent to the receiver engine (figure 2). Each set consists of Mn cases:

case11 case12 . . . case1m1
case21 case22 . . . case2m2
. . .
(n sets of cases)
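A case of this form can be sketched as a small data structure. The class name, field names, and the performance value below are illustrative choices of ours, not identifiers from the paper:

```python
from dataclasses import dataclass

@dataclass
class Case:
    """One exchangeable experience: state features plus goal, actions,
    problems to avoid, and a performance value in [0, 1]."""
    state: dict                 # named state features
    goal: list                  # goal predicates
    actions: list               # ordered action strings
    problems_to_avoid: list     # situations the case must not reach
    performance: float          # utility of the case, in [0, 1]

# The Missionaries and Cannibals example from the text, State = <3, 3, 1, 2>;
# the performance value here is a made-up placeholder:
mc_case = Case(
    state={"M": 3, "C": 3, "B": 1, "P": 2},
    goal=["cross_the_river"],
    actions=["Move(D1, D2)", "Return(D1, 0)", "Move(S1, S2)"],
    problems_to_avoid=["cannibals_eat_missionaries"],
    performance=0.8,
)
```

Keeping the state features, the goal, and the performance as separate fields is what lets the comparator below treat them in that order during retrieval.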
(Figure 2 shows the N engines' case sets flowing into the receiver engine, where the case comparator checks each case against the receiver engine case-base.)

The case comparator compares the received cases one by one till it finishes all the M*N cases. The cases that did not have a match in the case-base will be stored in the receiver engine case-base, and the rest will be deleted.

In the Real-See model the case comparator plays the major role, as it does all the job. In the next section we will discuss the case comparator in detail.

3.2.1 Case Comparator

The case comparator compares each received case with the cases in the case-base; in order to do that, we will need to make use of similarity metrics. If the case comparator does not find a similar case to the received one, it will add it to the case-base, but if it finds a similar one, it will act according to the similarity degree.

Given a received case P, the matching of case P and a retrieved case C is guided by the similarity metric in equation (1):

similarity(P, C) = Σ(i=1..k) wi·sim(pi, ci) / Σ(i=1..k) wi    (1)

where wi is the weight of feature i, sim is the similarity function of features, and pi and ci are the values of feature i in the target and retrieved cases respectively.

But before calculating the similarity of cases P and C, we first needed to calculate the individual feature similarity, sim(pi, ci). The similarity of feature i in cases P and C is related to the distance between them. Many equations were used to calculate the feature similarity depending on the distance, for example:

o Euclidean distance [10][18]

d(P, C) = sqrt( Σ(i=1..k) (pi - ci)^2 )    (2)

o Hamming distance [10][18]

H(P, C) = k - Σ(i=1..k) pi·ci - Σ(i=1..k) (1 - pi)(1 - ci)    (3)

o Absolute distance [18]

d(P, C) = Σ(i=1..k) |pi - ci|    (4)

Here we chose to use the absolute distance divided by the feature values range, especially since we are dealing with features of different ranges:

di(P, C) = |pi - ci| / range(i)    (5)

Distance for symbolic features:

di(P, C) = 0 if pi = ci, 1 otherwise    (6)

From equations (5) and (6) we can say that

Sim(pi, ci) = 1 - di, where 0 <= Sim(pi, ci) <= 1    (7)

The next step is to calculate the weight of feature i. The feature weight may be calculated in many ways, for example the distance inverse, but this way will be a problem if the feature values are equal, which means that the distance will be zero. Here we used the inverse of the squared standard deviation, as the standard deviation represents a sample of the whole feature values population and is a measure of how widely values are dispersed from the average value. In the case of feature values equality, the weight is discarded and the feature similarity value will equal 1. We here calculated the weight using equation (8):

wi = 1/(σi)^2    (8)

The last step is to calculate the similarity of case P and case C using equation (1), and to check its value against a threshold value according to our Real-See algorithm in figure (3).

In figure (3), a received case P is retained as long as its similarity value relative to case C is not above the threshold. As the result we get a set Q of retained cases:

Q = {P ∈ Mn | Sim(P, C) <= threshold}

where Mn is the received cases and Sim(P, C) denotes the degree of similarity of C with respect to P. The elements in Q, along with their similarity scores, are delivered to the receiver engine case-base to be retained.
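A minimal sketch of the comparator's arithmetic follows. The function names are ours, and the per-feature value ranges and the sample of historical feature values are assumptions of the sketch; equations (1) and (5)-(8) are from the text:

```python
import statistics

def feature_distance(p, c, value_range=None):
    """Eq. (5)/(6): absolute distance divided by the feature's value range
    for numeric features; exact-match 0/1 distance for symbolic features."""
    if isinstance(p, str) or isinstance(c, str) or not value_range:
        return 0.0 if p == c else 1.0                      # eq. (6)
    return abs(p - c) / value_range                        # eq. (5)

def feature_similarity(p, c, value_range=None):
    return 1.0 - feature_distance(p, c, value_range)       # eq. (7)

def feature_weight(sample_values):
    """Eq. (8): w_i = 1 / sigma_i^2 over a sample of the feature's values.
    When all sampled values are equal (sigma = 0) the feature is discarded."""
    sigma = statistics.stdev(sample_values)
    return None if sigma == 0 else 1.0 / sigma ** 2

def case_similarity(P, C, ranges, samples):
    """Eq. (1): weighted mean of the per-feature similarities,
    skipping discarded features."""
    num = den = 0.0
    for p, c, rng, sample in zip(P, C, ranges, samples):
        w = feature_weight(sample)
        if w is None:                                      # discarded feature
            continue
        num += w * feature_similarity(p, c, rng)
        den += w
    return num / den
```

For instance, the Armor feature of Example 1 (values 30 and 15 over an assumed range of 45) gives a feature similarity of 1 - 15/45 ≈ 0.667, matching the corresponding row of Table 2.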
But what happens to the cases whose similarity value relative to C is above the threshold? Shall we decline them, or what? Here in our model we tried to make use of the case goal.

Real-See Algorithm

For i = 1 to n
  For j = 1 to Mn
    similarity(Cij, C) = Σ(x=1..y) wx·sim((Cij)x, Cx) / Σ(x=1..y) wx
    If similarity(Cij, C) <= threshold Then case stored
    Else If G(Cij) not similar to G(C) Then case stored
    Else If G(Cij) similar to G(C) And P(Cij) > P(C) Then case stored
    Else Cij is discarded
    Endif
  Endfor
Endfor

Fig. 3 Real-See algorithm

Till now the similarity metrics depend on the case description. In our model this means declining cases similar to the retrieved ones. So we tried to apply the similarity metrics to the case goals: if case P's similarity value relative to case C is above the threshold (0.5), the case comparator will compare case P and case C goals. But to calculate the goal similarity we first need to check the similarity of its parts. If there is a similarity, we can express it by one, else by zero. The calculated similarities are then applied in equation (9):

MoS = (1/R) Σ(i=1..R) Bi    (9)

where Bi represents predicate i of the goal, R is the number of predicates used in the similarity calculation, and MoS represents the arithmetic mean of the predicate similarities, which we used as the goal similarity. We can then evaluate the mean of similarities (MoS) using equation (10):

Goal similarity = Not Similar, if MoS <= threshold
                  Similar, if MoS > threshold    (10)

If it found a goal match and case P's performance is greater than case C's performance, case P will be stored; otherwise case P is declined. But if there was no goal match, case P will be stored. We will explain it clearly in the next section with real picked cases.

4. Testing Real-See Model on Real Cases

Fig. 4 Glest 3D RTS game

For more explanation we needed to test the Real-See algorithm on some real cases. We selected a 3D RTS game called Glest (figure 4) to pick some of its cases for our algorithm testing.

Example 1: We first chose a stored case called "the three towers" (case C) to compare it with a received case called "defend the castle" (case P). In the next steps we calculated the similarity between the two cases using 14 features (table 1) to represent each case.

The first step is to calculate feature i similarity. So we calculated the absolute distance using equations (5) and (7).

The second step is to calculate feature i weight using equation (8).

The third step is to calculate the similarity between cases P and C using equation (1).

We can notice from table (2) that the features with value zero in both cases are discarded and do not contribute to the calculation, as they have no effect on the similarity degree, which can finally be calculated as following:
Table 1: The data set of Three_Towers and Destroy_Villag cases representing 14 features.

Features          Three_Towers (Case C)   Defend the Castle (Case P)
Resources: Gold   200                     500
Worker            0                       0
Swordman          0                       0
Archer            3                       2
Guard             0                       0
Cow               0                       0
battle_machine    0                       1
Armor             30                      15
Sight value       5                       2
Worker            3                       1
Swordman          1                       2
Archer            2                       3
Guard             1                       2
Cow               0                       0
battle_machine    0                       0
Armor             30                      20
Sight value       5                       15

Table 2: Three_Towers and Destroy_Villag cases similarity calculations

C     P     di(P,C)    Sim(pi,ci)   wi           wi*Sim(pi,ci)
200   500   0.429      0.571        2.222E-05    1.270E-05
200   500   0.429      0.571        2.222E-05    1.270E-05
0     0     Discarded  Discarded    Discarded    Discarded
3     2     0.2        0.8          2            1.6
0     0     Discarded  Discarded    Discarded    Discarded
0     0     Discarded  Discarded    Discarded    Discarded
0     1     1          0            2            0
30    15    0.333      0.667        0.009        0.00593
5     2     0.429      0.571        0.222        0.12698
Sum                                 6.732        3.066

Similarity (P, C) = 3.066/6.732 = 0.455
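The store-or-discard decision of the Real-See algorithm in Fig. 3 can be sketched as follows; the function and argument names are our phrasing, and the default threshold of 0.5 is the value quoted in the text:

```python
def real_see_decide(similarity, goals_similar, perf_received, perf_stored,
                    threshold=0.5):
    """Fig. 3: store a received case when it is dissimilar to the retrieved
    case, when its goal differs, or when the goal matches but the received
    case performs better; otherwise discard it."""
    if similarity <= threshold:      # dissimilar description: new experience
        return "store"
    if not goals_similar:            # similar description, different goal
        return "store"
    if perf_received > perf_stored:  # same goal, better performance
        return "store"
    return "discard"
```

The three examples in this section each exercise one branch: a dissimilar description, a similar description with a different goal, and a similar description with a matching goal decided by performance.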
Table 4: Three_Towers and Tower_of_Souls cases similarity calculation

C     P     di(P,C)    Sim(pi,ci)   wi           wi*Sim(pi,ci)
200   3000  0.875      0.125        2.551E-07    3.189E-08
200   300   0.2        0.8          0.0002       0.0002
250   1000  0.6        0.4          3.555E-06    1.422E-06
0     60    1          0            0.0006       0
0     2     1          0            0.5          0
2     1     0.333      0.667        2            1.333
3     1     0.5        0.5          2            1.6
1     2     0.333      0.667        2            1.333
2     3     0.2        0.8          2            1.6
1     2     0          1            2            1.333
0     0     discarded  discarded    discarded    discarded
0     0     discarded  discarded    discarded    discarded
30    20    0.2        0.8          0.02         0.016
5     15    0.5        0.5          0.02         0.01
Sum                                 10.541       7.226

Similarity (P, C) = 7.226/10.541 = 0.686

From table (4) we can see that Sim(P, C) > 0.5, which means that the received case (tower_of_souls) and the stored one (the three towers) are very similar, and that the received case will not be stored in the receiver engine case-base till the goal and performance similarities are checked according to our algorithm, as follows.

o The three towers goal is:
winner (player) :- Objective (destroy_towers), towercount (0).

After that, using equation (9), the MoS value is calculated and then evaluated according to equation (10):

MoS = (B1 + B2 + B3)/3 = 1/3

Goal similarity = Not Similar, if MoS <= threshold
                  Similar, if MoS > threshold    (10)

Finally, from equation (10) we found out that the three towers case goal is not similar to the tower_of_souls case goal, but, as we mentioned before, the three towers case is similar to the tower_of_souls case. So from all the previous, and according to the Real-See algorithm, we can conclude that the tower_of_souls case will be stored in the receiver engine case-base.

Example 3: to test the last case of the Real-See algorithm, the performance value comparison, we used a stored case called "duel" and a new received case called "tough_battle" in table (6).

Table 6: The data set of duel and tough_battle cases representing 14 features.

Features          Duel (Case C)   Tough_battle (Case P)
Resources: Gold   2000            500
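The goal comparison of equations (9) and (10) can be sketched as below. Representing each goal as a list of predicates and matching them by membership is our simplification, and the threshold value is an assumption:

```python
def mean_of_similarities(goal_p, goal_c):
    """Eq. (9): MoS is the arithmetic mean of the 0/1 predicate
    similarities; B_i is 1 when predicate i appears in both goals."""
    r = max(len(goal_p), len(goal_c))   # predicates used in the calculation
    matches = sum(1 for predicate in goal_p if predicate in goal_c)
    return matches / r

def goal_similarity(goal_p, goal_c, threshold=0.5):
    """Eq. (10): the goals are 'Similar' only when MoS exceeds the threshold."""
    mos = mean_of_similarities(goal_p, goal_c)
    return "Similar" if mos > threshold else "Not Similar"
```

With three predicates of which one matches, MoS = 1/3 and the goals are judged Not Similar, as in Example 2.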
Table 7: Duel and Tough_battle cases similarity calculations
Table 8: Goal similarity calculation
Future Work

In the future we plan to pursue several researches on case-based situation assessment depending on the Real-See algorithm, and whether it helps to enlarge the case-base or to shrink it. We will also try to introduce an implementation of the Real-See algorithm in both the Glest and Wargus open-source real-time strategy games.

References

[1] Aamodt A. and Plaza E., "Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches", AICom - Artificial Intelligence Communications, IOS Press, Vol. 7, No. 1, 1994, pp. 39-59.
[2] Aha D., Molineaux M. and Ponsen M., "Learning to win: Case-based plan selection in a real-time strategy game", in Proceedings of the Sixth International Conference on Case-Based Reasoning, Springer, 2005, pp. 5-20.
[3] Balla R. and Fern A., "UCT for Tactical Assault Planning in Real-Time Strategy Games", in Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, California, USA, 2009, pp. 40-45.
[8] Hammond K., "Case-Based Planning: A Framework for Planning from Experience", Cognitive Science, Ablex Publishing, Norwood, NJ, Vol. 14, 1994.
[9] Kok E., "Adaptive reinforcement learning agents in RTS games", University of Utrecht, The Netherlands, 2008.
[10] Li M., et al., "The Similarity Metric", IEEE Transactions on Information Theory, Aug. 2004.
[11] Marthi B., Russell S. and Latham D., "Writing Stratagus-Playing Agents in Concurrent ALisp", in Proceedings of the Workshop on Reasoning, Representation and Learning in Computer Games (IJCAI-05), 2005, pp. 67-71.
[12] Ponsen M., Muñoz-Avila H., Spronck P. and Aha D., "Automatically Generating Game Tactics via Evolutionary Learning", AI Magazine, Vol. 27, 2006, pp. 75-84.
[13] Simpson R., "A Computer Model of Case-based Reasoning in Problem Solving", PhD thesis, Georgia Institute of Technology, 1985.
[14] Ponsen M. and Spronck P., "Automatically acquiring domain knowledge for adaptive AI games using evolutionary learning", in Proceedings of the 17th Conference on Innovative Applications of Artificial Intelligence, Vol. 3, Pittsburgh, Pennsylvania, 2004, pp. 1535-1540.
[15] Santiago O., Mishra K., Sugandh N. and Ram A., "Case-based planning and execution for real-time strategy games", in Proceedings of ICCBR-2007, 2007, pp. 164-178.
[16] Tracy B., "Game Intelligence: AI Plays Along", Computer Power User, Vol. 2, Issue 1, 2002, pp. 56-60.
[17] Ulam P., Goel A. and Jones J., "Reflection in Action: Model-Based Self-Adaptation in Game Playing Agents", in D. Fu and J. Orkin (Eds.), Challenges in Game Artificial Intelligence: Papers from the AAAI Workshop (TR WS-04-04), San Jose, CA: AAAI Press, 2004.
[18] www.wikipedia.org

Shahenda Sarhan is a PhD student and an assistant lecturer at the Computer Science Department in Mansoura University. Her subject is in Real-Time Strategy games.
2 Department of Applied Mathematics and Informatics, Faculty of Science and Techniques, University of Cadi Ayad, Marrakech, Morocco
which identify, among the input parameters, those which contribute most to the variability of the 4 model outputs: net radiation, latent heat, sensible heat and soil heat.

3.1 One-Factor-At-A-Time (OAT) method

OAT is the simplest technique of the Screening Designs (SD) method to carry out a sensitivity analysis. It consists in identifying the most sensitive parameters among those that may affect the model output (Nearing et al., 1990). SD is efficient when a model has several input parameters (Jolicoeur, 2002). To assess the impact of errors or of a variation of 10% around the base input value, a sensitivity analysis of the TSEB model was performed by computing the relative variation rate Vr(p) and the sensitivity index SI(p). The effect of each operated modification is analyzed on 4 outputs of the model (i.e., net radiation, latent heat, sensible heat and soil heat), using the variation rate and the sensitivity index.

The relative variation rate Vr(p) and the sensitivity index SI(p) of a model flux estimate, with respect to a parameter p, can be expressed, for the sensitivity index, as

SI = ((S2 - S1)/Smoy) / ((E2 - E1)/Emoy)

where SI is the sensitivity index of a model output; E1 the initial input parameter; E2 the tested input value (e.g., 10% modification lag); Emoy the average between E1 and E2; S1 and S2 respectively the outputs corresponding to E1 and E2; and Smoy the average between S1 and S2.

This index provides a quantitative basis for expressing the sensitivity of model outputs versus the input variables. A sensitivity index equal to unity indicates that the rate of variation of a given parameter causes the same rate at the outputs, while a negative value indicates that the inputs and outputs vary in opposite directions. The greater the index in absolute value, the greater the impact a given parameter might have on a specific output.

The model outputs are treated as follows:
1- The change of each input variable by 10% produces two values for each selected output. From these two introduced input values, the greatest variation at a given output is used to calculate its sensitivity index (SI).
2- A percentage change (Favis-Mortlock, Smith, 1990) and a sensitivity index (Jolicoeur, 2002) are calculated for each output selected above by the previous formulas.

Generally, factors screening may be useful as a first step when dealing with a model containing several non-identified parameters. These parameters often have a significant effect on the model output. Screening experiments are used to identify the subset of factors that controls most of the output variability with a relatively low computational effort. This economical method tends to provide qualitative sensitivity measures (i.e., it ranks the input factors in order of importance, but does not quantify how much a given factor is more important than another).

4. Results and discussion

4.1 Overview

The input parameters used in this sensitivity analysis are the Priestley-Taylor constant (αp), the leaf area index (LAI), the fraction of the LAI that is green (fg), the fraction of the soil net radiation (cg), the canopy height (h), the mean leaf size (s), given by four times the leaf area divided by the perimeter, the surface emissivity (ε), and the surface albedo (α). After modifying alternately each model input of the datasets mentioned above by -10% and +10% around its initial value, we analyze only percentages greater than 0.5%. Such inaccuracies can derive from some variability inherent in any consideration or measurement in the field. A total of 6983 simulations was performed on the semi-hourly data set obtained from the SUDMED Project (the fall of year 2003). Each simulation performed here takes into account the change of only one input relative to the overall model parameters. The effect of each change made is analyzed on the four model outputs (i.e., sensible heat (H), latent heat (LE), net radiation (Rn) and ground conduction heat (G)).

4.2 Sensitivity of sensible heat (H)

Input parameter modifications produce variation rates from 0.7% to 32.6% on sensible heat. LAI, αp and fg are the most sensitive parameters on this output (fig. 1). They produce variations respectively of 32.59%, 23.55% and 23.55%. Sensible heat shows sensitivity indices respectively of -3.4 to -2. It is most sensitive to LAI, with -3.4 as a negative sensitivity index. This analysis indicates that high uncertainties on these inputs may seriously falsify the results of sensible heat. Indeed, it is clear that when vegetation is developing, LAI is increasing and the sensible heat is decreasing (i.e., negative sensitivity index), because vegetation plays the role of a shock-absorber and thus considerably reduces the soil sensible heat, with a variation rate of 100% (SI = -21), and also the soil heat stock (14.4% with SI = -1.28 (fig. 1)). However, this case occurs during
the development phase of the olive trees (e.g., during July, August and September). That is why LAI is strongly related to the development phase and has an important influence on sensible heat, especially its soil component. For the case of the olive, LAI does not vary too much across seasons. Sensible heat is also sensitive to fg and αp, with 23.59% of variation (SI = -2). These parameters considerably reduce the canopy sensible heat. fg represents the green fraction of vegetation, and its increase plays in the opposite direction to total sensible heat, especially in the soil contribution.

4.3 Sensitivity of latent heat (LE)

Figure 2 indicates that LAI, fg and αp are the important inputs for latent heat. LAI produces a variation rate of 8.13%; fg and αp are at 6.67%, with sensitivity indices respectively of 0.74 and 0.65. We observe that the sensitivity index is negative for the emissivity, the albedo, cg and s. It means that these parameters vary inversely to the total latent heat. Note well that LAI is also the most sensitive factor on this output. We have the same ascertainment as for total sensible heat, which varies inversely. In TSEB, LAI plays an important role in the fractional vegetation cover. Its sensitivity index is positive, which confirms a good influence on evapotranspiration, both evolving in the same direction. However, any doubtful measurements or uncertainties in the LAI index cause some errors in latent heat. Moreover, fg and αp have the same influence on evapotranspiration as LAI.

4.4 Sensitivity of net radiation (Rn)

Net radiation undergoes only the influence of both the surface emissivity and the albedo, with variation rates respectively of 2.9% and 1.6% and negative sensitivities of -0.29 and -0.15. It indicates that these parameters evolve inversely to net radiation. Net radiation depends also on climatic variables such as the long wave, the short wave and the radiometric temperature. However, inaccuracies always intrude on this output, so errors can occur on these two parameters. In effect, an uncertainty of 10% on the albedo and the emissivity causes only a variation of 1 to 3% at the output (Fig. 3).

4.5 Sensitivity of soil conduction heat (G)

The entries LAI, ε and α affect G respectively with variation rates of 14.4%, 2.9% and 1.6%, with negative sensitivity indices of respectively -1.28, -0.29 and -0.16 (Fig. 4). LAI is the most influential parameter on G, which is normal and consistent with what we saw previously, because the index indicates the leaf area cover and plays the role of a shock-absorber. The sensitivity is negative, which means that the more the vegetation grows, the lower the radiation received by the ground is and the more the ground stock heat decreases. In fact, it seems natural that the LAI has this influence on the heat stock in the soil, because it is one of the main parameters that control the level of heat storage in the soil. Uncertainty on this entry could induce some imprecision on G, which unfortunately is poorly estimated by the model.

4.6 Comparison of changes in TSEB surface fluxes

An average variation determined for the 4 outputs considered, and for each entry, shows that LAI is the most important parameter, with an average change produced of approximately 18.4%. It is followed by αp and fg, whose variations are 15.1%. Globally, changes in the other inputs have little influence on the model outputs (Fig. 5). Comparing the results of the sensitivity analysis shows a certain similarity in the sensitivity of the four selected outputs to the variation of the model inputs by 10% from their initial value.

5. Conclusions and perspectives

The sensitivity analysis of the TSEB model has been performed using One-Factor-At-A-Time (OAT), a typical screening design, to assess the effect of all constant parameters on the model results and to classify them according to their sensitivity level. Although simple, easy to implement and computationally cheap, the OAT methods have a limitation in that they do not enable estimation of interactions among factors and usually provide a sensitivity measure that is local. The input parameters used in this sensitivity analysis are the Priestley-Taylor constant (αp), the leaf area index (LAI), the fraction of the LAI that is green (fg), the fraction of the soil net radiation (cg), the canopy height (h), the mean leaf size (s), the surface emissivity (ε), and the surface albedo (α). The input parameters LAI, αp and fg are successively (18.4%, 15.1% and 15.1%) shown to have the greatest impact on the TSEB estimates of the fluxes.

As a result, the sensitivity of the TSEB model output in H to uncertainties in LAI, αp and fg does not exceed 33% of its reference value. On the other hand, the sensitivity of the TSEB model output in LE to these parameter uncertainties was generally less than 8%, and they do not influence Rn and G, except for LAI, which contributes 14% of uncertainty to G.

The results of a sensitivity analysis should be handled with care, since the apparent sensitivity of a model to a given parameter depends on the importance, during the chosen period, of the process that affects this parameter, itself linked
to environmental constraints and to the initial conditions. Thus, in this study, the results obtained give a fairly clear idea of the most important inputs of TSEB. They can guide the user through the calibration process and also in collecting experimental data.

The estimation of the soil net radiation, Rns, can be obtained by

Rns = Rn exp(-Ks LAI / sqrt(2 cos(θs)))    (A.6)

where Ks is a constant ranging between 0.4 and 0.6 and θs is the zenithal solar angle.
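Looking back at the OAT procedure of Sections 3.1 and 4, the per-parameter sensitivity index can be sketched as follows. The model callable and parameter names are illustrative, and the index formula is the one built from E1, E2, Emoy, S1, S2 and Smoy as defined in the text:

```python
def sensitivity_index(model, params, name, delta=0.10):
    """One-factor-at-a-time: move a single input by -delta and +delta
    around its base value, then SI = ((S2 - S1)/Smoy) / ((E2 - E1)/Emoy)."""
    e1 = params[name] * (1 - delta)       # tested low value (E1)
    e2 = params[name] * (1 + delta)       # tested high value (E2)
    s1 = model({**params, name: e1})      # output corresponding to E1
    s2 = model({**params, name: e2})      # output corresponding to E2
    e_moy = (e1 + e2) / 2
    s_moy = (s1 + s2) / 2
    return ((s2 - s1) / s_moy) / ((e2 - e1) / e_moy)
```

A model output that decreases when the input increases yields a negative index, which is the sign convention used above for LAI and sensible heat.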
Appendix A

f(θ) = 1 - exp(-0.5 LAI / cos(θ))    (A.2)

The simple fractional cover (fc) is as follows:

fc = 1 - exp(-0.5 LAI)    (A.3)

LAI is the leaf area index, and the fraction of the LAI that is green (fg) is required as an input and may be obtained from knowledge of the phenology of the vegetation.

The total net radiation Rn (W m-2) is

Rn = H + LE + G    (A.4)

where H (W m-2) is the sensible heat flux, LE (W m-2) is the latent heat, and G (W m-2) is the soil heat flux. The estimation of the total net radiation Rn can be obtained by computing the net available energy, considering the rate lost by surface reflection in the short wave (0.3-2.5 μm) and emitted in the long wave (6-100 μm):

Rn = (1 - αs)·SW + εs·LW - εs·σ·Trad^4    (A.5)

where SW (W m-2) is the global incoming solar radiation, LW (W m-2) is the terrestrial infrared radiation, αs is the surface albedo, εs is the surface emissivity, σ is the Stefan-Boltzmann constant, and Trad (K) is the radiometric surface temperature.

LEc = αp·fg·(Δ/(Δ + γ))·Rnc    (A.9)

where αp is the Priestley-Taylor constant, which is initially set to 1.26 (Norman et al 1995; Agam et al 2010), fg is the fraction of the LAI that is green, Δ is the slope of the saturation vapor pressure versus temperature curve, and γ is the psychrometric constant (e.g., 0.066 kPa °C-1). If no information is available on fg, then it is assumed to be near unity. As will become apparent later, (A.9) is only an initial approximation of the canopy latent heat. If in any case LEc <= 0, then LEc is set to zero (i.e., no condensation under daytime convective conditions).

The sum of the contributions of the soil and canopy net radiation, total latent and sensible heat is according to the following equations:

Rns = Hs + LEs + G    (A.10)

Rnc = Hc + LEc    (A.11)

LEt = LEc + LEs    (A.12)

where the subscripts s and c designate soil and canopy. The TSEB model considers the contributions from the soil and canopy separately, and it uses a few additional parameters to solve for the total sensible heat Ht, which is the sum of the contribution of the soil Hs and of the canopy Hc, according to the following equations
(A.14)
(A.20)
Where Ua is the wind speed and is the diabatic
(A.15) correction for momentum.
Where (Kg.m-3) is the air density, Cp is the specific heat The Rs (sm-1) is the soil resistance to the heat transfer
of air (JKg-1 K-1), Ta (K) is the air temperature at certain (Goudrian, 1977; Norman et al 1995; Sauer et al 1995;
reference height, which satisfies the bulk resistance Kustas et al, 1999), between the soil surface and a height
formulation for sensible heat transport (Kustas et al, 2007). representing the canopy, and then a reasonable simplified
Ra (sm-) is the aerodynamic resistance to heat transport equation is:
across the temperature difference that can be evaluated by
the following equation (Brutsaert, 1982):
(A.21)
(A.18) (A.23)
The term is dimensionless variable relating observation The mean leaf size (s) is given by four times the leaf area
height Z, to Monin-Obukhov stability Lmo. divided by the perimeter.
Lmo is approximately the height at which aerodynamic is the wind speed at the top of the canopy, given by:
shear, or mechanical, energy is equal to buoyancy energy
(i.e: convection caused by an air density gradient). It is
determined from
(A.24)
The TSEB model is run with the use of ground thermal To complete the solution of the soil heat flux components,
remote sensing and meteorological data of Agdal site the ground stock heat flux can be computed as a fraction
during 2003. Some model constant parameters are of net radiation at the soil surface (A.8).
supposed invariable along time such as the Priestly-Taylor Applying energy balance for the two source flux
constant p, albedo, emissivity, leaf area index (LAI), the components resolves the surface fluxes, which cannot be
fraction of the LAI that is green (fg) , leaf size (s), the reached directly because of the interdependence between
vegetation height and a constant fraction (cg) of the net atmospheric stability corrections, near surface wind speeds,
radiation at the soil surface. These considerations are and surface resistances (A.16-17). In these equations, the
certainly some consequences on model results according and H depend upon the
stability correction factors
to seasons. The Priestly-Taylor constant p is fixed to surface energy flux components H and LE via the Monin-
1.26 (McNaughton and Spriggs 1987). The albedo, value Obukhov roughness length Lmo.
of 0.11 is an annual averaged measured with CNR1, and a TSEB computation for solving the surface energy balance
surface emissivity of 0.98, the leaf area index (LAI) is by ten primary unknowns and ten associated equations
equal to 3 (Ezzahar et al, 2007). The fraction of LAI (fg) (Table.1), needs an iterative solution process by setting a
that is green is fixed to 90% of vegetation (i.e: 10% of large negative value to Lmo (i.e: in highly unstable
vegetation could be considered no active). The mean leaf atmospheric conditions). This permits an initial set of
size (s), is given by four times the leaf area divided by the stability correction factors M and H to be computed.
perimeter (s=0.01). The average height of the olive trees is 6 meters. The fraction of the net radiation at the soil surface is fixed to cg=0.35.

Sensible and latent heat flux components for soil and vegetation are computed by TSEB only under atmospheric surface layer instability. Note that the storage of heat within the canopy and the energy used for photosynthesis are considered negligible for the instantaneous measurements. The total computed heat flux components then follow from equations (A.5-8).

The canopy heat fluxes are solved by first estimating the canopy latent heat flux from the Priestley-Taylor relation (A.9), which provides an initial estimate of the canopy fluxes and can be overridden if the vegetation is under stress (Norman et al., 1995). Outside the positive-latent-heat situation, two cases of stress occur, when the computed value for canopy (LEc) or soil (LEs) latent heat becomes negative, which is an unrealistic condition. In the first case, the normal evaluation procedure is overridden by setting LEc to zero, and the remaining flux components are balanced by (A.1, 10, 11, 13, 15). In the second case, LEs is recomputed using a specified soil Bowen ratio, determined by β = Hs/LEs, and the flux components are then balanced by (A.1, 10, 11, 13, 15).

In order to solve (A.15), additional computations are needed to determine the soil temperature and the resistance terms Rah and Rs; as will become apparent, they must be solved iteratively. Soil temperature is determined from two equations: one relating the observed radiometric temperature to the soil and vegetation canopy temperatures, and another determining the vegetation canopy temperature. The composite temperature is related to the soil and canopy temperatures by (A.1). The resistance components are determined from (A.16) for Rah and from the following equation (Sauer et al., 1995) for Rs (A.18). The iteration is repeated until Lmo converges.

Acknowledgments

This study falls within the framework of research between the University of Cadi Ayyad Gueliz, Marrakech, Morocco, and the Department of the National Service of Meteorology, Morocco (DMN, Morocco). The first author is very grateful for the encouragement of all his family, especially Mrs F. Bent Ahmed, his mother, Mrs K. Aglou, his wife, and Mr Mustapha Mouida. Finally, the authors gratefully acknowledge the evaluation and judgments of the reviewers and the editor.

References
[1] Agam et al., "Application of the Priestley-Taylor Approach in a Two Source Surface Energy Balance Model", Am. Meteor. Soc., Journal of Hydrometeorology, Vol. 11, 2010, pp. 185-198.
[2] Becker, F., and Li, Z.L., "Temperature independent spectral indices in thermal infrared bands", Remote Sensing of Environment, Vol. 32, 1990, pp. 17-33.
[3] Brutsaert, W., Evaporation Into The Atmosphere, D. Reidel, Dordrecht, 1982.
[4] Choudhury, B.J., Idso, S.B., and Reginato, R.J., "Analysis of an empirical model for soil heat flux under a growing wheat crop for estimating evaporation by an infrared-temperature based energy balance equation", Agric. For. Meteorol., Vol. 39, pp. 283-297.
[5] Campbell, G.S., and Norman, J.M., An Introduction to Environmental Biophysics, 2nd ed., New York: Springer-Verlag, 286 pp., 1998.
[6] Ezzahar, J., "Spatialisation des flux d'énergie et de masse à l'interface Biosphère-Atmosphère dans les régions semi-arides en utilisant la méthode de scintillation", Ph.D. thesis, University of Cadi Ayyad, Marrakech, Morocco, 2007.
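The two stress corrections described above can be summarized in a short sketch. This is our own simplified illustration, not the paper's code: the variable names are invented, and in the second case we simply take the dry-soil limit instead of recomputing LEs from the soil Bowen ratio β = Hs/LEs as the paper does.

```python
def partition_fluxes(rn_c, rn_s, g, le_c_pt, h_s):
    """Apply the two TSEB stress corrections (simplified illustration).

    rn_c, rn_s : canopy and soil net radiation (W/m^2)
    g          : soil heat flux (W/m^2)
    le_c_pt    : initial canopy latent heat from the Priestley-Taylor relation
    h_s        : soil sensible heat flux from the resistance network
    Returns (LEc, LEs, Hc, Hs) with non-negative latent heat enforced.
    """
    le_c = le_c_pt
    h_c = rn_c - le_c                  # canopy energy balance
    le_s = rn_s - g - h_s              # soil latent heat as a residual

    if le_c < 0:                       # case 1: stressed vegetation
        le_c = 0.0                     # override the Priestley-Taylor estimate
        h_c = rn_c - le_c              # rebalance the canopy budget
    if le_s < 0:                       # case 2: unrealistic soil evaporation
        le_s = 0.0                     # simplification: dry-soil limit
        h_s = rn_s - g                 # all remaining soil energy goes to Hs
    return le_c, le_s, h_c, h_s
```

The function mirrors the order of operations in the text: canopy first, then soil, each rebalanced after an override.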
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 375
[7] Favis-Mortlock, D.T., Smith, F.R., "A sensitivity analysis of EPIC", Documentation, U.S. Department of Agriculture, Agric. Tech. Bull. 1768, 1990, pp. 178-190.
[8] Garratt et al., "Momentum, heat and water vapor transfer to and from natural and artificial surfaces", Q. J. R. Meteorol. Soc., 99, pp. 680-687.
[9] Goudriaan, J., "Crop Micrometeorology: A Simulation Study", Center for Agricultural Publications and Documentation, Wageningen, 1977.
[10] Jacob, F., et al., "Using airborne vis-NIR-TIR data and a surface energy balance model to map evapotranspiration at high spatial resolution", in Remote Sensing and Hydrology, IAHS-AISH, 2000.
[11] Jolicoeur, "Screening designs sensitivity of a nitrate leaching model (ANIMO) using a one-at-a-time method", USA: State University of New York at Binghamton, 14 p., 2002.
[12] Kustas et al., "A Two-Source Energy Balance Approach Using Directional Radiometric Temperature Observations for Sparse Canopy Covered Surfaces", Agronomy Journal, 92, 1999, pp. 847-854.
[13] Kustas et al., "Utility of radiometric-aerodynamic temperature relations for heat flux estimation", Bound.-Lay. Meteorol., 122, pp. 167-187, 2007.
[14] McNaughton, K.G., and Spriggs, T.W., "An evaluation of the Priestley and Taylor equation and the complementary relationship using results from a mixed-layer model of the convective boundary layer", T. A. Black, D. L., 1987, pp. 89-104.
[15] Nearing, A.M., Deer-Ascough, L.A., Laflen, J.M., "Sensitivity analysis of the WEPP hillslope profile erosion model", Trans. ASAE, 33(3), 1990, pp. 839-849.
[16] Norman, J.M., Kustas, W.P., and Humes, K.S., "A two-source approach for estimating soil and vegetation energy fluxes in observations of directional radiometric surface temperature", Agric. For. Meteorol., 77, pp. 263-293.
[17] Norman et al., "Source approach for estimating soil and vegetation energy fluxes in observations of directional radiometric surface temperature", Agricultural and Forest Meteorology, 77, 1995, pp. 263-293.
[18] Paulson, C.A., "The mathematical representation of wind speed and temperature profiles in the unstable atmospheric surface layer", J. Appl. Meteorol., 9, 1970, pp. 857-861.
[19] Priestley, C.H.B., and Taylor, R.J., "On the assessment of surface heat flux and evaporation using large-scale parameters", Mon. Weather Rev., 100, 1972, pp. 81-92.
[20] Rody Félix, Dimitri Xanthoulis, "Analyse de sensibilité du modèle mathématique Erosion Productivity Impact Calculator (EPIC) par l'approche One-Factor-At-A-Time (OAT)", 2005.
[21] Ratto, M., Lodi, G., Costa, P., "Sensitivity analysis of a fixed bed gas-solid TSA: the problem of design with uncertain models", Sep. Technol., 6, 1996, pp. 235-245.
[22] Saltelli et al., Sensitivity Analysis, New York: John Wiley & Sons, 2000.
[23] Sauer et al., "Measurement of heat and vapor transfer at the soil surface beneath a maize canopy using source plates", Agric. For. Meteorol., 75, 1995, pp. 161-189.
[24] Shuttleworth, W.J., and Wallace, J.S., "Evaporation from sparse canopies - an energy combination theory", Q. J. R. Meteorol. Soc., 111, 1985, pp. 839-855.

First Author: Engineer in meteorology from 1986 to 2004, Chief Engineer in meteorology from 2004 to 2011, and Chief of the Operating Meteorological Service from 2000 to 2011. His current research concerns the estimation of forest fire risk using water-stress mapping and meteorological data.

Second Author received his Master of Science and Ph.D. degrees from the University of Nancy I, France, in 1986 and 1989 respectively. In 2006, he received the HDR in Applied Mathematics from the University of Cadi Ayyad, Morocco. He is currently Professor of modeling and scientific computing at the Faculty of Sciences and Technology of Marrakech. His research is geared towards non-linear mathematical models and their analysis and digital-processing applications.

Figures

Fig. 1: Parameters influencing sensible heat.

Legend - SIH: Sensitivity Index of sensible heat; VH: Variation rate of sensible heat; αp: Priestley-Taylor constant; LAI: Leaf area index; fg: Fraction of the LAI that is green; Cg: Fraction of the soil net radiation; Height: Canopy height; S: Mean leaf size; ε: Surface emissivity; α: Surface albedo.
Table 1: The 11 unknown variables of the TSEB model and associated formulae

Unknown variable | Formula
Rn  | Rn = (1 - αs)·SW + εs·LW - εs·σ·Trad^4
Rns | Rns = Rn·exp(0.9·ln(1 - fc))
Rnc | Rnc = Rn - Rns
G   | G = cg·Rns
Hc  | Hc = Rnc - LEc
Hs  |
LEc |
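The first five rows of Table 1 form a cascade that can be evaluated in order. The sketch below uses the table's symbols; the numeric inputs in the usage are illustrative only, not values from the paper.

```python
import math

def net_radiation_cascade(sw, lw, t_rad, albedo_s, emis_s, fc, cg):
    """Evaluate the Table 1 cascade: Rn -> Rns -> Rnc -> G.

    sw, lw   : incoming short-wave and long-wave radiation (W/m^2)
    t_rad    : radiometric surface temperature (K)
    fc       : fractional vegetation cover (must be < 1)
    cg       : fraction of soil net radiation going into soil heat flux
    """
    sigma = 5.67e-8                                 # Stefan-Boltzmann constant
    rn = (1 - albedo_s) * sw + emis_s * lw - emis_s * sigma * t_rad ** 4
    rns = rn * math.exp(0.9 * math.log(1 - fc))     # soil net radiation
    rnc = rn - rns                                  # canopy net radiation
    g = cg * rns                                    # soil heat flux
    return rn, rns, rnc, g
```

For example, `net_radiation_cascade(800.0, 350.0, 300.0, 0.2, 0.98, 0.5, 0.35)` returns a consistent set of fluxes with Rns + Rnc = Rn and G = cg·Rns.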
2 Centre for Advanced Studies in Engineering (CASE), Islamabad, Pakistan.
Abstract. This paper presents a novel scheme for speed regulation/tracking of Switched Reluctance (SR) motors based on the Higher-Order Sliding-Mode technique. In particular, a Second-Order Sliding-Mode Controller (SOSMC) based on the Super-Twisting algorithm is developed. Owing to the peculiar structural properties of the SRM, the torque produced by each motor phase is a function of phase current as well as rotor position. More importantly, unlike many other motors, the polarity of the phase torque in SR motors is solely determined by the rotor position and is independent of the polarity of the applied voltage or phase current. The proposed controller takes advantage of this property and incorporates a commutation scheme which, at any time instant, selects for the computation of the control law only those motor phases which can contribute torque of the desired polarity at that instant. This feature helps in achieving the desired speed regulation/tracking objective in a power-efficient manner, as control efforts are applied through selected phases and counterproductive phases are left un-energized. This approach also minimizes the power loss in the motor windings, thus reducing heat generation within the motor. In order to highlight the advantages of Higher-Order Sliding-Mode controllers, a classical First-Order Sliding-Mode Controller (FOSMC) is also developed and applied to the same system. The comparison of the two schemes shows much reduced chattering in the case of the SOSMC. The performance of the proposed SOSMC for speed regulation is also compared with that of another sliding-mode speed controller published in the literature.

Keywords: SR motor, sliding mode control, higher order sliding mode control, commutation, speed regulation/tracking control

1. Introduction

Switched reluctance motors have received considerable attention among researchers due to their simple construction, rugged mechanical structure, and low-cost driver electronics. Because of the absence of any windings on the rotor, the SR motor is very suitable for operation at high speed and/or at high temperatures [2]. The SR motor is a doubly salient machine, i.e. both stator and rotor have salient poles on their laminations. Torque is developed in the motor when rotor poles align with the excited stator poles. Due to this particular nature of torque production, the phase torque is independent of the polarity of the phase current and depends only upon the relative position of the rotor poles with respect to the excited phase poles. For this reason, low-cost unipolar power converters are used to drive SR motors. This fact also leads to a very important feature peculiar to this motor: unlike most other types of electrical motors, not all phases of an SR motor can produce torque of the same polarity at any given rotor position. For example, in a 3-phase SR motor, there are certain rotor positions where only one phase can contribute torque of the desired polarity, whereas the torques produced by the other two phases are of opposite polarity. Thus, energizing all three phases would lead to a reduction in the net motor torque because of the cancellation among the phase torques.

SR motors are usually operated in magnetic saturation to increase their output torque. Magnetic saturation and mechanical saliencies in SR motors make the phase torque a highly non-linear function of phase current and rotor position. Due to advancements in control theory, many nonlinear control techniques such as artificial neural networks, feedback linearization, sliding mode, backstepping, fuzzy logic, etc. have been explored in the literature for the control of SR motors. Hajatipour and Farrokhi [3] developed an adaptive intelligent control based on Lyapunov functions. The proposed technique consists of two components: the first one approximates the load torque, the error in the moment of inertia and the coefficient of friction, while the second component drives the system output to track the desired value. The speed controller does not require exact motor parameters and is shown to be robust against disturbances and uncertainties. A neural-network torque estimator is used as a second controller in the proposed technique for torque-ripple reduction. In [4], an artificial neural network technique was also adopted in designing the speed controller of an SR motor for the regulation problem. The performance of the pro-
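The phase-selection idea above (energize only phases whose torque has the desired sign at the current rotor position) can be sketched as follows. This is an illustrative idealization, not the paper's motor model: the sign of the phase torque follows the sign of dL/dθ, and the sinusoidal inductance slope and 6/4 pole geometry below are our own assumptions.

```python
import math

N_PHASES = 3
ROTOR_POLES = 4                      # illustrative 6/4 machine geometry

def dL_dtheta(phase, theta):
    """Sign-bearing slope of an idealized phase-inductance profile."""
    shift = 2 * math.pi / (ROTOR_POLES * N_PHASES) * phase
    return math.cos(ROTOR_POLES * (theta - shift))   # sinusoidal idealization

def active_phases(theta, torque_sign):
    """Select only the phases able to produce torque of the desired sign."""
    return [k for k in range(N_PHASES)
            if math.copysign(1.0, dL_dtheta(k, theta)) == torque_sign]
```

At any rotor angle, the union of the phases selected for positive and for negative torque covers all three phases, while the control law is computed only over the selected subset.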
(10)

(11)

Substituting (3) into (11) leads to:

(12)

which can be written in the following form, suitable for the design of our proposed controllers discussed in the following sections:

(13)

Lemma 1: The following control law will stabilize the motor speed to its desired value:

(17)

Proof: Substituting Eq. (14) into Eq. (8), the following expression is obtained:

(18)

Now plugging Eq. (17) into Eq. (18), we get

(19)

and then

(20)

As is clear from Eq. (20), the derivative vanishes only when the error is zero. This ensures that the control law defined in Eq. (17) guarantees convergence of the speed to its desired value.

4.2 Case 2: Tracking Problem

The aim of the tracking problem is to follow a time-varying reference signal while minimizing the tracking error. To prove that the proposed control law tracks the reference signal, consider Lemma 2.

Lemma 2: The following control law will ensure that the speed follows a time-varying reference signal.
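The SOSMC named in the abstract is based on the super-twisting algorithm, whose standard form is u = -k1·|s|^(1/2)·sign(s) + v with v̇ = -k2·sign(s). A generic discrete-time sketch follows; the gains and step size are illustrative, not the paper's tuned values, and s stands for the sliding variable (e.g. the speed error).

```python
import math

def make_super_twisting(k1=1.5, k2=1.1, dt=1e-3):
    """Standard super-twisting law: drives the sliding variable s to zero
    with a continuous control signal, which is what reduces chattering
    relative to a first-order sliding-mode (pure sign) law."""
    state = {"v": 0.0}

    def control(s):
        sgn = math.copysign(1.0, s) if s != 0 else 0.0
        state["v"] += -k2 * sgn * dt          # integral of the discontinuous term
        return -k1 * math.sqrt(abs(s)) * sgn + state["v"]

    return control
```

Only the hidden integrator state switches discontinuously; the applied control u itself stays continuous in time.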
A schematic of the driver electronics used to drive the motor phases is shown in Fig. 4, which uses only one leg of the H-bridge, as our proposed controllers require only positive phase voltage. The FOSMC and SOSMC designed in Section 4, along with the commutation scheme developed in [10], are also implemented. Simulation results for each scheme are elaborated and compared to the conventional control; the latter can also be seen in [10] and [23].

Fig. 6: A close-up view of the response of both FOSMC and SOSMC to a step command. The high magnitude of the chattering signal of the FOSMC is clearly noticeable.

Fig. 7: Error plot of the speed response of FOSMC and SOSMC to a step command.

Fig. 8: A close-up view of the error plot of the speed response of FOSMC and SOSMC to a step command. The reduced error magnitude is clearly visible.

Fig. 9: Speed response and error plot of FOSMC and SOSMC to a step command for a reference speed of 20 rad/s.

Fig. 10: A close-up view of the response of both FOSMC and SOSMC to a step command in the starting and steady-state regions. The high magnitude of the chattering signal of the FOSMC is clearly noticeable.
[...] controllers developed in this paper with that of another sliding-mode controller reported in [10], for the case when the motor is commanded to move at 10 rad/s from rest. [...] motor speed more quickly to the desired value. Fig. 7 and Fig. [...] loss, i.e. only 19 kW [...] commutation scheme employed in the FOSMC. The SOSMC, with its re[...]hind this power savings (the area under the curve, which is less for the SOSMC).

Fig. 12 shows the three-phase voltages during the initial stage of steady-state operation. It is clear from these figures that in commutation-based controllers only one or two motor phases are energized at any given instant of time. The conventional design, on the other hand, energizes all three phases simultaneously and applies bipolar voltages to the motor phases. A closer [...] Despite maximum voltages being applied to the two phases, resulting in large phase currents, the torques produced by the two phases cancel each other. This results in a much reduced net motor torque as compared to [...] unipolar voltages with reduced voltage levels, thus resulting [...]

Figure 12: 3-phase voltages during the initial stage of the steady-state response.

Figure 13: 3-phase currents during the initial stage of the steady-state response.
The tracking performance of the FOSMC and SOSMC is reflected in Fig. 15, where a sinusoidal signal is selected for the comparison test. It can be seen that the SOSMC exhibits less chattering and smaller spikes than the FOSMC. Another good performance of the SOSMC is shown in Fig. 16, when the SR motor experiences a sudden change in the external load driven by the motor. The external load varies from 0 to 1.5 Nm, 0 to 2 Nm and 0 to 2.5 Nm during the intervals t=3 to t=3.1 s, t=5 to t=5.1 s and t=7 to t=7.1 s, respectively. It can be seen that despite a sudden change in external load, the SOSMC does not allow a bigger dip and keeps the motor closer to its desired speed. The results of these simulations clearly indicate that the commutation-scheme-based sliding-mode controllers developed in this paper show promising results. These results are good enough to establish the fidelity of both designs in tracking as well as regulation applications. A selection between these two schemes would depend upon a number of factors, some of which are highlighted below:
- The magnitude of error a designer can safely tolerate.
- The effect of chattering on the actuator action.
- The actuator safety while dealing with chattering in the actuation signal.
- The natural frequency of the actuator and the frequency and magnitude of chattering, etc.

7. Conclusion

First-order and second-order sliding-mode controllers have been developed for speed (regulation/tracking) control of SR motors. Contrary to the conventional sliding-mode controllers developed for SR motors, the proposed controllers use a designed commutation scheme which, at any given instant, uses only those motor phases for the computation of the control law which can produce torque of the desired polarity. The second-order sliding-mode controller (SOSMC) is shown to be more effective in terms of accuracy and reduced chattering than the first-order sliding-mode controller (FOSMC). Both controllers are shown to be power efficient and also result in reduced power loss in the motor windings, leading to reduced heat generation.

Acknowledgement

This research was financially supported by the Higher Education Commission (HEC), Pakistan.
Fig. 15: Speed response of the proposed controllers while tracking a reference signal given by 15 sin 2... The lower plot shows a close-up to elaborate the performance of both controllers.

Fig. 16: Speed response of both controllers under sudden changes in the torque load (1.5 N-m, 2.0 N-m and 2.5 N-m), with a close-up view.

References
[1] M. Rafiq, S.U. Rehman, F.R. Rehman, Q.A. Butt, "Performance Comparison of PI and Sliding Mode for Speed Control Applications of SR Motor", European Journal of Scientific Research, Vol. 50, No. 3, pp. 368-384, 2011.
[2] R. Krishnan, Switched Reluctance Motor Drives: Modeling, Simulation, Analysis, Design, and Applications, Industrial
[3] M. Hajatipour, M. Farrokhi, "Adaptive intelligent speed control of switched reluctance motors with torque ripple reduction", Energy Conversion and Management, Vol. 49, No. 5, pp. 1028-1038, 2008.
[4] E. Karakas, S. Vardarbasi, "Speed control of SR motor by self-tuning fuzzy PI controller with artificial neural network", in Sadhana - Academy Proceedings in Engineering Sciences, Vol. 32, No. 5, pp. 587-596, 2007.
[5] V.I. Utkin, H.C. Chang, "Sliding Mode Control on Electro-Mechanical Systems", Mathematical Problems in Engineering, Vol. 5, No. 4-5, pp. 451-473, 2002.
[6] G. John, A.R. Eastham, "Speed control of switched reluctance motor using sliding mode control strategy", in Proc. IEEE 13th IAS Industry Application Conference, Vol. 1, 1995, pp. 263-270.
[7] A. Forrai, Z. Biro, V. Chiorean, "Sliding Mode Control of
[9] I. Nihat, O. Veysel, "Torque ripple minimization of a switched reluctance motor by using continuous sliding mode control technique", Electric Power Systems Research, Vol. 66, No. 3, pp. 241-251, 2003.
[10] M.T. Alrifai, M. Zribi, H.S. Ramirez, "Static and dynamic sliding mode control of variable reluctance motors", International Journal of Control, Vol. 77, pp. 1171-1188, 2004.
[11] H.K. Chiang, C.H. Tseng, W.L. Hsu, "Implementation of a Sliding Mode Controller for Synchronous Reluctance Motor Drive Considering Core Losses", Journal of the Chinese Institute of Engineers, Vol. 26, No. 1, pp. 81-86, 2003.
[12] A. Tahour, A. Meroufel, H. Abid, A.A. Ghani, "Sliding Controller of Switched Reluctance Motor", Leonardo Electronic Journal of Practices and Technologies, Vol. 12, pp. 151-162, 2008.
[13] A. Tahour, H. Abid, A.A. Ghani, "Speed Control of Switched Reluctance Motor Using Fuzzy Sliding Mode", Advances in Electrical and Computer Engineering, Vol. 8, No. 1, pp. 21-25, 2008.
[14] C.A. Chen, H.K. Chiang, W.B. Lin, C.H. Tseng, "Synchronous reluctance motor speed drive using sliding mode controller based on Gaussian radial basis function neural network", in Proc. 14th International Symposium on Artificial Life and Robotics, Vol. 14, 2009, pp. 53-57.
[15] A. Levant, "Higher order sliding modes and their application for controlling uncertain processes", PhD Dissertation, 1987.
[16] Q.R. Butt, A.I. Bhatti, "Estimation of Gasoline-Engine Parameters Using Higher Order Sliding Mode", IEEE Transactions on Industrial Electronics, Vol. 55, No. 11, pp. 3891-3898, 2008.
[17] S.H. Qaiser, A.I. Bhatti, M. Iqbal, R. Samar, J. Qadir, "Model validation and higher order sliding mode controller design for a research reactor", Annals of Nuclear Energy, Vol. 36, pp. 37-45, 2009.
[18] Q.R. Butt, A.I. Bhatti, M. Iqbal, M.A. Rizvi, R. Mufti, I.H. Kazmi, "Estimation of Automotive Engine Parameters Part I: Discharge coefficient of throttle body", in Proc. IEEE 6th International Bhurban Conference on Applied Sciences and Technology, 2009, pp. 275-280.
[19] M. Iqbal, A.I. Bhatti, S.I. Ayubi, Q. Khan, "Robust Parameter Estimation of Nonlinear Systems Using Sliding-Mode Differentiator Observer", IEEE Transactions on Industrial Electronics, Vol. 58, No. 2, pp. 680-689, 2011.
[20] X. Rain, M. Hilairet, R. Talj, "Second order sliding mode current controller for the switched reluctance machine", in Proc. IEEE 36th Annual Conference of the Industrial Electronics Society, 2010, pp. 3301-3306.
[21] M. Defoort, F. Nollet, T. Floquet, W. Perruquetti, "Higher order sliding mode control of a stepper motor", in Proc. IEEE Conference on Decision & Control, Vol. 1, 2006, pp. 4002-4007.
[22] M. Defoort, F. Nollet, T. Floquet, W. Perruquetti, "A Third Order Sliding Mode Control of a Stepper Motor", IEEE Transactions on Industrial Electronics, Vol. 56, No. 9, pp. 3337-3346, 2009.
[23] M. Rashed, K.B. Goh, M.W. Dunnigan, P.F.A. McConnell, A.F. Stronach, B.W. Williams, "Sensorless second-order sliding-mode speed control of a voltage-fed induction-motor drive using nonlinear state feedback", in Proc. IEE Electric Power Applications, Vol. 152, No. 6, 2005, pp. 1127-1136.
[24] M. Rafiq, S.A. Rehman, Q.R. Butt, A.I. Bhatti, "Power Efficient Sliding Mode Control of SR Motor for Speed Control Applications", in Proc. IEEE 13th INMIC, 2009, pp. 1-6.
[25] J.J.E. Slotine, W. Li, Applied Nonlinear Control, Prentice Hall, Englewood Cliffs, New Jersey, 1991.
[26] A. Levant, "Chattering Analysis", IEEE Transactions on Automatic Control, Vol. 55, No. 6, pp. 1380-1389, 2010.
[27] L. Derafa, L. Fridman, A. Benallegue, A. Ouldali, "Super Twisting Control Algorithm for the Four Rotors Helicopter Attitude Tracking Problem", in Proc. IEEE 11th International Workshop on Variable Structure Systems, 2010, pp. 62-67.
[28] M. Rolink, T. Boukhobza, D. Sauter, "High Order Sliding Mode Observer for Fault Actuator Estimation and Its Application to the Three Tanks Benchmark", in Proc. HAL, Vol. 1, 2006, pp. 1-7.
[29] H. Chaal, M. Jovanovic, "Second Order Sliding Mode Control of a DC Drive with Uncertain Parameters and Load Conditions", in Proc. IEEE Conference on Control and Decision, 2010, pp. 3204-3208.
[30] M. Ezzat, J.D. Leon, N. Gonzalez, A. Glumineau, "Sensorless Speed Control of Permanent Magnet Synchronous Motor using Sliding Mode Observer", in Proc. IEEE 11th International Workshop on Variable Structure Systems, 2010, pp. 227-232.

Muhammad Rafiq Mufti received the M.Sc. degree in computer science from Bahauddin Zakariya University, Multan, Pakistan, and the M.Sc. degree in computer engineering from the Centre for Advanced Studies in Engineering (CASE), Islamabad, in 1994 and 2007, respectively. Currently, he is working towards his PhD degree at Mohammad Ali Jinnah University (MAJU), Islamabad. His research interests include sliding mode control, fractional control, and neural networks.

Dr. Saeed-ur-Rehman received his PhD from Georgia Tech, Atlanta, Georgia, USA. He specializes in digital control systems and power electronics. Dr. Rehman has more than 15 years of industrial and academic experience. Currently he is a professor at the Centre for Advanced Studies in Engineering (CASE), Pakistan. He is also associated with CARE, where he has developed several embedded systems for ruggedized industrial/military applications. He has authored several papers and holds a US patent on sensorless motor control.
2 Student, Computer Science and Engineering, Anna University, Chennai 600 044, Tamilnadu, India.
3 Professor, Anna University of Technology, Tirunelveli
Abstract

Wiki, the collaborative web-authoring system, makes the Web a huge collection of information, as Wiki pages are authored by anybody all over the world. These Wiki pages, if annotated semantically, will serve as a universal pool of intellectual resources that can be read by machines too. This paper presents an analytical study and an implementation of making Wiki pages semantic by using HTML5 semantic elements and annotating them with microdata. Using these semantics, the search module is enhanced to provide accurate results.

Keywords: HTML5, Microdata, Search, Semantics, Annotation, Wiki

2.2 Ontology

An ontology is a formal, explicit specification of a shared conceptualization. A conceptualization refers to an abstract model of some phenomenon in the world that identifies the relevant concepts of that phenomenon. Explicit means that the types of concepts used and the constraints on their use are explicitly defined. Formal refers to the fact that the ontology should be machine understandable.

2.3 Wiki

[...] possibly an additional automatic or semi-automatic extraction of metadata from wiki articles to simplify the annotation process, for example by topic (EU projects) or even indirectly (meeting minutes of EU projects).

Let's start with three basic properties:
- name (your full name)
- photo (a link to a picture of you)
- url (a link to a site associated with you, like a weblog or a Google profile)

Some of these properties are URLs, others are plain text.
Each of them lends itself to a natural form of markup, even before you start thinking about microdata or vocabularies or whatnot. Imagine that you have a profile page or an "about" page. Your name is probably marked up as a heading, like an <h1> element. Your photo is probably an <img> element, since you want people to see it. And any URLs associated with your profile are probably already marked up as hyperlinks, because you want people to be able to click them. For the sake of discussion, let's say your entire profile is also wrapped in a <section> element to separate it from the rest of the page content. Thus:

<section itemscope itemtype="http://data-vocabulary.org/Person">
  <div itemprop="title" class="title">President</div>
  <div itemprop="name" class="name">Mark Pilgrim</div>
</section>

The major advantage of Microdata is its interoperability, i.e. any RDF representation of an ontology can be mapped to HTML5 microdata.

3. HTML5

HTML5 is the 5th major revision of the core language of the World Wide Web: the Hypertext Markup Language (HTML), initiated and developed mainly by the WHATWG (Web Hypertext Application Technology Working Group). Started with the aim of improving HTML in the area of Web Applications, HTML5 introduces a number of semantic elements, which include: <section>, <nav>, <article>, <aside>, <hgroup>, <header>, <footer>, <time> and <mark>.

These are some of the tags that have been introduced just to bring semantics into web pages, with no effect on the way a page is displayed. They behave much like a grouping element such as <div> as far as display is concerned. This means that if an old browser cannot recognize these tags, it will handle them much the same way a grouping element is handled. The semantic elements tell browsers and web crawlers clearly the type of content contained within the element. For instance, <time> states explicitly that the figures within the element represent a time.
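Microdata in the scoped name/value form shown above can be read back out of a page with a small parser. The sketch below uses Python's standard html.parser; it is a toy extractor for illustration, not an implementation of the full HTML5 microdata extraction algorithm (it assumes non-nested itemscopes and property values carried in element text).

```python
from html.parser import HTMLParser

class MicrodataExtractor(HTMLParser):
    """Collect itemprop name/value pairs inside an itemscope element."""

    def __init__(self):
        super().__init__()
        self.items, self._item, self._prop = [], None, None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "itemscope" in a:                       # a new item begins here
            self._item = {"type": a.get("itemtype"), "properties": {}}
            self.items.append(self._item)
        elif "itemprop" in a and self._item is not None:
            self._prop = a["itemprop"]             # value is the text content

    def handle_data(self, data):
        if self._prop and data.strip():
            self._item["properties"][self._prop] = data.strip()
            self._prop = None
```

Feeding the Person example above yields one item whose "type" is the vocabulary URL and whose "properties" dictionary maps "name" to "Mark Pilgrim" and "title" to "President".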
4. Microdata

Apart from the semantic elements, HTML5 introduces Microdata: a way of annotating web pages with semantic metadata using just DOM attributes, rather than separate XML documents. Microdata annotates the DOM with scoped name/value pairs from custom vocabularies. Anyone can define a microdata vocabulary and start embedding custom properties in their own web pages. Every microdata vocabulary defines a set of named properties. For example, a Person vocabulary could define properties like name and photo. To include a specific microdata property on your web page, you provide the property name in a specific place. Depending on where you declare the property name, microdata has rules about how to extract the property value.

Defining your own microdata vocabulary is easy. First, you need a namespace, which is just a URL. The namespace URL could actually point to a working web page, although that's not strictly required. Let's say I want to create a microdata vocabulary that describes a person. If I own the data-vocabulary.org domain, I'll use the URL http://data-vocabulary.org/Person as the namespace for my microdata vocabulary. That's an easy way to create a globally unique identifier: pick a URL on a domain that you control. In this vocabulary, I need to define some named properties.

5. Existing System

MediaWiki is a free-software wiki package written in PHP, originally for use on Wikipedia. It is now used by several other projects of the non-profit Wikimedia Foundation and by many other wikis. MediaWiki is an extremely powerful, scalable and feature-rich wiki implementation that uses PHP to process and display data stored in its MySQL database. Pages use MediaWiki's wiki-text format, so that users without specific knowledge of HTML or CSS can edit them easily.

5.1 MediaWiki Architecture

In the architecture of MediaWiki, as shown in Fig. 1, the top two layers hardly have anything to do with semantic annotation. The layers of concern are the Logic Layer and the Data Layer; the major part lies in the Logic Layer. The following figure shows the architecture of MediaWiki:
Fig. 5: Block diagram of Semantic MediaWiki

Controller is the module that dispatches requests from the user to the corresponding module. However, Squid (a proxy server) may serve the user with cached results from previous requests.

Microdata vocabulary is the actual definition of the class to which the object described in the page belongs. In HTML5 microdata this is referred to by the value of the itemtype attribute.

Editor module provides the interface through which a user can edit or create wiki pages. If the user edits an already existing page, the corresponding page is fetched from the database, the HTML markup is converted into wiki markup, and it is displayed in the editor interface. After the user edits the contents and clicks "Save page", the modified contents are given to the parser to be converted to HTML markup.

The parser-hook excerpt:

$atParaStart = preg_match('/^<p>\{__:/', $text);
$atParaEnd   = preg_match('/__\}<\/p>/', $text);
$pos = strpos($text, '{__:');
if ($pos === false)   // strict comparison: strpos() can legitimately return 0
    return $text;
$pattern = array(
    '/(?<=\{__:)(\w+)/' => 'http://data-vocabulary.org/' . '\\1' . '">',
    '/(?<=@)(\w+)(:")([^"]*)(")/' => '\\1' . '">' . '\\3' . '</span>',
);
if ($atParaStart == 1) {
    $text = preg_replace('/^<p>\{__:/', '{__:', $text);
    $pattern['/(?<=\{__:)(\w+)/'] = 'http://data-vocabulary.org/' . '\\1' . '"><p>';
}
if ($atParaEnd == 1) {
    $text = preg_replace('/__\}<\\/p>/', '__}', $text);
    $pattern['/ __\}/'] = '</p></span>';
}
$text = preg_replace(array_keys($pattern), array_values($pattern), $text);
wfProfileOut(__METHOD__);
return $text;

The wiki markup to include microdata annotation is:

{__:ItemType
@itempropName:"value"
__}

For instance, to include microdata annotation about a person, the Wiki markup is as follows:

{__:Person
@name:"Richard Stallman"
@title:"President"
@nickname:"RMS"
__}
Here, the ellipsis is used to represent some arbitrary content, as a placeholder; it is not part of the syntax. This wiki markup, on passing through the Parser module, becomes:
<span itemscope
itemtype="http://data-vocabulary.org/Person">
<span itemprop="name">Richard Stallman</span>
<span itemprop="title">President</span>
<span itemprop="nickname">RMS</span>
</span>
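The transformation performed by the parser extension can be sketched outside MediaWiki with ordinary regular expressions. The following is our own illustrative Python re-implementation, not the PHP extension itself; the helper name `to_microdata` is an assumption.

```python
import re

# Hypothetical re-implementation of the wiki-markup-to-microdata step:
# '{__:Type' opens an itemscope span, '@prop:"value"' becomes an
# itemprop span, and '__}' closes the outer span.
def to_microdata(text):
    text = re.sub(r'\{__:(\w+)',
                  r'<span itemscope itemtype="http://data-vocabulary.org/\1">',
                  text)
    text = re.sub(r'@(\w+):"([^"]*)"',
                  r'<span itemprop="\1">\2</span>',
                  text)
    return text.replace('__}', '</span>')

markup = '{__:Person @name:"Richard Stallman" @title:"President" __}'
html = to_microdata(markup)
```

Running this on the Person example yields nested spans with the itemtype and itemprop attributes shown above.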
This approach differs from the earlier proposals of semantic wikis using RDF (such as KawaWiki [4] and Rhizome [5]) in its simplicity. The user's effort to annotate a web page is reduced drastically, as the semantic HTML elements and attributes serve the purpose of their XML counterparts. Thus, to keep e-resources up to date as well as semantic without much strain, HTML5 microdata suits best.

6.2 Mediawiki Search module

The search module of Mediawiki is organised as one base class named SearchEngine plus subclasses. SearchUpdate, one of the subclasses, updates the search index in the database, whereas database-specific operations are carried out by the remaining subclasses, one for each of MySQL, MySQL4, PostgreSQL, SQLite, Oracle and IBM DB2.

Fig. 6 Class diagram of the search implementation

Flowchart: The control flow of the search module in Mediawiki is depicted in the following figure. It involves tasks such as preprocessing and normalizing the search text, replacing GET arguments with corresponding prefixes, resolving namespaces and so on.
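The class organisation described above, one shared engine plus per-database backends, can be sketched as follows. This is our own illustrative Python analogue; MediaWiki's actual classes are PHP and their interfaces differ.

```python
# Illustrative analogue of the SearchEngine hierarchy described above
# (the real MediaWiki classes are PHP; all names below are ours).
class SearchEngine:
    def normalize(self, term):
        # shared preprocessing: trim and lowercase the search text
        return term.strip().lower()

    def search(self, term):
        return self.run_query(self.normalize(term))

    def run_query(self, term):
        # the database-specific part, supplied by each backend subclass
        raise NotImplementedError

class MySQLSearch(SearchEngine):
    def run_query(self, term):
        return f"SELECT ... WHERE MATCH(text) AGAINST('{term}')"

class PostgreSQLSearch(SearchEngine):
    def run_query(self, term):
        return f"SELECT ... WHERE textvector @@ to_tsquery('{term}')"

result = MySQLSearch().search("  HTML5 Microdata ")
```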
If Google determines that this page should rank in the results, and decides that the microdata properties it originally found on that page are worth displaying, then the search result listing might look something like the one shown in the screenshot below. This output can be tested at http://www.google.com/webmasters/tools/richsnippets by entering the URL http://csmit.org/wiki/index.php?title=Richard_Stallman in the input field.

CONCLUSION AND FUTURE WORK

The project enhances Mediawiki to recognize the new Semantic Wiki markup developed and to produce microdata annotations accordingly. Thus, the huge collection of wiki pages can be made to serve as a pool of information not only for human beings, but also for machines.

This can be further extended by making the entire output HTML5, making use of its semantic elements. The search module of Mediawiki can also be enhanced to take advantage of the semantic annotations, providing accurate results with more helpful information than just an excerpt of text.

REFERENCES
[1] Vignesh Nandha Kumar K R, Pandurangan N, Vijayakumar R and Pabitha P, "Semantic Annotation of Wiki using Wiki markup for HTML5 Microdata", International Journal of Engineering Science and Technology, Vol. 2, Issue 12, pp. 7866-7873, 2010.
[2] Mohammed Kayed and Chia-Hui Chang, "FiVaTech: Page-Level Web Data Extraction from Template Pages", IEEE Transactions on Knowledge and Data Engineering, Vol. 22, No. 2, pp. 249-263, 2009.
[3] Amal Zouaq and Roger Nkambou, "Evaluating the Generation of Domain Ontologies in the Knowledge Puzzle Project", IEEE Transactions on Knowledge and Data Engineering, Vol. 21, No. 11, pp. 1559-1572, 2008.
[4] Jinhyun Ahn, Jason J. Jung, Key-Sun Choi, "Interleaving Ontology Mapping for Online Semantic Annotation on Semantic Wiki", IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2008.
[5] Kensaku Kawamoto, Yasuhiko Kitamura, and Yuri Tijerino (Kwansei Gakuin University), "KawaWiki: A Semantic Wiki Based on RDF Templates", Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Workshops (WI-IATW'06), 2006.
[6] Adam Souzis, "Building a Semantic Wiki", IEEE Intelligent Systems, Vol. 20, No. 5, September/October 2005.
[7] "Spinning the Semantic Web", edited by Dieter Fensel, James A. Hendler, Henry Lieberman and Wolfgang Wahlster, foreword by Tim Berners-Lee.
[8] Sebastian Schaffert (Salzburg Research Forschungsgesellschaft), François Bry (Ludwig-Maximilian University of Munich), Joachim Baumeister (University of Würzburg), Malte Kiesel (DFKI GmbH), "Semantic Wikis", IEEE Software, 2008.
[9] Tim Berners-Lee, James Hendler and Ora Lassila, "The Semantic Web", Scientific American, May 2001.
[10] Mark Pilgrim (developer advocate at Google, Inc.), http://diveintohtml5.org/extensibility.html, 2010.
[11] Web Hypertext Applications Technology Working Group, http://whatwg.org/specs/web-apps/current-work/multipage, September 2010.
[12] Mediawiki manual, http://www.mediawiki.org/wiki/Manual:MediaWiki_architecture, June 2010.
[13] http://www.mediawiki.org/wiki/Manual:Database_layout
2 Department of Computer Science & Engineering, Gautam Buddh Technical University, IET Lucknow, U.P. 226021, India
3 School of Computer Engineering & Information Technology, Shobhit University, Meerut, U.P. 250110, India (Professor)
4 School of Computer Engineering & Information Technology, Shobhit University, Meerut, U.P. 250110, India (Professor)
change all PINs to be the same. Also, people are often lax about the security of this information and may deliberately share it, say with a spouse or family member, or write the PIN down and even keep it with the card itself. Biometric techniques [19] may ease many of these problems: they can confirm that a person is actually present (rather than merely their token or password) without requiring the user to remember anything. In this paper, we explore how to use UML sequence diagrams to support the needs of a fingerprint ATM verification system. First, we review methods for composing sequence diagrams that support flexible fingerprint ATM modeling. Then, we show how the required information content can be represented as a finite state machine to guarantee correct, cohesive diagrams. A generic approach is described, with a supporting fingerprint ATM verification system incorporating data, state, and timing information. Finally, the more commonly discussed transaction processing model is revisited to illustrate system differences.

2. Biometric Approach to ATM Transaction through Finger Print Recognition

A biometric system is essentially a pattern recognition system that recognizes a person by determining the authenticity of a specific physiological and/or behavioral characteristic possessed by that person. An important issue in designing a practical biometric system is to determine how an individual is recognized. Depending on the application context, a biometric system may be called either a verification system or an identification system [16]:

1. A verification system authenticates a person's identity by comparing the captured biometric characteristic with her own biometric template(s) pre-stored in the system. It conducts a one-to-one comparison to determine whether the identity claimed by the individual is true. A verification system either rejects or accepts the submitted claim of identity.

2. An identification system recognizes an individual by searching the entire template database for a match. It conducts one-to-many comparisons to establish the identity of the individual. In an identification system, the system establishes a subject's identity (or fails if the subject is not enrolled in the system database) without the subject having to claim an identity.

The term authentication is also frequently used in the biometric field, sometimes as a synonym for verification; actually, in information technology language, authenticating a user means letting the system know the user's identity regardless of the mode (verification or identification).

The banking and financial sector has adopted this system wholeheartedly because of its robustness and the advantages it provides in cutting costs and making processes more streamlined. The technology started out as a novelty; however, due to exigencies in the banking sector characterized by decreasing profits, it became a necessity. The use of biometric ATMs based on fingerprint recognition technology has gone a long way in improving customer service by providing a safe and paperless banking environment. Identification of the right user by means of face recognition technology is the latest form of biometric ATM. Identification based on walking style while entering the ATM is used in gait-based ATMs.

Benefits of biometric technology: Since biometric technology can be used in place of PIN codes in ATMs, its benefits mostly accrue to rural and illiterate users who find it difficult to use the keypad of ATMs. Such people can easily put their thumbs on the pad available at ATM machines and proceed with their transactions. Biometric technology provides strong authentication, as it uses the unique features of body parts. This helps reduce the chances of fraud occurring in ATM usage. Though the use of biometric technology has high cost implications for banks, several other costs of conventional ATMs, such as re-issuance of passwords and helpdesk support, will be reduced, which is a positive factor for banks adopting biometric ATMs.

3. Terminology Used

3.1 Scenario, Sequence Diagrams, State Charts & Message Sequence Charts:

A scenario is a sequence of events that occurs during one particular execution of a system. A scenario describes a way to use a system to accomplish some function [5]. Scenarios can be expressed in many forms, textual and graphical, informal and formal. Sequence diagrams emphasize temporal ordering of events, whereas collaboration diagrams focus on the structure of interactions between objects. Each may be readily translated into the other. State chart diagrams represent the behavior of entities capable of dynamic behavior by specifying their response to the receipt of event instances.
Typically, state charts are used for describing the behavior of classes, but they may also describe the behavior of other model entities such as use-cases, actors, subsystems, operations, or methods. Message sequence charts constitute an attractive visual formalism that is widely used to capture system requirements during the early design stages in domains such as ATM transactions via fingerprint recognition [15].

3.2 Composition of Scenarios:

A crucial challenge in describing the formal verification of fingerprint ATM recognition is the composition of scenarios. In order to be adequately expressive, sequence diagrams must reflect the structures of the programs they represent. In this paper, we survey approaches to modeling execution structures and transfer of control, and select a method that lends itself to a fingerprint verification system.

Our objective is to refine a model that utilizes sequential, conditional, iterative, and concurrent execution. As many ideas exist, our task is to determine which are appropriate for a fingerprint verification system. Hsia et al. [5] discuss a process for scenario analysis that includes conditional branching. Glinz [2] includes iteration as well. Koskimies et al. [8] and Systä [13] present a tool that handles algorithmic scenario diagrams: sequence diagrams with sequential, iterative, conditional and concurrent behavior. We use elements of each, for a combined model that allows sequential, conditional, iterative, and concurrent behavior.

Another objective is to model transfer of control through sequence diagram composition. The main decision to make is where to annotate control information. One approach is to include composition information in individual diagrams.

3.3 Finite State Machines (FSM):

A finite state machine (FSM) is an abstract model of a system (physical, biological, mechanical, electronic, or software); as Matt Stallmann and Suzanne Balik put it, it is a mathematical model of a system that attempts to reduce model complexity by making simplifying assumptions. Specifically, it assumes:
1. The system being modeled can assume only a finite number of conditions, called states.
2. The system behavior within a given state is essentially identical.
3. The system resides in states for significant periods of time.
4. The system may change these conditions only in a finite number of well-defined ways, called transitions.
5. Transitions are the response of the system to events.
6. Transitions take (approximately) zero time.

3.4 Object Constraint Language (OCL):

The Object Constraint Language (OCL) is an expression language that enables one to describe constraints on object-oriented models and other object modeling artifacts. A constraint is a restriction on one or more values of (part of) an object-oriented model or system. OCL is part of the Unified Modeling Language (UML), the OMG (Object Management Group, a consortium of more than 700 companies whose goal is to provide a common framework for developing applications using object-oriented programming techniques) standard for object-oriented analysis and design. OCL has been used in a wide variety of domains, and this has led to the identification of some underspecified areas in the relationship between OCL and UML.

OCL can be used for a number of different purposes:
1. to specify invariants on classes and types in the class model
2. to specify type invariants for stereotypes
3. to describe pre- and post-conditions on operations and methods
4. to describe guards
5. as a navigation language
6. to specify constraints on operations

In OCL, UML operation semantics can be expressed using pre- and post-condition constraints. The precondition says what must be true for the operation to meaningfully execute. The postcondition expresses what is guaranteed to be true after execution completes:
1. about the return value
2. about any state changes (e.g. instance variables)

4. Proposed Formal Verification of Finger Print ATM Transaction through Real Time Constraint Notation (RTCN):

We now demonstrate the formal verification of an ATM transaction through the fingerprint verification model with the help of sequence diagrams, their corresponding finite state machines, and the corresponding Real Time Constraint Notation expressed with the help of the Object Constraint Language (OCL). There are four objects exchanging messages: the user, the ATM, the consortium, and the bank. In this example, state charts are generated for the ATM object only. The scenarios share the same initial condition.

1. Through Sequence Diagrams (SDs):
Case 1: Transaction fails due to mismatch of the Finger Print Impression (FPI) at the server-site database:
In this case, the ATM transaction fails due to a mismatch of the fingerprint against the fingerprint database (DB) at the file server site:

Figure 2 Sequence Diagram for Fingerprint Verification ATM, Case 2: Correct FPI, Successful Transaction

2. Finite State Machines corresponding to the Sequence Diagrams (SDs):
The above two cases can be represented with the help of their corresponding finite state machines as:
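As a rough sketch of such a machine (the state and event names below are our own; the paper's figure is not reproduced here), the two transaction cases can be encoded as a transition table:

```python
# Minimal FSM sketch of the two ATM cases; state/event names are ours,
# not taken from the paper's figure.
TRANSITIONS = {
    ("idle", "card_inserted"): "awaiting_fingerprint",
    ("awaiting_fingerprint", "fpi_match"): "transaction",   # Case 2
    ("awaiting_fingerprint", "fpi_mismatch"): "rejected",   # Case 1
    ("transaction", "complete"): "idle",
}

def run(events, state="idle"):
    # apply each event to the current state via the transition table
    for ev in events:
        state = TRANSITIONS[(state, ev)]
    return state

case1 = run(["card_inserted", "fpi_mismatch"])              # FPI mismatch at server DB
case2 = run(["card_inserted", "fpi_match", "complete"])     # successful transaction
```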
Acknowledgments
Vivek Singh thanks Manuj Darbari; without his essential guidance this research paper would not have been possible.

Vivek Singh is currently working as Assistant Professor in the Department of I.T. at BBDNITM, Lucknow. He has over 10 years of teaching experience. Having done his B.Tech in Computer Science & Engineering from Purvanchal University in 2001 and his M.Tech from U.P. Technical University, Lucknow in 2006, he is pursuing his Ph.D. from Shobhit University, Meerut.
Mukesh Kumar, Pothula Sujatha, P. Manikandan, Madarapu Naresh Kumar, Chetana Sidige and Sunil Kumar Verma*
Abstract
Small botnets are tough to detect and easy for the botmaster to control. A small botnet with high-speed internet connectivity is more effective and dangerous than a large one with slow connections. According to diurnal dynamics studies, only about 20 percent of computers are always online; to maximize a botnet's attack power, the botmaster should know the diurnal dynamics of her botnet. In our project we are designing a peer-to-peer bot. After infecting a system, this bot first checks the internet connection speed of the interface; if it is not up to the desired speed (i.e. 2 Mbps), the bot kills itself, because slow bots are not desired. In another scenario, the bot senses whether it is in a honeypot trap; if so, it kills itself so that the whole botnet cannot be exposed to the defender. We also suggest mitigation techniques to defend against bots with these types of properties.

Keywords: Peer-to-Peer, Botnet, Honeypot, Firepower

1. Introduction

As technology for internet security matured, internet malware and ransomware domination also increased. Users and organizations are suffering greatly from this emerging attack trend [1]. Hackers have become equipped with more advanced technologies and plan their attacks in a better-organized manner, which is more dangerous than in earlier years. Botnet crime results in e-mail spam, extortion through denial-of-service attacks, identity theft, data theft, click fraud, resource consumption, etc. A botnet is a network of systems affected by malware known as bots. These bots have one specific property that distinguishes them from other malware: they can be remotely operated and controlled. This specific property makes them weapons for various denial-of-service attacks. These bots are distributed over the internet and give the botmaster enormous cumulative bandwidth to attack any target on the internet. The concept of botnets has evolved over just the last decade; due to open-source communities, day by day new variants of bots with new stealthy protocols and infection capabilities are attacking and affecting victims.

Botnet-based attacks are becoming more powerful and dangerous; in this situation security professionals need to understand newly developed bots. For the understanding and study of bots, various works have been carried out by researchers across the world [4],[7],[8],[9],[10],[11]. Internet Relay Chat (IRC) based botnets are the first kind of bots, using a C&C (Command & Control) architecture as a centralized system. Recent years are more prominent with new technology-based bots for their Command & Control; a new type of bot, using a Peer-to-Peer topology for spreading command and control by the botmaster, is most prominent. Various works have been done to understand and create detection frameworks and systems to detect and dismantle botnets. Several detection mechanisms for IRC-based botnets have been proposed [12],[15],[16]. As Peer-to-Peer botnets are nowadays more dangerous in nature, detection frameworks have also been proposed for them [12],[13],[14]. In our understanding, new kinds of bots can be generated easily; to create and develop a mitigation system for botnets we have to understand their capabilities and activities. For this purpose a development framework for new bots should be created.
2. Related Works

Bots and botnets have been very hot topics for the last few years [10],[1]. The first ever Peer-to-Peer bot, Storm bot, had control over a million systems. In 2003 the properties of the first bots and botnets were discussed in an overview by Puri and McCarty. Today the main concentration of bot researchers is on Peer-to-Peer bots, because their sustainability and robust network topology formation make them tough to detect and dismantle. Various authors have proposed different types of Peer-to-Peer bots. Van Ruitenbeek and Sanders [2] developed a stochastic model of Peer-to-Peer botnets to understand different factors and their impact on the growth of the botnet. The botnet stochastic model was constructed in the Mobius software tool, which was designed to perform discrete event simulation and compute analytical/numerical solutions of models from various input parameters. This kind of research helps to understand the behavior of botnets and makes it easier to create mitigation systems and frameworks for these bots. In botnet technology, various works are ongoing for the detection and mitigation of Peer-to-Peer botnets.

The authors of [19] proposed an advanced hybrid peer-to-peer botnet, which concentrates on the problem of using the liability constraint of security professionals to detect installed honeypots, because honeypots are not allowed to participate in real attack scenarios. Still, some probability remains that bots are captured and reverse-engineered to understand their strength. This lack of security against bot capture by the defender makes the whole botnet susceptible to exposure: the network topology is significantly exposed when one of the bots is captured, even though the design eases the botmaster's overall control of the botnet. They also included some concept of honeypot awareness in their bot system, but a few problems with the communication channel and with capture and re-engineering of the bot remain. [7] predicts a new botnet from their framework and compares its performance with known ones: a loosely coupled peer-to-peer botnet, lcbot, which is stealthy and can be considered a combination of existing P2P botnet structures. Their botnet architecture still follows the idea of a buddy list or routing information of the infected host or friend bots, which keeps the whole botnet easy to expose if one of the bots is captured by the defender. Peer list construction is the main concept behind any P2P botnet, and it also leaves the complete botnet exposed at any time to the defender. The authors of [5],[6],[17] give an idea of honeypot-aware bots and botnets. Honeypots are the only way to observe and understand the activities of a bot; that also makes a botnet prone to exposure to defenders and helps them create a mitigation system for the botnet.

3. Proposed P2P Botnet Architecture

3.1 Classification of Our Bots

We classified our bots very extensively so that it becomes easy for the botmaster to control and operate the botnet. This classification is mainly to refine the bots used in the attack for effective firepower and less exposure to the defender. First of all, we group our bots on the basis of their bandwidth: only if the infected system has internet connectivity to the outside world equal to or greater than our specified bandwidth do we consider it for building our botnet; these bots we call Live bots. Otherwise we discard the further infection; such bots are called Dead bots and do not participate in further creation of the botnet.

Further, we classify Live bots into two groups: Peer bots, which have global IP addresses without firewalls or proxy servers in between; and all remaining bots, including 1) bots with global IP addresses behind a firewall or proxy, 2) bots with dynamically allocated global IP addresses, and 3) bots with private IP addresses. We call this second group Non-peer bots. Furthermore, bots are dedicated to the purpose of either infecting other victims or attacking only. If a bot is dedicated to infecting other victims, its code module will send the existing peer list to the newly infected bot. In the case of attack bots, the code module will handle spam e-mails and DDoS command and control.

We will mainly concentrate on preventing detection of the Peer bots, because they are the security bottleneck through which our botnet could be exposed to the defender, as only they contain the peer list or seed list information of other Peer bots. The Peer bots will be able to act as servers for other Peer and Non-Peer bots and as clients for other Peer
bots. Non-Peer bots will be able to act only as clients; they will have entries of other Peer bots only in their peer list.

I. Install the initial infection files
II. Check the connection speed of the victim
III. Decide whether the compromised host is a Live or Dead bot
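The classification rules above can be summarised in a short sketch. The 2 Mbps threshold and the category names follow the text; the function itself is our own illustration:

```python
# Sketch of the bot classification described in Sec. 3.1 (our own code,
# not the paper's implementation).
MIN_BANDWIDTH_MBPS = 2  # bots below this speed are discarded

def classify(bandwidth_mbps, global_ip, behind_firewall_or_proxy):
    if bandwidth_mbps < MIN_BANDWIDTH_MBPS:
        return "Dead"       # discarded; takes no further part in the botnet
    if global_ip and not behind_firewall_or_proxy:
        return "Peer"       # can act as both client and server
    return "Non-peer"       # client only (NAT, proxy, dynamic or private IP)

kinds = [classify(10, True, False),   # fast, reachable global IP
         classify(10, False, True),   # fast but not directly reachable
         classify(1, True, False)]    # too slow
```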
table is important to maintain the robust connectivity of the bot to the network. Our emphasis here is to keep the finger table as accurate as possible.

3.4 Botnet Communication Channel Architecture

Each Peer bot will contain a list of its next two peer bots and information on two other non-peer bots in its seed list. A non-peer bot will have only two entries of peer bot information, with the condition that both of those peer bots contain information about each other. The botmaster will pass the command to any one of the Peer bots; depending upon the diurnal dynamics, that particular bot will be selected for the first command passing to the whole bot network. After getting the command from the botmaster, the peer bot will share the command with its next neighbor peer bot as well as with its connected non-peer bots, which ensures effective communication. For the purpose of command passing, priority is given to peer bots, because a peer bot can work as client as well as server and is connected to other peer bots. On the basis of this topology the communication is handled.

4. Simulation and Experimental Results

We present simulation results and snapshots of our proposed peer-to-peer suicide botnet. Our experiment is still at its inception stage; in the current scenario we have successfully implemented and executed a few properties of our proposed botnet model. In the simulation model, implemented using Java technology, the botmaster is able to command bots by listing their IP addresses. If the necessity arises, the botmaster sends a kill command to a bot to destroy itself.

Fig 2: Bot Communication
Fig 3: Bot Master Control Interface
Fig 4: Bot Monitoring by Bot Master
Fig 5: Selecting Bot IP Address by Bot Master to send Kill Command
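Under the seed-list rules of Sec. 3.4 (peer bots forward commands onward; non-peer bots only receive), command propagation can be sketched as a breadth-first pass over peer links. This is our own simplified model, not the paper's Java simulator; the bot names are invented:

```python
from collections import deque

# Simplified propagation sketch (our own model, not the Java simulator):
# each peer bot forwards a command to its neighbour peer bots and to its
# attached non-peer bots; non-peer bots do not forward anything.
peers = {"P1": ["P2"], "P2": ["P3"], "P3": []}            # peer -> next peer bots
non_peers = {"P1": ["N1", "N2"], "P2": ["N3"], "P3": []}  # peer -> client-only bots

def propagate(start):
    reached, queue = set(), deque([start])
    while queue:
        bot = queue.popleft()
        if bot in reached:
            continue
        reached.add(bot)
        reached.update(non_peers.get(bot, []))  # clients receive but don't forward
        queue.extend(peers.get(bot, []))        # peers forward onwards
    return reached

covered = propagate("P1")  # botmaster injects the command at one peer bot
```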
Fig 6: IP address listing of Bot by Bot Master

5. Conclusion

Implementation of new types of bots facilitates understanding of the future bots that attackers may create. The study and simulation results of our bot provide a framework for understanding the bot's working and its communication channel architecture. This bot is tough to control because of its peer network topology, and harder to reverse-engineer or trap with honeypots. It provides a small but high-firepower bot network to the botmaster which is tough to shut down.

References

[1] B. McCarty, "Botnets: Big and Bigger", IEEE Security & Privacy Magazine, Vol. 1, No. 4, pp. 87-90, July-Aug. 2003.
[2] Elizabeth Van Ruitenbeek and William H. Sanders, "Modeling Peer-to-Peer Botnets", 5th International Conference on Quantitative Evaluation of SysTems (QEST 2008), Palais du Grand Large, Saint Malo, France, 14th-17th September 2008, IEEE Computer Society, DOI 10.1109/QEST.2008.43.
[3] Julian B. Grizzard, Vikram Sharma, Chris Nunnery, and Brent ByungHoon Kang, "Peer-to-Peer Botnets: Overview and Case Study", USENIX Workshop on Hot Topics in Understanding Botnets (HotBots'07), April 10, 2007, Cambridge, MA, USA.
[4] Justin Leonard, Shouhuai Xu and Ravi Sandhu, "A Framework for Understanding Botnets", 2009 International Conference on Availability, Reliability and Security, Fukuoka Institute of Technology, Fukuoka, Japan, March 16-19, 2009.
[5] Ping Wang, Lei Wu, Ryan Cunningham, Cliff C. Zou, "Honeypot Detection in Advanced Botnet Attacks", Int. J. Information and Computer Security, Vol. 4, Issue 1, 2010, DOI: 10.1504/IJICS.2010.031858.
[6] Simon Innes, Craig Valli, "Honeypots: How do you know when you are inside one?", 4th Australian Digital Forensics Conference, Edith Cowan University, Perth, Western Australia, December 4, 2006.
[7] Su Chang, Linfeng Zhang, Yong Guan, Thomas E. Daniels, "A Framework for P2P Botnets", 2009 International Conference on Communications and Mobile Computing.
[8] Evan Cooke, Farnam Jahanian, Danny McPherson, "The Zombie Roundup: Understanding, Detecting and Disrupting Botnets", Electrical Engineering and Computer Science Department / Arbor Networks.
[9] Anestis Karasaridis, Brian Rexroad, David Hoeflin, "Wide-Scale Botnet Detection and Characterization".
[10] Vrizlynn L. L. Thing, Morris Sloman, Naranker Dulay, "A Survey of Bots Used for Distributed Denial of Service Attacks", http://www.doc.ic.ac.uk.
[11] Jivesh Govil, Jivika Govil, "Criminology of Botnets and Their Detection and Defense Methods", IEEE EIT 2007 Proceedings, IEEE 2007.
[12] Wei Lu, Mahbod Tavallaee and Ali A. Ghorbani, "Automatic Discovery of Botnet Communities on Large Scale Communication Networks", ASIACCS'09, March 10-12, 2009, Sydney, NSW, Australia, ACM 2009.
[13] Su Chang, Thomas E. Daniels, "P2P Botnet Detection Using Behavior Clustering & Statistical Tests", AISec'09, November 9, 2009, ACM 978-1-60558-781-3/09/11.
[14] Hossein Rouhani Zeidanloo, Azizah Bt Abdul Manaf, Rabiah Bt Ahmad, Mazdak Zamani, Saman Shojae Chaeikar, "A Proposed Framework for P2P Botnet Detection", IACSIT International Journal of Engineering and Technology, Vol., No. 2, April 2010.
[15] W. Timothy Strayer, Robert Walsh, Carl Livadas, David Lapsley, "Detecting Botnets with Tight Command and Control".
[16] Wei Wang, Binxing Fang, Zhaoxin Zhang, Chao Li, "A Novel Approach to Detect IRC-Based Botnets", 2009 International Conference on Network Security, Wireless Communication and Trusted Computing, IEEE 2009.
[17] Cliff C. Zou, Ryan Cunningham, "Honeypot-Aware Advanced Botnet Construction and Maintenance", Proceedings of the 2006 International Conference on Dependable Systems and Networks (DSN'06), IEEE.
[18] LingYun Zhou, "VMM-Based Framework for P2P Botnets Tracking and Detection", 2009 International Conference on Information Technology and Computer Science.
[19] Ping Wang, Sherri Sparks, and Cliff C. Zou, "An Advanced Hybrid Peer-to-Peer Botnet", IEEE Transactions on Dependable and Secure Computing, Vol. 7, No. 2, April-June 2010.
Mukesh Kumar received his Bachelor of Technology degree in Computer Science and Engineering from Uttar Pradesh Technical University, Lucknow, India, in 2009. He is currently pursuing his master's degree in Network and Internet Engineering in the School of Engineering and Technology, Department of Computer Science, Pondicherry University, India. His research interests include denial-of-service-resilient protocol design, cloud computing and peer-to-peer networks.

Chetana Sidige is presently pursuing her M.Tech (final year) in Computer Science and Engineering at Pondicherry University. She did her B.Tech in Computer Science and Information Technology from G. Pulla Reddy Engineering College, affiliated to Sri Krishnadevaraya University. Her research interests include network security, information retrieval systems and software metrics. Currently the author is working on multilingual information retrieval evaluation.
2 Islamic Azad University, Science and Research Branch, Tehran, Iran
3 Dept. of Computer Engineering, Islamic Azad University, Tabriz Branch, Tabriz, Iran
4 Dept. of Control Engineering, K. N. Toosi University of Technology, Tehran, Iran
Abstract
The Residue Number System (RNS) is a non-weighted system. It supports parallel, high-speed, low-power and secure arithmetic. Detecting overflow in RNS is very important, because if overflow is not detected properly, an incorrect result may be taken for a correct answer. The previously proposed methods or algorithms for detecting overflow need residue comparison or a complete conversion of numbers from RNS to binary. We propose a new and fast overflow detection approach for the moduli set {2^n-1, 2^n, 2^n+1} which differs from previous methods. Our technique implements RNS overflow detection much faster, at the cost of a little more hardware than previous methods.
Keywords: Residue number system, overflow detection, moduli set {2^n-1, 2^n, 2^n+1}, group number.

The RNS is determined by a set of n positive coprime integers m_i > 1, which forms the base of the system. The dynamic range M of the system is given as the product of the moduli m_i:

M = prod_{i=1}^{n} m_i.    (1)

Any integer X in [0, M) has a unique representation (x_1, x_2, ..., x_n) in RNS(m_1, m_2, ..., m_n). The residues x_i = |X|_{m_i}, also called residue digits, are defined as

x_i = X mod m_i,  0 <= x_i < m_i.    (2)
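To make Eqs. (1)-(2) concrete, here is a small sketch (an illustration added to this text, not part of the original paper) that converts an integer to its residue digits and back via the Chinese Remainder Theorem; the choice n = 4 is arbitrary:

```python
from math import prod

def to_rns(x, moduli):
    """Residue digits x_i = X mod m_i (Eq. 2)."""
    return tuple(x % m for m in moduli)

def from_rns(residues, moduli):
    """Reconstruct X from its residue digits via the Chinese Remainder Theorem."""
    M = prod(moduli)                      # dynamic range (Eq. 1)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)      # pow(Mi, -1, m): inverse of Mi modulo m
    return x % M

n = 4
moduli = (2**n - 1, 2**n, 2**n + 1)       # {15, 16, 17}, pairwise coprime
M = prod(moduli)                          # 15 * 16 * 17 = 4080
assert all(from_rns(to_rns(x, moduli), moduli) == x for x in range(M))
```

The representation is unique over the whole dynamic range, which is what the exhaustive assertion checks.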
* Corresponding Author
[8], communication [9], cryptography and image processing [10, 11].

Overflow detection is one of the fundamental issues in the efficient design of RNS systems. In a generic approach, overflow occurs in the addition of two numbers X and Y whenever Z = (X + Y) mod M is less than X. Thus, the problem of overflow detection in RNS arithmetic is equivalent in complexity to the problem of magnitude comparison [12, 13]. Another algorithm proposed for overflow detection in an odd dynamic range M is a ROM-based algorithm called the parity checking technique. In this method, parity indicates whether an integer is even or odd. Let the operands X and Y have the same parity and Z = |X + Y|_M. Then the addition has overflowed if Z is an odd number [14, 15]. For signed RNS, overflow occurs when the sign of the sum is different from that of the operands [16].

The number of groups required for this distribution can be expressed in terms of the residues as

alpha = |x_1 - x_3|_{2^n-1},  beta = |x_2 - x_3|_{2^n},  number of groups = 2^n - 1.    (4)

So, we can conclude that the length of any group, namely l, is given as

l = M / (2^n - 1) = ((2^n-1) . 2^n . (2^n+1)) / (2^n - 1) = 2^n . (2^n + 1).    (5)

In any of these groups there are 2^n subgroups, because

beta = |x_2 - x_3|_{2^n},  0 <= beta <= 2^n - 1.    (6)

For example, the value of alpha for numbers in the first group, with range [0, 2^{2n} + 2^n), is shown in the following:
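As a quick aside, the two classical overflow indicators described above (wrapped-sum comparison and parity checking) can be verified exhaustively with a short script; this illustration is added here and is not part of the original paper, and the small values of n and the odd-range moduli {3, 5, 7} are arbitrary choices:

```python
# Generic approach, moduli set {2**n - 1, 2**n, 2**n + 1} with n = 3 (M = 504):
n = 3
M = (2**n - 1) * 2**n * (2**n + 1)
for X in range(M):
    for Y in range(M):
        Z = (X + Y) % M
        # Overflow happened iff the wrapped sum is smaller than an operand.
        assert ((X + Y) >= M) == (Z < X)

# Parity technique, valid for an odd dynamic range, e.g. moduli {3, 5, 7}, M = 105:
M_odd = 3 * 5 * 7
for X in range(M_odd):
    for Y in range(M_odd):
        if (X % 2) == (Y % 2):            # operands with the same parity
            Z = (X + Y) % M_odd
            # X + Y is even and M_odd is odd, so an odd Z means M_odd was subtracted.
            assert ((X + Y) >= M_odd) == (Z % 2 == 1)
```

Both checks pass over the full dynamic range, confirming the equivalences stated in the text.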
alpha = |x_1 - x_3|_{2^n-1},  0 <= alpha <= 2^n - 2.    (9)

For ease of implementation of the proposed algorithm, we add one to the group index obtained from (9). In this case, if X is an integer, its group number g(X) is

g(X) = alpha + 1,  1 <= g(X) <= 2^n - 1.    (10)

Table 1 shows the distribution of numbers in the dynamic range [0, 2^{3n} - 2^n), which is given as a product of the m_i's in the moduli set {2^n-1, 2^n, 2^n+1}.

Table 1: Distribution of Numbers

Number range | Group
[0, 2^n(2^n+1)) | 1
[2^n(2^n+1), 2[2^n(2^n+1)]) | 2
... | ...
[(2^n-2)[2^n(2^n+1)], (2^n-1)[2^n(2^n+1)]) | 2^n - 1

Let X and Y be two operands in the addition Z = X + Y, and let g(X) and g(Y) be the group numbers of the operands, respectively. It can be shown from Table 1 that:
i) if g(X) + g(Y) < 2^n, no overflow will occur.

g(X) + g(Y) >= 2^n.    (15)

Finally, (15) can be divided into two parts, that is

g(X) + g(Y) > 2^n,  or  g(X) + g(Y) = 2^n.    (16)

Therefore, overflow can be detected by comparing the sum of the group numbers of the operands with 2^n. If the sum exceeds 2^n, overflow must have occurred. Notice that the possibility of overflow should be checked again in the third mode. For this purpose, g(X) + g(Y) = 2^n is given a 1-bit shift to the right, yielding 2^n / 2 = 2^{n-1}. Subsequently, it is compared with the group number of the sum of the operands, g(Z). In this case, if g(Z) >= 2^{n-1}, then overflow does not exist; otherwise (g(Z) < 2^{n-1}) overflow has occurred. Fig. 2 shows the overflow detection circuit for the moduli set {2^n-1, 2^n, 2^n+1}.

Table 2: Group Number Calculations for RNS {15, 16, 17}

X | X_RNS | |x_1 - x_3|_15 | g(X)
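The three decision rules above can be checked by brute force (a sketch added for illustration; here the group number is computed directly from the block definition of Table 1, by integer division, rather than from the residues as the paper's hardware does):

```python
# Brute-force check of the group-number rules, for n = 3: the dynamic range
# [0, M) is split into 2**n - 1 groups of length 2**n * (2**n + 1).
n = 3
l = 2**n * (2**n + 1)          # group length (Eq. 5)
M = (2**n - 1) * l             # dynamic range

def g(x):
    """Group number of x: block index plus one (Eq. 10)."""
    return x // l + 1

for X in range(M):
    for Y in range(M):
        overflow = (X + Y) >= M
        s = g(X) + g(Y)
        if s < 2**n:
            assert not overflow            # case (i): never overflows
        elif s > 2**n:
            assert overflow                # sum exceeds 2**n: always overflows
        else:
            Z = (X + Y) % M                # ambiguous third mode: inspect g(Z)
            assert overflow == (g(Z) < 2**(n - 1))
```

The ambiguous case works because when g(X) + g(Y) = 2^n the true sum lies within one group length of M, so a wrapped result falls into group 1 while an unwrapped one falls into the last group.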
3. Hardware Implementation

The group detection function is determined by Eq. (10) as the sum of alpha and 1. The value of alpha is given by alpha = |x_1 - x_3|_{2^n-1}. Since alpha is computed as a residue modulo 2^n-1, instead of subtracting x_3 we can add its additive inverse modulo 2^n-1. An additive inverse modulo 2^n-1 is simply a negation of the binary representation. For simplification reasons, the additive inverse of x_3 is denoted as

x'_3 = |2^n - 1 - x_3|_{2^n-1}.    (17)

So the binary form of (17) is (x'_{3,n-1}, ..., x'_{3,1}, x'_{3,0}). Thus (9) can be rewritten as the sum

alpha = |x_1 + x'_3|_{2^n-1}.    (18)

From [18], an addition modulo 2^n - 1 with redundant zero elimination can be expressed as

|a + b|_{2^n-1} = |a + b + c_out + p|_{2^n},    (19)

where c_out is the carry bit of the addition a + b and p = 1 for a + b = 11...1_2. The sum c_out + p is 0 for a + b < 2^n - 1 and 1 for a + b >= 2^n - 1 [1]. By assuming C_in = c_out + p, the final form of (18) is then

alpha = |x_1 + x'_3 + C_in|_{2^n}.    (20)

The sum of the group numbers satisfies

g(X) + g(Y): < 2^n, if C = 0;  = 2^n, if C = 1, MSB = 0, P_1 = 1;  > 2^n, otherwise.    (21)

As mentioned above, whenever g(X) + g(Y) = 2^n, the possibility of overflow is required to be checked again. For this purpose, g(Z) is compared with 2^{n-1}. In this case, by having the MSB of g(Z) as W = S_{n-1} and its P_2 = P_{0:n-2}, it can be said:

g(Z): >= 2^{n-1}, if W = 1, P_2 = 0;  < 2^{n-1}, otherwise.    (22)

The proposed method for overflow detection is implemented as shown in Fig. 2. The circuit consists of five main blocks: three group detection units, a unit for the generation of the MSB, output carry and P_1 of g(X) + g(Y), and the final post-processing unit for detecting overflow.
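Eq. (19) can be sanity-checked in a few lines (an added illustration, with n = 4 chosen arbitrarily):

```python
# Modulo-(2**n - 1) addition realised as an n-bit addition with end-around
# carry and redundant-zero elimination (Eq. 19).
n = 4
for a in range(2**n - 1):
    for b in range(2**n - 1):
        s = a + b
        c_out = s >> n                 # carry bit of the n-bit addition
        p = 1 if s == 2**n - 1 else 0  # propagate: a + b is the all-ones word
        assert (s + c_out + p) % 2**n == s % (2**n - 1)
```

The `p` term removes the redundant zero representation 11...1, which is why the result of (19) is always the canonical residue.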
It is computed in a simplified and new prefix structure proposed in [18]. Hence, we do not need to use a full n-bit adder.

A parallel prefix adder, and also a parallel prefix adder with end-around carry, are built from the elements shown in Fig. 3. Fig. 4 depicts the structure of the parallel prefix adder with end-around carry (PPA with EAC). We applied it for performing the addition operations in order to obtain the needed values.

Table 3: Comparison of area and delay of the proposed method with other methods using the unit-gate model
The most effective overflow detection circuit based on reverse converters can be built on the basis of Converter I from [20]. In Converter I, and also in Converters II and III from [20], the minimum delay is O(n), whereas the delay of the proposed method is of order O(log2 n).

As seen from Table 3, the proposed approach for overflow detection in the moduli set {2^n-1, 2^n, 2^n+1} is faster than previous works. However, the hardware cost of the presented method is higher. It is essential to remark that although the proposed design consumes more hardware, it demonstrates a significant improvement in terms of delay, especially for large n. Furthermore, our proposed method detects overflow without applying a complete comparator or a reverse converter.

5. Conclusions

Detecting overflow is one of the most important and complex operations in a residue number system. In this paper, a novel method has been presented for detecting overflow in the moduli set {2^n-1, 2^n, 2^n+1}. Our proposed technique is based on grouping the numbers, which leads to the correct result without performing a complete comparison or using a residue-to-binary converter. The presented approach yields a significant reduction in delay compared to other methods.

References
[1] T. Tomczak, "Fast Sign Detection for RNS (2^n-1, 2^n, 2^n+1)", IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 55, Iss. 6, 2008, pp. 1502-1511.
[2] N. S. Szabo and R. I. Tanaka, Residue Arithmetic and Its Application to Computer Technology, New York: McGraw-Hill, 1967.
[3] W. A. Chren, Jr., "A new residue number system division algorithm", Comput. Math. Appl., Vol. 19, No. 7, 1990, pp. 13-29.
[4] H. Bronnimann, I. Z. Emiris, V. Y. Pan, and S. Pion, "Computing exact geometric predicates using modular arithmetic with single precision", in Proc. 13th Annu. Symp. Comput. Geom., ACM Press, 1997.
[5] R. C. Debnath and D. A. Pucknell, "On Multiplicative Overflow Detection in Residue Number System", Electronics Letters, Vol. 14, No. 5, 1978.
[6] R. Conway and J. Nelson, "Improved RNS FIR Filter Architectures", IEEE Trans. on Circuits and Systems-II: Express Briefs, Vol. 51, No. 1, 2004.
[7] P. G. Fernandez et al., "A RNS-Based Matrix-Vector-Multiply FCT Architecture for DCT Computation", in Proc. 43rd IEEE Midwest Symposium on Circuits and Systems, 2000, pp. 350-353.
[8] L. Yang and L. Hanzo, "Redundant Residue Number System Based Error Correction Codes", in IEEE VTS 54th Vehicular Technology Conference, 2001, Vol. 3, pp. 1472-1476.
[9] J. Ramirez et al., "Fast RNS FPL-Based Communication", in Proc. 12th Int. Conf. Field Programmable Logic, 2002, pp. 472-481.
[10] R. Rivest, A. Shamir, and L. Adleman, "A Method for Obtaining Digital Signatures and Public Key Cryptosystems", Comm. ACM, Vol. 21, No. 2, 1978, pp. 120-126.
[11] J. Bajard and L. Imbert, "A Full RNS Implementation of RSA", IEEE Transactions on Computers, Vol. 53, No. 6, 2004, pp. 769-774.
[12] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs, New York: Oxford University Press, 2000.
[13] M. Askarzadeh, M. Hosseinzadeh and K. Navi, "A New Approach to Overflow Detection in Moduli Set {2^n-3, 2^n-1, 2^n+1, 2^n+3}", in Second International Conference on Computer and Electrical Engineering, 2009, pp. 439-442.
[14] M. Shang, H. JianHao, Z. Lin and L. Xiang, "An efficient RNS parity checker for moduli set {2^n-1, 2^n+1, 2^{2n}+1} and its applications", Science in China Series F: Information Sciences, Vol. 51, No. 10, 2008, pp. 1563-1571.
[15] A. Omondi and B. Premkumar, Residue Number Systems: Theory and Implementation, Imperial College Press, 2007.
2,3,4,5 Department of Electronics and Communication Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, Tamilnadu, India
However, it is seen that the features required for the purpose of classification of faults are usually selected based on expert knowledge rather than automatically. A suboptimal feature set can compromise the accuracy of the classification system, leading to poor performance of the system. In the present study, a feature selection mechanism which can identify the importance of features from a large feature set is proposed. The performance of the proposed system is compared with that of six other feature selection techniques, and the proposed system was found to outperform all the other techniques considered, producing the highest classification accuracy with the smallest number of features. The dataset used for validating the proposed system is the Development and Application of Methods for Actuator Diagnosis in Industrial Control Systems (DAMADICS) standard benchmark dataset.

(pressures on valve: inlet and outlet), Ps E/P (transducer output pressure), PSP (positioner supply pressure unit), PT (pressure transmitter), Pz (positioner air supply pressure), S (pneumatic servo-motor), T1 (liquid temperature), TT (temperature transmitter), V (control valve), V1, V2 and V3 (cut-off valves), X (valve plug displacement), ZC (internal controller), ZT (stem position transmitter).
Step 1: Extract statistical parameters (average, median, minimum, maximum, standard deviation, kurtosis, skew and variance) using moving windows, from each of the six measured parameters.
Step 2: Select important features for fault classification.
Step 3: Use the selected features for identification of the fault and its type using a Naive Bayes classifier.

As can be seen, extracting eight parameters from six initial features creates a feature set with 6 * 8 = 48 features, and when taken together with the initial feature set, the total number of features becomes 54. Also, the number of measurements made is large (a total of 65535 measurements for each of the initial features). This makes the task of feature selection quite challenging.

3.1 Feature Selection

The proposed feature selection system is derived from the ABC [25], [26] and mRMR [27] algorithms. This method was developed using the principles of the ABC and mutual information (MI) [28].

The ABC algorithm is an optimization algorithm that mimics the behavior of bees while searching for food [25]. A bee colony is an organized team-work system where each bee contributes significant information to the system. There are three types of worker bees involved in collecting nectar, viz. employed bees, onlooker bees and scout bees. The ABC algorithm considers the position of a food source as a possible solution of the optimization problem, and the food source corresponds to the quality (fitness) of the associated solution [26]. The number of employed bees or onlooker bees is equal to the number of solutions in the population. The initial population of N solutions is randomly generated. Each solution is a D-dimensional vector, where D is the number of parameters to be optimised; they are relevance and redundancy in this case. The population of solutions is subjected to repeated search processes by the employed bees, onlooker bees and scout bees. A solution is randomly chosen and compared with the current solution. The objective function used here is the mRMR function. The fitness function of each solution is given by

fit_i = 1 / (1 + f(i)),    (1)

where f(i) is the objective function of the ith solution. If the fitness of the newly chosen solution is greater than that of the existing one, then the new solution is memorized and the old one is discarded. The employed bees share their information, i.e., the fitness values of the solutions in their memory, with the onlooker bees.

The probability of each solution, based on its fitness, is calculated by

p_i = fit_i / sum_{j=1}^{N} fit_j,    (2)

where fit_i is the fitness value of solution i and N is the number of solutions in the population. Candidate solutions are produced using the formula

v_{kj} = x_{kj} + phi_{kj} (x_{kj} - x_{ij}),    (3)

where k in {1, 2, ..., N}, j in {1, 2, ..., D} and phi_{kj} is a random number ranging between -1 and 1. This ensures that the values generated are different from those already existing, and that the newly generated solutions lie within the defined boundary. A parameter exceeding its limit is set to its limit value.

The performance of each candidate solution is compared with that of the existing solution. If the new solution has an equal or better fitness value than the old solution, the old one is discarded, with the new one occupying its position; else, the old one is retained. In other words, a greedy selection mechanism is used for the selection process.

If an optimal solution cannot be obtained from a population within a predefined number of cycles, i.e. the limit, then that population is abandoned and replaced with a new population. The ABC algorithm is used here to optimize the redundancy and relevance parameters of the mRMR function. The mRMR method proposed in [27] uses the principle of mutual information. The mutual information between two variables A and B can be defined as

I(A; B) = sum_{a,b} p(a, b) log( p(a, b) / (p(a) p(b)) ).    (4)

Maximum relevance orders features based on the mutual information between individual features x_i and the target class h, such that the feature with the highest mutual information is the most relevant feature. The relationship is expressed as follows:

max D(S, h),  D = (1/|S|) sum_{x_i in S} I(x_i; h).    (5)

Max relevance often shows a high inter-dependence among the features. When two features are highly dependent on one another, the class-discriminative power of these two features would not change much if either one of them were removed, and if not removed they become redundant, as they convey the same characteristics. The minimal redundancy condition can be added to select mutually exclusive features of the dataset. The following relationship helps establish the minimum redundancy measure:

min R(S),  R = (1/|S|^2) sum_{x_i, x_j in S} I(x_i; x_j).    (6)
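As an illustration (not from the paper), the mRMR score of Eqs. (4)-(6) can be sketched for discrete-valued features, estimating mutual information by plain frequency counts; the toy data below are invented:

```python
# Relevance (Eq. 5) minus redundancy (Eq. 6), with mutual information (Eq. 4)
# estimated from discrete samples by frequency counts.
from collections import Counter
from math import log2

def mutual_info(a, b):
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

def mrmr_score(features, h):
    """Mean relevance to the class h minus mean pairwise redundancy."""
    k = len(features)
    D = sum(mutual_info(f, h) for f in features) / k
    R = sum(mutual_info(f, g) for f in features for g in features) / k**2
    return D - R

h  = [0, 0, 1, 1, 0, 1]
f1 = [0, 0, 1, 1, 0, 1]      # informative: copies the class
f2 = [1, 0, 1, 0, 1, 0]      # uninformative with respect to h
assert mrmr_score([f1], h) > mrmr_score([f2], h)
```

In the proposed system this score plays the role of the objective function f(i) optimized by the ABC search.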
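Step 1 of the procedure above (windowed statistics) can be sketched as follows; the window length, the one-sample step, and the population-moment formulas used for skew and kurtosis are assumptions, not taken from the paper:

```python
import statistics as st

def moments(x):
    """Population central moments, plus skew and kurtosis derived from them."""
    mu = st.fmean(x)
    m2 = sum((v - mu) ** 2 for v in x) / len(x)
    m3 = sum((v - mu) ** 3 for v in x) / len(x)
    m4 = sum((v - mu) ** 4 for v in x) / len(x)
    skew = m3 / m2 ** 1.5 if m2 else 0.0
    kurt = m4 / m2 ** 2 if m2 else 0.0
    return mu, m2, skew, kurt

def window_features(signal, w):
    """Eight statistics per moving window: average, median, min, max,
    standard deviation, kurtosis, skew and variance (as listed in Step 1)."""
    rows = []
    for i in range(len(signal) - w + 1):     # window slides by one sample
        x = signal[i:i + w]
        mu, var, skew, kurt = moments(x)
        rows.append((mu, st.median(x), min(x), max(x), var ** 0.5, kurt, skew, var))
    return rows

rows = window_features([0.0, 1.0, 0.5, 2.0, 1.5, 3.0], w=4)
assert len(rows) == 3 and len(rows[0]) == 8
```

Applying such a transform to each of the six measured signals yields the 48 derived features discussed in the text.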
Results of feature selection for incipient faults are shown in Table 5. It can be observed that the proposed method shows an accuracy of 70.7% for 36 features. Information gain shows a better accuracy of 71.6% for 9 features. Other methods, like chi-square, Relief-F and gain ratio, also give slightly better results when compared to the proposed method.

Table 5: Accuracy in % vs. no. of features for abrupt small faults
(coefficients of the proposed method: a = 0.9106; b = 0.0131)

no. of features | Proposed method | MIQ | MID | Info gain | Gain ratio | chi2 | Relief-F
3  | 52.9 | 52.9 | 52.9 | 64.8 | 22.2 | 64.8 | 54.8
6  | 56.8 | 56.8 | 56.8 | 69.5 | 57.3 | 68.6 | 66.4
9  | 69.8 | 69.8 | 64.1 | 71.6 | 65.0 | 71.3 | 69.3
12 | 67.3 | 67.3 | 67.3 | 71.1 | 66.0 | 71.1 | 67.9
15 | 70.4 | 70.3 | 70.4 | 71.0 | 69.2 | 71.0 | 68.2
18 | 70.4 | 70.3 | 70.4 | 69.8 | 71.0 | 69.8 | 68.1
21 | 70.4 | 70.3 | 70.4 | 69.8 | 71.3 | 69.8 | 67.9
24 | 70.4 | 70.3 | 70.4 | 69.8 | 71.0 | 69.8 | 68.0
27 | 70.4 | 70.3 | 70.4 | 69.8 | 69.8 | 69.8 | 69.9
30 | 70.4 | 70.3 | 70.4 | 69.8 | 69.8 | 69.8 | 69.9
33 | 70.4 | 70.3 | 70.4 | 69.9 | 69.9 | 69.9 | 69.9
36 | 70.7 | 70.7 | 70.4 | 69.9 | 69.9 | 69.9 | 69.9
39 | 70.7 | 70.7 | 70.2 | 69.9 | 69.9 | 69.9 | 69.9
42 | 70.7 | 70.7 | 70.7 | 69.9 | 69.9 | 69.9 | 69.9
45 | 70.7 | 70.7 | 70.7 | 69.9 | 69.9 | 69.9 | 69.9
48 | 70.0 | 70.0 | 70.7 | 69.9 | 69.9 | 69.9 | 69.9
51 | 69.9 | 69.9 | 70.7 | 69.9 | 69.9 | 69.9 | 69.9
54 | 69.9 | 69.9 | 69.9 | 69.9 | 69.9 | 69.9 | 69.9

It can be seen from the above results that the proposed feature selection system is well capable of identifying the best features in a dataset, and that the FDI system presented in this paper can be successfully used for identifying faults in actuators with a very high degree of accuracy.

References
[1] X. Dai, Z. Gao, T. Breikin and H. Wang, "Disturbance Attenuation in Fault Detection of Gas Turbine Engines: A Discrete Robust Observer Design", IEEE Transactions on Systems, Man, and Cybernetics, 2009, Vol. 39, No. 2, pp. 234-239.
[2] B. Ayhan, M. Y. Chow, H. J. Trussell and M. H. Song, "A Case Study on the Comparison of Non-parametric Spectrum Methods for Broken Rotor Bar Fault Detection", in Proc. 29th Annual Conference of the Industrial Electronics Society, 2003, Vol. 3, pp. 2835-2840.
[3] X. Wang and D. Zhang, "Optimization Method of Fault Feature Extraction of Broken Rotor Bar in Squirrel Cage Induction Motors", in Proc. IEEE International Conference on Information and Automation, 2010, pp. 1622-1625.
[4] G. King, M. Tarbouchi and D. McGaughey, "Rotor Fault Detection in Induction Motors Using the Fast Orthogonal Search Algorithm", in Proc. International Symposium on Industrial Electronics, 2010, pp. 2621-2625.
[5] J. Liang and N. Wang, "Faults Detection and Isolation Based on PCA: An Industrial Reheating Furnace Case Study", in Proc. IEEE International Conference on Systems, Man and Cybernetics, 2003, Vol. 2, pp. 1193-1198.
[6] J. Mina and C. Verde, "Fault Detection Using Dynamic Principal Component Analysis by Average Estimation", in Proc. 2nd International Conference on Electrical and Electronics Engineering (ICEEE) and XI Conference on Electrical Engineering (CIE), 2005, pp. 374-377.
[7] N. Tudoroiu and M. Zaheeruddin, "Fault Detection and Diagnosis of Valve Actuators in HVAC Systems", in Proc. IEEE Conference on Control Applications, 2005, pp. 1281-1286.
[8] N. Tudoroiu and M. Zaheeruddin, "Fault Detection and Diagnosis of Valve Actuators in Discharge Air Temperature (DAT) Systems, using Interactive Unscented Kalman Filter Estimation", in Proc. IEEE International Symposium on Industrial Electronics, 2006, pp. 2665-2670.
[9] K. Choi, S. M. Namburu, M. S. Azam, J. Luo, K. R. Pattipati, and A. Patterson-Hine, "Fault Diagnosis in HVAC Chillers", IEEE Instrumentation & Measurement Magazine, 2005, pp. 24-32.
[10] L. Hu, K. Cao, H. Xu and B. Li, "Fault Diagnosis of Hydraulic Actuator based on Least Squares Support Vector Machines", in Proc. IEEE International Conference on Automation and Logistics, 2007, pp. 985-989.
[11] J. Gao, W. Shi, J. Tan and F. Zhong, "Support vector machines based approach for fault diagnosis of valves in reciprocating pumps", in Proc. IEEE Canadian Conference on Electrical and Computer Engineering, 2002, pp. 1622-1627.
[12] F. He and W. Shi, "WPT-SVMs Based Approach for Fault Detection of Valves in Reciprocating Pumps", in Proc. American Control Conference, 2002, pp. 4566-4570.
[13] J. Middleton, P. Urwin and M. Al-Akaidi, "Fault Detection and Diagnosis in Gas Control Systems", in Intelligent Measuring Systems for Control Applications, IEE Colloquium, 1995, pp. 8/1-8/3.
[14] D. Linaric and V. Koroman, "Fault Diagnosis of a Hydraulic Actuator using Neural Network", in Proc. International Conference on Industrial Technology (ICIT), 2003, pp. 106-111.
*2 Mahatma Gandhi Government Arts College, Mahe, P.O. New Mahe-673 311, U.T. of Puducherry, India
Abstract — Data mining has emerged as one of the major research domains in recent decades for extracting implicit and useful knowledge. This knowledge can be comprehended by humans easily. Initially, this knowledge extraction was computed and evaluated manually using statistical techniques. Subsequently, semi-automated data mining techniques emerged because of advancements in technology. Such advancement was also in the form of storage, which increased the demands of analysis. In such cases, semi-automated techniques have become inefficient. Therefore, automated data mining techniques were introduced to synthesize knowledge efficiently. A survey of the available literature on data mining and pattern recognition for soil data mining is presented in this paper. Data mining in agricultural soil datasets is a relatively novel research field. Efficient techniques can be developed and tailored for solving complex soil datasets using data mining.

Keywords — Data Mining, Pattern Recognition, Soil Data Mining

I. INTRODUCTION
Data mining software applications include various methodologies that have been developed by both commercial and research centers. These techniques have been used for industrial, commercial and scientific purposes. For example, data mining has been used to analyze large datasets and establish useful classifications and patterns in the datasets. Agricultural and biological research studies have used various techniques of data analysis, including natural trees, statistical machine learning and other analysis methods [16]. This paper outlines research which may establish whether new data mining techniques will improve the effectiveness and accuracy of the classification of large soil datasets. In particular, this research work aims to compare the performance of data mining algorithms with respect to soil limitations and soil conditions concerning the following characteristics: acidity, alkalinity and sodicity, salinity, low cation exchange capacity, phosphorus fixation, cracking and swelling properties, depth, soil density and nutrient content. The use of standard statistical analysis techniques is both time consuming and expensive. If alternative techniques can be found to improve this process, an improvement in the classification of soils may result.

In many developing countries, hunger is forcing people to cultivate land that is unsuitable for agriculture and which can only be converted to agricultural use through enormous efforts and costs, such as those involved in the construction of terraces. Each country is known for its core competence; India's is agriculture. Yet, it accounts for only 17 per cent of the total Gross Domestic Product. With the pressure of urbanization, it is going to be a challenge to produce food for more people with less land and water.

Agriculture or farming forms the backbone of any country's economy, since a large population lives in rural areas and is directly or indirectly dependent on agriculture for a living. Income from farming forms the main source for the farming community. The essential requirements for crop harvesting are water resources and capital to buy seeds, fertilizers, pesticides, labor etc. Most farmers raise the required capital by compromising on other necessary expenditures, and when it is still insufficient they resort to credit from sources like banks and private financial institutions. In such a situation, repayment is dependent on the success of the crop. If the crop fails even once, due to factors like bad weather patterns; soil type; improper, excessive and untimely application of fertilizers and pesticides; or adulterated seeds and pesticides, then the farmer is pushed into an acute crisis causing severe stress [58]. In addition, plant growth depends on multiple factors such as soil type, crop type, and weather. Due to lack of plant growth information and expert advice, most farmers fail to get a good yield.

Most knowledge of soil in nature comes from soil survey efforts. Soil survey, or soil mapping, is the process of determining the soil types or other properties of the soil cover over a landscape, and mapping them for others to understand and use. Primary data for the soil survey are acquired by field sampling and supported by remote sensing.

The test dataset used for this research work was collected from World Soil Information ISRIC (International Soil Reference and Information Centre). Version 3.1 of the ISRIC-WISE database (WISE3 - World Inventory of Soil Emission Potentials) was compiled from a wide range of soil profile data collected by many soil professionals worldwide. All profiles have been harmonized with respect to the original Legend (1974) and Revised Legend (1988) of FAO-Unesco. Thereby the primary soil data, and any secondary data derived from them, can be linked using GIS to the spatial units of the soil map of the world, as well as to more recent Soil and Terrain (SOTER) databases, through the soil legend code.
WISE3 is a relational database, compiled using MS-ACCESS. It can handle data on: (a) soil classification; (b) soil horizon data; (c) source of data; and methods used for determining analytical data. Profile data in WISE3 originate from over 260 different sources, both analogue and digital. Some 40% of the profiles were extracted from auxiliary datasets, including various Soil and Terrain (SOTER) databases and the FAO Soil Database (FAO-SDB), which, in turn, hold data collated from a wide range of sources. WISE3 holds selected attribute data for 10,253 soil profiles, with some 47,800 horizons, from 149 countries. Individual profiles have been sampled, described, and analyzed according to methods and standards in use in the originating countries. There is no uniform set of properties for which all profiles have analytical data, generally because only selected measurements were planned during the original surveys. Methods used for laboratory determinations of specific soil properties vary between laboratories and over time. Sometimes, results for the same property cannot be compared directly. WISE3 will inevitably include gaps, being a compilation of legacy soil data derived from traditional soil survey. These can be of a taxonomic, geographic, and soil analytical nature. As a result, the amount of data available for modeling is sometimes much less than expected. Adroit use of the data, however, will permit a wide range of agricultural and environmental applications at a global and continental scale (1:500000 and broader) [44].

The analysis of these datasets with various data mining techniques may yield outcomes useful to researchers in the future.

II. MATERIALS AND METHODS
The rapid growth of interest in data mining is due to (i) the falling cost of large storage devices and the increasing ease of collecting data over networks, (ii) the development of robust and efficient machine learning algorithms to process this data, and (iii) the falling cost of computational power, enabling the use of computationally intensive methods for data analysis [37].

Data Mining (DM) represents a set of specific methods and algorithms aimed solely at extracting patterns from raw data [18]. The DM process has developed due to the immense volume of data that must be handled more easily in areas such as business, the medical industry, astronomy, genetics or banking. Also, the success and extraordinary development of hardware technologies has led to large storage capacities on hard disks, a fact that brought about many problems in manipulating immense volumes of data. Of course, the most important aspect here is the fast growth of the Internet.

The core of the DM process lies in applying methods and algorithms in order to discover and extract patterns from stored data, but before this step the data must be preprocessed. It is well known that simple use of DM algorithms does not produce good results. Thus, the overall process of finding useful knowledge in raw data involves the sequential application of the following steps: developing an understanding of the application domain; creating a target dataset based on an intelligent way of selecting data by focusing on a subset of variables or data samples; data cleaning and preprocessing; data reduction and projection; choosing the data mining task; choosing the data mining algorithm; the data mining step; interpreting mined patterns, with possible return to any of the previous steps; and consolidating discovered knowledge.

DM draws on many study areas, such as machine learning, pattern recognition in data, databases, statistics, artificial intelligence, data acquisition for expert systems and data visualization. The most important goal here is to extract patterns from data and to bring useful knowledge into an understandable form for the human observer. It is recommended that the obtained information be easy to interpret, for ease of use. The entire process aims to obtain high-level data from low-level data.

Data mining involves fitting models to, or determining patterns from, observed data. The fitted models play the role of inferred knowledge. Typically, a data mining algorithm constitutes some combination of the following three components.
The model: the function of the model (e.g., classification, clustering) and its representational form (e.g., linear discriminants, neural networks). A model contains parameters that are to be determined from the data.
The preference criterion: a basis for preference of one model or set of parameters over another, depending on the given data.
The search algorithm: the specification of an algorithm for finding particular models and parameters, given the data, model(s), and a preference criterion.

A particular data mining algorithm is usually an instantiation of these model/preference/search components. The more common model functions in current data mining practice include:
1. Classification [41], [38], [42], [6], [39]: classifies a data item into one of several predefined categorical classes.
2. Regression [19], [12], [64], [45]: maps a data item to a real-valued prediction variable.
3. Clustering [61], [50], [47], [52], [29], [31], [62], and [21]: maps a data item into one of several clusters, where clusters are natural groupings of data items based on similarity metrics or probability density models.
4. Rule generation [60], [35], [40], [43], [23], [55], [53], [67]: extracts classification rules from the data.
5. Discovering association rules [2], [63], [5], and [34]: describes association relationships among different attributes.
6. Summarization [32], [65], [25], [20]: provides a compact description for a subset of data.
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 424
7. Dependency modeling [22], [7]: describes significant dependencies among variables.
8. Sequence analysis [10], [33]: models sequential patterns, as in time-series analysis; the goal is to model the states of the process generating the sequence, or to extract and report deviations and trends over time.

Although many techniques are available for data mining, a few methodologies, such as Artificial Neural Networks, K nearest neighbor and the K means approach, are currently popular; the choice depends on the nature of the data.

Artificial Neural Networks: Artificial Neural Networks (ANNs) are systems inspired by research on the human brain (Hammerstrom, 1993). An ANN is a network in which each node represents a neuron and each link represents the way two neurons interact. Each neuron performs a very simple task, while the network representing the joint work of all its neurons is able to perform more complex tasks. A neural network is an interconnected set of input/output units in which each connection has a weight associated with it; the network learns by fine-tuning the weights so as to be able to predict the class label of input samples during the testing phase. ANNs are a relatively new technique in flood forecasting, and the ANN approach has advantages over conventional techniques in modeling the rainfall-runoff relationship. Neural networks have several advantages over conventional computing methods and are highly suitable for problems that would otherwise require much time to solve; it has been stated that the neural network method successfully predicts pest attack incidences one week in advance.

Pedotransfer functions (PTFs) provide an alternative by estimating soil parameters from more readily available soil data. The two common methods used to develop PTFs are multiple linear regression and ANNs. Multiple linear regression and a neural network model (a feed-forward back-propagation network) were employed to develop a pedotransfer function for predicting soil parameters from easily measurable characteristics: clay, sand, silt, SP, Bd and organic carbon [51].

ANNs have also been successful in the classification of other soil properties, such as dryland salinity (Spencer et al. 2004). Due to their ability to solve complex or noisy problems, ANNs are considered a suitable tool for a difficult problem such as the estimation of organic carbon in soil.

Support Vector Machines: Support Vector Machines (SVMs) are binary classifiers (Burges, 1998; Cortes and Vapnik, 1995) able to separate data samples into two disjoint classes; the basic idea is to map the sample data so that they become linearly separable. SVMs are a set of related supervised learning methods used for classification and regression. In simple words, given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other.

SVM has been used to assess the spatiotemporal characteristics of soil moisture products [4].

Decision trees: The decision tree is one of the most popular classification algorithms in current use in data mining and machine learning. Decision tree learning involves the algorithmic acquisition of structured knowledge in forms such as concepts, decision trees, discrimination nets or production rules. The application of data mining techniques to drought-related data for drought risk management has shown success in the Advanced Geospatial Decision Support System (GDSS), and Leisa J. Armstrong states that data mining is one of the approaches used for crop decision making.

Research has been conducted in Australia to estimate a range of soil properties, including organic carbon (Henderson et al. 2001). The nation-wide database had 11,483 soil points available to predict organic carbon in the soil. An enhanced decision tree tool (Cubist), catering for continuous outputs, was used for this study; a correlation of up to 0.64 was obtained between the predicted and actual organic carbon levels.

K nearest neighbor: K nearest neighbor is one of the classification techniques in data mining. It has no learning phase, because it uses the training set every time a classification is performed. Nearest neighbor search (NN), also known as proximity search, similarity search or closest point search, is an optimization problem for finding the closest points in metric spaces.

K nearest neighbor has been applied to simulate daily precipitation and other weather variables (Rajagopalan and Lall, 1999).

Bayesian networks: A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in conjunction with statistical techniques, the graphical model has several advantages for data analysis. One, because the model encodes dependencies among all variables, it readily handles situations where some data entries are missing. Two, a Bayesian network can be used to learn causal relationships, and hence to gain understanding of a problem domain and to predict the consequences of intervention. Three, because the model has both causal and probabilistic semantics, it is an ideal representation for combining prior knowledge (which often comes in causal form) with data. Four, Bayesian statistical methods in conjunction with Bayesian networks offer an efficient and principled approach to avoiding the overfitting of data. The development of a data mining application for agriculture based on Bayesian networks was studied by Huang et al. (2008); according to them, the Bayesian network is a powerful tool for dealing with uncertainty and is widely used on agricultural datasets. They developed a model for agricultural applications based on the Bayesian network learning method, and the results indicate that Bayesian networks are feasible and efficient.
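The K-nearest-neighbor idea described above, no training phase, just a majority vote among the closest stored samples, can be sketched in a few lines of Python. The numeric soil-attribute vectors and class labels below are invented purely for illustration; they are not from any dataset cited in this survey:

```python
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.
    `train` is a list of (feature_vector, label) pairs; no model is fitted."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (clay %, sand %, organic carbon %) samples with soil-class labels.
samples = [((30.0, 40.0, 1.2), "loam"),
           ((31.0, 42.0, 1.0), "loam"),
           ((10.0, 80.0, 0.3), "sand"),
           ((12.0, 78.0, 0.4), "sand")]

print(knn_classify(samples, (29.0, 41.0, 1.1)))  # nearest stored samples are "loam"
```

Because the whole training set is scanned at query time, classification cost grows with the number of stored samples, which is the trade-off for having no learning phase.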
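The K means update described in the clustering paragraph that follows, recompute each center as the mean of the samples assigned to it, can be sketched as below. The one-dimensional sample values are invented for illustration only:

```python
def kmeans(samples, centers, iterations=10):
    """Plain K-means on 1-D data: assign each sample to its nearest center,
    then move each center to the mean of its assigned samples."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for x in samples:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # A center with no assigned samples keeps its previous position.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

# Invented soil-pH-like readings forming two obvious groups:
print(kmeans([4.1, 4.3, 4.2, 7.8, 8.0, 7.9], centers=[4.0, 8.0]))
```

Each final center is then the representative of its cluster, close to all samples assigned to it, which is exactly the role the survey text attributes to the cluster center.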
The Bayesian approach improves hydrogeological site characterization even when using low-resolution resistivity surveys [56].

K means approach: The K means method is one of the most used clustering techniques in data mining. The idea behind the K means algorithm is very simple: given a partition of the data into K clusters, the center of each cluster is computed as the mean of all samples belonging to that cluster. The center can be considered the representative of the cluster, since it is quite close to all samples in the cluster.

The K means approach has been used to classify soil and plants (Camps-Valls et al., 2003).

Fuzzy logic: Fuzzy logic is a form of multi-valued logic derived from fuzzy set theory to deal with reasoning that is approximate rather than exact. In contrast with "crisp logic", where binary sets have binary logic, fuzzy logic variables may have a truth value that ranges between 0 and 1 and is not constrained to the two truth values of classic propositional logic [46]. Furthermore, when linguistic variables are used, these degrees may be managed by specific functions. Fuzzy logic emerged as a consequence of the 1965 proposal of fuzzy set theory by Lotfi Zadeh [1] [66]. Though fuzzy logic has been applied to many fields, from control theory to artificial intelligence, it still remains controversial among most statisticians, who prefer Bayesian logic, and some control engineers, who prefer traditional two-valued logic.

Fuzzy logic has been used for the prediction of soil erosion in a large watershed (B. Mitra et al., ScienceDirect, Nov. 1998).

Genetic Algorithm: The Genetic Algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of Evolutionary Algorithms (EAs), which generate solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection and crossover.

Soil liquefaction is a type of ground failure related to earthquakes. It takes place when the effective stress within soil reaches zero as a result of an increase in pore water pressure during earthquake vibration (Youd, 1992). Soil liquefaction can cause major damage to buildings, roads, bridges, dams and lifeline systems.

The Genetic Algorithm approach has been used for assessing the liquefaction potential of sandy soils (G. Sen et al., Nat. Hazards Earth Syst. Sci., 2010).

Ant Colony Optimization: The Ant Colony Optimization (ACO) algorithm is a probabilistic technique for solving computational problems which can be reduced to finding good paths through graphs. The algorithm is a member of the ant colony algorithms family of swarm intelligence methods, and it constitutes a metaheuristic optimization. Initially proposed by Marco Dorigo in 1992 in his Ph.D. thesis [13] [17], the first algorithm aimed to search for an optimal path in a graph, based on the behavior of ants seeking a path between their colony and a source of food. The original idea has since been diversified to solve a wider class of numerical problems and, as a result, several variants have emerged, drawing on various aspects of the behavior of ants.

Ant Colony Optimization has been applied to estimating unsaturated soil hydraulic parameters (K. C. Abbaspour et al., Elsevier, 2001).

Particle Swarm Optimization: Particle Swarm Optimization (PSO) is a method for performing numerical optimization without explicit knowledge of the gradient of the problem to be optimized. PSO is originally attributed to Kennedy, Eberhart and Shi [28] [54] and was first intended for simulating social behavior; the algorithm was then simplified, and it was observed to perform optimization. The book by Kennedy and Eberhart [27] describes many philosophical aspects of PSO and swarm intelligence, and an extensive survey of PSO applications has been made by Poli [48] [49].

Particle Swarm Optimization has been used for the analysis of soil erosion characteristics (Li Yunkai et al., Springer, Sep. 2009).

Simulated Annealing: Simulated Annealing (SA) is a generic probabilistic metaheuristic for the global optimization problem of applied mathematics, namely locating a good approximation to the global optimum of a given function in a large search space. It is often used when the search space is discrete (e.g., all tours that visit a given set of cities). For certain problems, simulated annealing may be more effective than exhaustive enumeration, provided that the goal is merely to find an acceptably good solution in a fixed amount of time rather than the best possible solution. The method was independently described by Scott Kirkpatrick, C. Daniel Gelatt and Mario P. Vecchi in 1983 [30] and by Vlado Cerny in 1985 [9]. It is an adaptation of the Metropolis-Hastings algorithm, a Monte Carlo method for generating sample states of a thermodynamic system, invented by N. Metropolis et al. in 1953 [36].

Simulated Annealing has been used for analyzing soil properties (R. M. Lark et al., ScienceDirect, March 2003).

III. RESULTS AND DISCUSSION

The purpose of the study is to examine the most effective techniques for extracting new knowledge and information from the existing soil profile data contained within the ISRIC-WISE soil data set [44]. Several data mining techniques are used in agriculture and allied areas; a few of them are discussed here. The K means method has been used to forecast pollution in the atmosphere (Jorquera et al., 2001). Different possible changes of weather have been analyzed using SVM (Tripathi et al., 2006). The K means approach has been used for classifying soil in combination with GPS readings (Verheyen et al., 2001). The wine fermentation process has been monitored using data mining techniques: taste sensors are used to obtain data from the fermentation process, which are then classified using ANNs (Riul et al., 2004).

A brief survey of the related work in the area of soil mining shows that the data involved here are high-dimensional, and
dimensionality reduction has been addressed by classical methods such as Principal Component Analysis (PCA) [24]. There is a growing literature demonstrating the predictive capacity of the soil landscape paradigm using digital data and empirical numerical modeling techniques, as specified by Christopher et al. [11]. In PCA, the eigendecomposition of the empirical covariance matrix is performed and the data points are linearly projected; when eigenvectors associated with small eigenvalues are removed even though they carry information relevant for classification, classification accuracy can degrade. Examples of spatial prediction have been provided, across a range of physiographic environments and spatial extents, for a number of soil properties by Gessler et al. [21].

Tenenbaum et al. [59] introduced the concept of Isomap, a global dimensionality reduction algorithm. The CCDR (classification constrained dimensionality reduction) algorithm [15] was demonstrated only for two classes, and its performance was analyzed on simulated data. Bui et al. [8] demonstrated the potential for discovering the knowledge embedded in soil survey landscape models using rule induction techniques based on decision trees; such a model has the ability to mimic a soil map using samples taken from it, and by implication it also captures the embedded knowledge. Related to agriculture, many countries are still facing a multitude of problems in maximizing productivity [26]. A further development of CCDR plots the classification error probability and its confidence interval using the K nearest neighbour classifier [14]; normally the error probability decreases as the dimension increases, and the optimal value is reached when the dimension varies between 12 and 14, which has been proved using an entropic graph algorithm. Food production, however, has improved significantly during the last two decades thanks to good seeds, fertilizers, pesticides and modern farming equipment [57], and the agriculture sector has seen a tremendous improvement.

IV. CONCLUSIONS

In this research survey, data mining and pattern recognition techniques for soil data mining have been studied. The survey aims to set out the techniques being used in agricultural soil science and its allied areas.

The recommendations arising from this research survey are as follows. A comparison of different data mining techniques could produce an efficient algorithm for soil classification with multiple classes. The benefits of a greater understanding of soils could improve productivity in farming, maintain biodiversity, reduce reliance on fertilizers, and create a better integrated soil management system for both the private and public sectors.

ACKNOWLEDGMENT

The authors would like to thank the editor and the anonymous reviewers for their valuable comments and suggestions.

REFERENCES
[1] Fuzzy Logic, Stanford Encyclopedia of Philosophy, Stanford University, 2006-07-23. Retrieved 2008-09-29.
[2] Agrawal R., Imielinski T., and Swami A., "Mining association rules between sets of items in large databases," in Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington D.C., pp. 207-216, May 1993.
[3] Alahakoon D., Halgamuge S.K., and Srinivasan B., "Dynamic self-organizing maps with controlled growth for knowledge discovery," IEEE Transactions on Neural Networks, vol. 11, pp. 601-614, 2000.
[4] Anish C. Turlapaty, Valentine Anantharaj, and Nicolas H. Younan, "Spatio-temporal consistency analysis of AMSR-E soil moisture data using wavelet-based feature extraction and one-class SVM," in Proceedings of the Annual Conference, Baltimore, Maryland, March 9-13, 2009.
[5] Au W.H. and Chan K.C.C., "An effective algorithm for discovering fuzzy rules in relational databases," in Proceedings of IEEE International Conference on Fuzzy Systems FUZZ IEEE 98, Alaska, pp. 1314-1319, May 1998.
[6] Banerjee M., Mitra S., and Pal S.K., "Rough fuzzy MLP: Knowledge encoding and classification," IEEE Transactions on Neural Networks, vol. 9, pp. 1203-1216, 1998.
[7] Bosc P., Pivert O., and Ughetto L., "Database mining for the discovery of extended functional dependencies," in Proceedings of NAFIPS 99, New York, USA, pp. 580-584, June 1999.
[8] Bui E.N., Loughhead A., and Comer R., "Extracting soil landscape rules from previous soil surveys," Australian Journal of Soil Science, 37:495-508, 1999.
[9] Cerny V., "A thermodynamical approach to the traveling salesman problem: an efficient simulation algorithm," Journal of Optimization Theory and Applications, 45:41-51, 1985.
[10] Chiang D.A., Chow L.R., and Wang Y.F., "Mining time series data by a fuzzy linguistic summary system," Fuzzy Sets and Systems, vol. 112, pp. 419-432, 2000.
[11] Christopher J. Moran and Elisabeth N. Bui, "Spatial data mining for enhanced soil map modelling," International Journal of Geographical Information Science, 2002.
[12] Ciesielski V. and Palstra G., "Using a hybrid neural/expert system for database mining in market survey data," in Proc. Second International Conference on Knowledge Discovery and Data Mining (KDD-96), Portland, OR, p. 38, AAAI Press, Aug. 2-4, 1996.
[13] Colorni A., Dorigo M., and Maniezzo V., "Distributed optimization by ant colonies," in Proceedings of the First European Conference on Artificial Life, Paris, France, Elsevier Publishing, pp. 134-142, 1991.
[14] Costa A. and Hero A.O., "Geodesic entropic graphs for dimension and entropy estimation in manifold learning," IEEE Transactions on Signal Processing, vol. 52, pp. 2210-2221, 2004.
[15] Costa J.A. and Hero A.O., III, "Classification constrained dimensionality reduction," in IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. 1077-1080, March 2005.
[16] Cunningham S.J. and Holmes G., "Developing innovative applications in agriculture using data mining," in Proceedings of the Southeast Asia Regional Computer Confederation Conference, 1999.
[17] Dorigo M., Optimization, Learning and Natural Algorithms, PhD thesis, Politecnico di Milano, Italy, 1992.
[18] Fayyad U., Piatetsky-Shapiro G., and Smyth P., Data Mining to Knowledge Discovery in Databases, AAAI Press / The MIT Press, ISBN 0262560976, 1996.
[19] Fayyad U.M., Piatetsky-Shapiro G., Smyth P., and Uthurusamy R., eds., Advances in Knowledge Discovery and Data Mining, Menlo Park, CA: AAAI/MIT Press, 1996.
[20] George R. and Srikanth R., "Data summarization using genetic algorithms and fuzzy logic," in Genetic Algorithms and Soft Computing (F. Herrera and J. L. Verdegay, eds.), pp. 599-611, Heidelberg: Springer-Verlag, 1996.
[21] Gessler P.E., Moore D., McKenzie N.J., and Ryan P., "Soil landscape modelling and spatial prediction of soil attributes," International Journal of Geographical Information Systems, vol. 9, pp. 421-432, 1995.
[22] Hale J. and Shenoi S., "Analyzing FD inference in relational databases," Data and Knowledge Engineering, vol. 18, pp. 167-183, 1996.
[23] Hu X. and Cercone N., "Mining knowledge rules from databases: A rough set approach," in Proceedings of the 12th International Conference on Data Engineering, Washington, pp. 96-105, IEEE Computer Society, Feb. 1996.
[24] Jain A.K. and Dubes R.C., Algorithms for Clustering Data, Prentice Hall, 1988.
[25] Kacprzyk J. and Zadrozny S., "Data mining via linguistic summaries of data: an interactive approach," in Proceedings of IIZUKA 98, Fukuoka, Japan, pp. 668-671, October 1998.
[26] Katyal J.C., Paroda R.S., Reddy M.N., Anupam Varma, and N. Hanumanta Rao, "Agricultural scientists' perception on Indian agriculture: scene, scenario and vision," National Academy of Agricultural Science, 2000.
[27] Kennedy J. and Eberhart R.C., Swarm Intelligence, Morgan Kaufmann, ISBN 1-55860-595-9, 2001.
[28] Kennedy J. and Eberhart R., "Particle swarm optimization," in Proceedings of IEEE International Conference on Neural Networks, vol. IV, pp. 1942-1948, 1995.
[29] Kiem H. and Phuc D., "Using rough genetic and Kohonen's neural network for conceptual cluster discovery in data mining," in Proceedings of RSFDGrC'99, Yamaguchi, Japan, pp. 448-452, November 1999.
[30] Kirkpatrick S., Gelatt C.D., and Vecchi M.P., "Optimization by simulated annealing," Science, New Series 220(4598):671-680, doi:10.1126/science.220.4598.671, ISSN 0036-8075, 1983.
[31] Kohonen T., Kaski S., Lagus K., Salojarvi J., Honkela J., Paatero V., and Saarela A., "Self organization of a massive document collection," IEEE Transactions on Neural Networks, vol. 11, pp. 574-585, 2000.
[32] Lee D.H. and Kim M.H., "Database summarization using fuzzy ISA hierarchies," IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics, vol. 27, pp. 68-78, 1997.
[33] Lee R.S.T. and Liu J.N.K., "Tropical cyclone identification and tracking system using integrated neural oscillatory elastic graph matching and hybrid RBF network track mining techniques," IEEE Transactions on Neural Networks, vol. 11, pp. 680-689, 2000.
[34] Lopes C., Pacheco M., Vellasco M., and Passos E., "Rule evolver: An evolutionary approach for data mining," in Proceedings of RSFDGrC'99, Yamaguchi, Japan, pp. 458-462, November 1999.
[35] Lu H.J., Setiono R., and Liu H., "Effective data mining using neural networks," IEEE Transactions on Knowledge and Data Engineering, vol. 8, pp. 957-961, 1996.
[36] Metropolis N., Rosenbluth A.W., Rosenbluth M.N., Teller A.H., and Teller E., "Equations of state calculations by fast computing machines," Journal of Chemical Physics, 21(6):1087-1092, 1953.
[37] Mitchell T.M., "Machine learning and data mining," Communications of the ACM, vol. 42, no. 11, 1999.
[38] Mitra S. and Pal S.K., "Fuzzy self organization, inferencing and rule generation," IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, vol. 26, pp. 608-620, 1996.
[39] Mitra S., Mitra P., and Pal S.K., "Evolutionary modular design of rough knowledge-based network using fuzzy attributes," Neurocomputing, vol. 36, pp. 45-66, 2001.
[40] Mitra S. and Hayashi Y., "Neuro-fuzzy rule generation: Survey in soft computing framework," IEEE Transactions on Neural Networks, vol. 11, pp. 748-768, 2000.
[41] Mitra S. and Pal S.K., "Fuzzy multi-layer perceptron, inferencing and rule generation," IEEE Transactions on Neural Networks, vol. 6, pp. 51-63, 1995.
[42] Mitra S., De R.K., and Pal S.K., "Knowledge-based fuzzy MLP for classification and rule generation," IEEE Transactions on Neural Networks, vol. 8, pp. 1338-1350, 1997.
[43] Mollestad T. and Skowron A., "A rough set framework for data mining of propositional default rules," Lecture Notes in Computer Science, vol. 1079, pp. 448-457, 1996.
[44] Niels H. Batjes, ISRIC-WISE Harmonized Global Soil Profile Dataset (Ver. 3.1), Report 2008/2.
[45] Noda E., Freitas A.A., and Lopes H.S., "Discovering interesting prediction rules with a genetic algorithm," in Proceedings of IEEE Congress on Evolutionary Computation CEC 99, Washington DC, pp. 1322-1329, July 1999.
[46] Novak V., Perfilieva I., and Mockor J., Mathematical Principles of Fuzzy Logic, Dordrecht: Kluwer Academic, ISBN 0-7923-8595-0, 1999.
[47] Pedrycz W., "Conditional fuzzy c-means," Pattern Recognition Letters, vol. 17, pp. 625-632, 1996.
[48] Poli R., "An analysis of publications on particle swarm optimization applications," Technical Report CSM-469, Department of Computer Science, University of Essex, UK, 2007.
[49] Poli R., "Analysis of the publications on the applications of particle swarm optimization," Journal of Artificial Evolution and Applications, pp. 1-10, doi:10.1155/2008/685175, 2008.
[50] Russell S. and Lodwick W., "Fuzzy clustering in data mining for telco database marketing campaigns," in Proceedings of NAFIPS 99, New York, pp. 720-726, June 1999.
[51] Sarmadian F., Taghizadeh Mehrjardi R., and Akbarzadeh A., "Optimization of pedotransfer functions using an artificial neural network," Australian Journal of Basic and Applied Sciences, 3(1):323-329, ISSN 1991-8178, 2009.
[52] Shalvi D. and De Claris N., "Unsupervised neural network approach to medical data mining techniques," in Proceedings of IEEE International Joint Conference on Neural Networks, Alaska, pp. 171-176, May 1998.
[53] Shan N. and Ziarko W., "Data-based acquisition and incremental modification of classification rules," Computational Intelligence, vol. 11, pp. 357-370, 1995.
[54] Shi Y. and Eberhart R.C., "A modified particle swarm optimizer," in Proceedings of IEEE International Conference on Evolutionary Computation, pp. 69-73, 1998.
[55] Skowron A., "Extracting laws from decision tables - a rough set approach," Computational Intelligence, vol. 11, pp. 371-388, 1995.
[56] Souheil Ezzedine, Yoram Rubin, and Jinsong Chen, "Bayesian method for hydrogeological site characterization using borehole and geophysical survey data: Theory and application to the Lawrence Livermore National Laboratory Superfund site," Water Resources Research, vol. 35, no. 9, pp. 2671-2683, September 1999.
[57] Subba Rao, "Indian agriculture: past laurels and future challenges," in Indian Agriculture: Current Status, Prospects and Challenges, Convention of Indian Agricultural Universities Association, 27:58-77, December 2002.
[58] Sudarshan Reddy S., Vedantha S., Venkateshwar Rao B., Sundar Ram Reddy, and Venkat Reddy, "Gathering agrarian crisis: farmers' suicides in Warangal district," Citizens' Report, 1998.
[59] Tenenbaum J.B., De Silva V., and Langford J.C., "A global geometric framework for nonlinear dimensionality reduction," Science, 290(5500):2319-2323, 2000.
[60] Tickle A.B., Andrews R., Golea M., and Diederich J., "The truth will come to light: Directions and challenges in extracting the knowledge embedded within trained artificial neural networks," IEEE Transactions on Neural Networks, vol. 9, pp. 1057-1068, 1998.
[61] Turksen I.B., "Fuzzy data mining and expert system development," in Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, San Diego, CA, pp. 2057-2061, October 1998.
[62] Vesanto J. and Alhoniemi E., "Clustering of the self organizing map," IEEE Transactions on Neural Networks, vol. 11, pp. 586-600, 2000.
[63] Wei Q. and Chen G., "Mining generalized association rules with fuzzy taxonomic structures," in Proceedings of NAFIPS 99, New York, pp. 477-481, June 1999.
[64] Xu K., Wang Z., and Leung K.S., "Using a new type of nonlinear integral for multi-regression: an application of evolutionary algorithms in data mining," in Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, San Diego, CA, pp. 2326-2331, October 1998.
[65] Yager R.R., "On linguistic summaries of data," in Knowledge Discovery in Databases (W. Frawley and G. Piatetsky-Shapiro, eds.), pp. 347-363, Menlo Park, CA: AAAI/MIT Press, 1991.
[66] Zadeh L.A., "Fuzzy sets," Information and Control, 8(3):338-353, 1965.
[67] Zhang Y.Q., Fraser M.D., Gagliano R.A., and Kandel A., "Granular neural networks for numerical-linguistic data fusion and knowledge discovery," IEEE Transactions on Neural Networks, vol. 11, pp. 658-667, 2000.
AUTHORS BIOGRAPHY

2, 3, 4 Department of Computer Engineering, M. M. University, Mullana (Ambala) 133207, India
Abstract
This paper presents a model for reliable packet delivery in Wireless Sensor Networks based on a discrete parameter Markov chain with an absorbing state. We demonstrate the comparison between cooperative and non-cooperative automatic repeat request (ARQ) techniques, with suitable examples, in terms of reliability and delay in packet transmission.
Keywords: Reliability, Absorbing State, Wireless Sensor Network, Markov Chain.

1. Introduction

Wireless sensor networks (WSNs) [1] [2] are the topic of intense academic and industrial study. Research is mainly focused on energy saving schemes to increase the lifetime of these networks [4] [5]. There is an exciting new wave in sensor applications, wireless sensor networking, which enables sensors and actuators to be deployed independently of the costs and physical constraints of wiring. For a wireless sensor network to deliver real-world benefits, it must support the following requirements in deployment: scalability, reliability, responsiveness, power efficiency and mobility.

The complex inter-relationships between these characteristics are a balance; if they are not managed properly, the network can suffer from overhead that negates its applicability. In order to ensure that the network supports the application's requirements, it is important to understand how each of these characteristics affects reliability.

1.1. Scalability and Reliability

Network reliability and scalability are closely coupled, and typically they act against each other; in other words, it is very difficult to build a reliable ad hoc network as the number of nodes increases [7]. This is due to the network overhead that comes with the increased size of the network. In an ad hoc network there is no predefined topology or shape; therefore, any node wishing to communicate with other nodes must generate more control packets than data packets. Moreover, as network size increases, there is more risk that communication links get broken, which ends up creating more control packets. In summary, more overhead is unavoidable in a larger scale wireless sensor network to keep the communication path intact.

1.2. Reliability and Power Efficiency

Power efficiency also plays a very important role in this complex equation. To design a low power wireless sensor network, the duty cycle of each node needs to be reduced. The drawback is that as a node stays longer in sleep mode [3] to save power, there is less probability that it can communicate with its neighbors, which may also lower reliability due to the lack of exchange of control packets and delays in packet delivery.

1.3. Reliability and Responsiveness

The ability of the network to adapt quickly to changes in the topology is known as responsiveness. For better responsiveness, more control packets must be issued and exchanged in the ad hoc network, which naturally results in less reliability.
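The power/reliability tension in Section 1.2 can be made concrete with a toy model. This model is our illustration, not taken from the paper: assume two neighboring nodes wake independently at random for a fraction d of the time (the duty cycle), so both are awake in a given slot with probability d*d, and the wait until they can exchange control packets follows a geometric law:

```python
def rendezvous_probability(d: float) -> float:
    """Probability both neighbors are awake in a slot, assuming independent
    wake schedules with duty cycle d (a simplifying assumption)."""
    return d * d

def expected_slots_until_rendezvous(d: float) -> float:
    """Mean number of slots until both nodes are awake at once (geometric)."""
    return 1.0 / (d * d)

# Cutting the duty cycle saves power but lengthens the wait quadratically:
for d in (0.5, 0.1, 0.01):
    print(d, expected_slots_until_rendezvous(d))
```

Under these assumptions, halving the duty cycle quadruples the expected wait before neighbors can exchange control packets, which is the qualitative trade-off the section describes.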
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 430
1.4. Mobility and reliability

…immediate and guaranteed action; for example medical emergency alarms, fire alarm detection and intrusion detection [6]. In these situations packets have to be transported in a reliable and timely way through the sensor network. Thus, besides the energy consumption, delay and data reliability become very relevant for the proper functioning of the network.
Direct communication between any node and the sink could be subject to only a small delay if the distance between the source and the destination is short, but it suffers significant energy waste when the distance increases. Therefore multihop short-range communications through other sensor nodes, acting as intermediate relays, are often preferred in order to reduce the energy consumption in the network. In such a scenario it is necessary to define an efficient technique that can ensure reliable communication under a very tight delay constraint. In this work we focus attention on the control of delay and reliability in the multihop scenario.
A simple implementation of ARQ is represented by the Stop and Wait technique, which consists in waiting for the acknowledgement of each transmitted packet before transmitting the next one, and retransmitting the same packet in case it is lost or wrongly received by the destination [8]. We extend this analysis by investigating the delay required by the reliable data delivery task. To this aim we investigate the delay required by a cooperative ARQ mechanism to correctly deliver a packet through a multihop linear path from a source node to the sink. In particular we analyze the delay against the coverage range of the nodes in the path, and therefore the relation between the delay and the number of cooperative relays included in the forwarding process.

2. System Model

Fig. 1 shows the network structure with a linear multihop path consisting of a source node (node n = 1), a destination (node n = N) and (N-2)*t intermediate relay nodes deployed at equal distance, where t is the number of parallel paths of intermediate relay nodes between source and destination. Each path is composed of Z = N - 1 links. Suppose that all the nodes have circular radio coverage with the same transmission range Rt. When a sensor transmits a packet, it is received by all the sensors in a listen state inside the coverage area of the sender.

Fig. 1 Network structure with linear multi-hop path

When a packet is transmitted, it can be forwarded towards the destination only by those nodes which are closer to the destination than the transmitter.

2.1 Discrete Parameter Markov Chain with Absorbing State

Packet transfer from source to destination via intermediate forwarders can be treated as a state diagram of a discrete parameter Markov chain with an absorbing state. An absorbing state is a state from which there is zero probability of exiting. An absorbing Markov system is a Markov system that contains at least one absorbing state, and is such that it is possible to get from each non-absorbing state to some absorbing state in one or more time steps. Let p be the probability of successful transmission of a packet to an intermediate relay node inside the coverage range; therefore 1-p will be the probability of unsuccessful transmission of the packet.
For each node n, the probability of correctly delivering a packet to the node that is Rt away is equal to p. The probability that the packet is not correctly received by this node is (1-p), while it is correctly received by the immediately previous node with probability p; so with probability (1-p)p the packet will be forwarded by the previous node. If this node too has not correctly received the packet sent by node n, an event that occurs with probability (1-p)^2, then with probability (1-p)^2 p the packet will be forwarded by the node before the previous one. If none of the nodes in the coverage area of the transmitter receives a correct packet, it is necessary to ask for retransmission of the packet by the source node. It is thus possible to describe the forwarding of one data packet from the source node n = 1 to the destination n = N with a discrete time Markov chain with an absorbing state. A packet transmitted by a node will be further forwarded by the node in the coverage range of the transmitter which is furthest from the source and has correctly received the packet.

Fig. 2 Packet transmission in cooperative ARQ as a discrete parameter Markov chain with absorbing state
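A minimal Monte Carlo sketch of this forwarding model (the node count, coverage range and probabilities below are illustrative; the paper's own treatment uses the absorbing-chain formulation analytically):

```python
import random

def delivery_slots(n_nodes, coverage, p, rng):
    """Slots needed to move one packet from node 0 to node n_nodes-1.

    Per the model above: in each slot the current holder transmits; the
    furthest node ahead (up to `coverage` hops) that correctly receives
    the packet becomes the next forwarder, so the k-th nearer candidate
    forwards with probability (1-p)**k * p. If no candidate receives
    the packet, the same node retransmits in the next slot.
    """
    pos, slots = 0, 0
    while pos < n_nodes - 1:
        slots += 1
        reach = min(coverage, n_nodes - 1 - pos)
        for dist in range(reach, 0, -1):  # furthest candidate first
            if rng.random() < p:
                pos += dist
                break
    return slots

def mean_delivery_slots(n_nodes, coverage, p, trials=2000, seed=1):
    rng = random.Random(seed)
    return sum(delivery_slots(n_nodes, coverage, p, rng)
               for _ in range(trials)) / trials
```

Raising p or widening the coverage range lowers the expected delay, matching the qualitative relation between delay and the number of cooperative relays studied here.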
Department of Computer Science, Karachi University
Karachi, Pakistan
…cellular networks [6]. Computer networks were originally developed with data transmission in mind, but the needs of Internet users today are diverse; the need is no longer only for transmitting data traffic over the Internet, but also for making VoIP calls, playing online games and watching streaming media. Indeed, Voice over the Internet Protocol (VoIP) is growing rapidly and is expected to keep doing so for the near future. A new and powerful development for data communications is the emergence of wireless local area networks (WLANs) in the embodiment of the 802.11 a, b and g standards [7, 8], collectively referred to as Wi-Fi [8]. Because of the proliferation and expected expansion of Wi-Fi networks, considerable attention is now being turned to voice over Wi-Fi, with some companies already offering proprietary networks, handsets, and solutions. However, deployment of VoIP over WiFi poses some serious problems and concerns. This is the main reason why the shift is now towards WiMax.
In this paper we take up a comparative study based on measurement analysis of simulated packet traces. The results are compared to see which option is more viable: VoIP over WiFi or VoIP over WiMax.

2.1 VoIP Issues on IEEE 802.11

Wireless Local Area Networks (WLANs) are increasingly making their way into residential, commercial, industrial and public areas. As VoIP applications flourish [2], voice will be a significant driver for the widespread adoption and integration of WLANs. As such, the voice capacity of a WLAN, defined as the maximum number of voice connections that can be supported with satisfactory quality, has been investigated in the literature [9, 10]. The capacity for G.711 VoIP using a constant bit rate (CBR) model and a 10 ms packetization interval is 6 calls. The two main problems encountered when VoIP is used over WiFi are:
- The system capacity for voice can be quite low for a WLAN.
- VoIP traffic and traditional data traffic, such as Web traffic and email, can mingle with each other, thereby bringing down VoIP performance.
These problems exist mainly due to the following reasons:
a) There is a large per-packet overhead imposed by WiFi on each VoIP packet, from both protocol headers and WiFi contention.
b) The 802.11 protocols are designed to let clients access the channel in a distributed manner, which causes contention for the network; this is particularly evident in the case of VoIP due to the real-time nature of the traffic.
Hence, in the case of VoIP over WLAN, the perceived throughput and the real throughput differ widely. Even though VoIP over WiFi does seem an attractive alternative to cellular wireless telephony, it has several drawbacks, as we shall investigate further in section 4 of this paper.

2.2 VoIP on IEEE 802.16

IEEE 802.16 [11] is the de facto standard for broadband wireless communication. It is considered the missing link for the last-mile connection in Wireless Metropolitan Area Networks (WMANs), and it represents a serious alternative to wired access such as DSL and cable modems. Besides Quality of Service (QoS) support, the IEEE 802.16 standard currently offers a nominal data rate of up to 100 Megabits per second (Mbps) and a coverage area of around 50 kilometers. Thus, the deployment of multimedia services such as Voice over IP (VoIP), Video on Demand (VoD) and video conferencing is now possible, which will open new markets and business opportunities for vendors and service providers. Concerning QoS support, the 802.16 standard proposes to classify, at the MAC layer, applications according to their QoS requirements (real-time applications with stringent delay requirements, best-effort applications with a minimum guaranteed bandwidth) as well as their packet arrival pattern (fixed or variable data packets at periodic or aperiodic intervals). For this aim, the initial standard proposes four classes of traffic, and the 802.16e [11] amendment adds another class:
- Unsolicited grant service (UGS): supports Constant Bit Rate (CBR) services, such as T1/E1 emulation and VoIP without silence suppression.
- Real-time polling service (rtPS): supports real-time services with variable-size data on a periodic basis, such as MPEG video and VoIP with silence suppression.
- Extended rtPS: recently introduced by the 802.16e standard, it combines UGS and rtPS. That is, it guarantees periodic unsolicited grants, but the grant size can be changed by request. It was specially introduced to support VoIP traffic [11].
- Non-real-time polling service (nrtPS): supports non-real-time services that require variable-size data bursts on a regular basis, such as the File Transport Protocol (FTP) service.
- Best effort (BE): for applications that do not require QoS, such as the Hyper Text Transfer Protocol (HTTP).

Due to the above-mentioned QoS implementations on IEEE 802.16, VoIP performs better on WiMax, as we shall see in the next section.

3. Experimental Setup

To investigate the performance of VoIP with TCP on IEEE 802.11 and IEEE 802.16, simulations were undertaken using TCP flows along with CBR flows (defined on top of UDP flows). UDP was used for the VoIP data flow, and the UDP packet properties were those of the G.711 codec [13].
Figure 1 shows the simulation setup in ns2. In this network both VoIP and TCP/IP data traffic are used to test the network performance for VoIP.
The setup is composed of two wired nodes, three mobile nodes and a base station serving as the access point for the WiFi network in the case of Experiment 1 and for the WiMax network in the case of Experiment 2. In both experiments the deployment of the network was kept the same, but the TCP and VoIP flows were varied each time. Also the number of flows was varied. The simulation part was done with ns2, whereas for analysis purposes the Linux utilities xgraph and gnuplot were used.
Hence in the ns2 simulation the VoIP packets have been modeled through CBR over UDP with a packet size of 80 bytes and an inter-packet interval of 20 milliseconds, which is the typical specification for the G.711 codec [14].
In the case of the 802.11 scenario two TCP flows are set up: one from node N0 to wired node W0 (run from 5 seconds to the end of the simulation) and the other from wired node W1 to node N2 (run from 15 seconds to the end of the simulation). The VoIP packets are sent from node N0 to wired node W0 and from N2 to wired node W1. There are 16 VoIP flows instantiated simultaneously between N0 and W0, with a start time of 40 seconds; two of them are stopped at 100 seconds and the remaining two at 120 seconds. Between N2 and W1 there are 4 simultaneous VoIP sessions with start times of 100 seconds; their ending times are 120 seconds for the first two, 140 seconds for the third and 150 seconds for the last one.
In the case of the 802.16 scenario, the same example as the one provided by the NS2 Simulator for IEEE 802.16 networks [15] has been used, and its topology is shown in Figure 1. Here three TCP flows are set up, from nodes N0, N2 and N3 to wired node W1. Their start times are 0.1, 0.2 and 0.3 seconds respectively, and they stop when the simulation ends. The VoIP packets are sent from node N0 to wired node W1. There are 8 VoIP flows instantiated simultaneously between N0 and W1 with a start time of 40 seconds, out of which two are stopped at 60 seconds and the remaining are allowed to run till the end of the simulation.

4. Experimental Results

This section presents the results for the two experiments. We plotted graphs for throughput, jitter and packet losses in both cases.
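The per-packet overhead problem raised in section 2.1 can be quantified from the G.711 packetization figures used above (an 80-byte payload every 20 ms). The header sizes are the usual RTP/UDP/IPv4 values; the function name is ours, not part of ns2:

```python
def voip_stream_bps(payload_bytes, interval_ms, header_bytes=0):
    """Bit rate of one constant-bit-rate VoIP stream."""
    return (payload_bytes + header_bytes) * 8 * 1000 // interval_ms

RTP, UDP, IPV4 = 12, 8, 20  # typical per-packet header sizes in bytes

codec_rate = voip_stream_bps(80, 20)                    # payload only
wire_rate = voip_stream_bps(80, 20, RTP + UDP + IPV4)   # with headers
```

The 40 bytes of headers inflate each 80-byte voice packet by 50%, before 802.11 MAC framing and channel contention are even counted, which is one reason perceived and real throughput diverge so much on WLAN.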
5. Conclusions

All our findings complement the characteristics of both networks and help in further establishing that WiMax is better suited to VoIP than WiFi.
References
[1] D. P. Hole and F. A. Tobagi, "Capacity of an IEEE 802.11b wireless LAN supporting VoIP", in Proc. IEEE ICC, Vol. 1, pp. 196-201, June 2004.
[2] Skype, http://www.skype.com
Network Simulator 2, http://www.isi.edu/nsnam/ns/
[3] B. Teitelbaum, "Leading-edge voice communications for the MITC", Sept. 12, 2003, http://people.internet2.edu/~ben/
[4] G. Forman, "An extensive empirical study of feature selection metrics for text classification", J. Mach. Learn. Res., Vol. 3, pp. 1289-1305, March 2003.
[5] J. C. Bellamy, Digital Telephony, John Wiley & Sons, 2000.
[6] T. S. Rappaport, Wireless Communications: Principles and Practice, Prentice Hall, second edition, 2002.
[7] ISO/IEC and IEEE Draft International Standards, "Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications", ISO/IEC 8802-11, IEEE P802.11/D10, Jan. 1999.
[8] http://wi-fiplanet.webopedia.com/TERM/w/Wi_Fi.html
[9] F. Anjum, M. Elaoud, D. Famolari, A. Ghosh, R. Vaidyanathan, A. Dutta, P. Agrawal, T. Kodama, and Y. Katsube, "Voice performance in WLAN networks - an experimental study", in Proc. IEEE GLOBECOM '03, Vol. 6, 2003.
[10] S. Garg and M. Kappes, "An experimental study of throughput for UDP and VoIP traffic in IEEE 802.11b networks", in Proc. IEEE WCNC 2003, Vol. 3, pp. 1748-1753, March 2003.
[11] IEEE Standard for Local and Metropolitan Area Networks, Part 16: Air Interface for Fixed Broadband Wireless Access Systems, IEEE Standard 802.16, October 2004.
[12] B. Goode, "Voice Over Internet Protocol (VoIP)", invited paper, Proceedings of the IEEE, Vol. 90, No. 9, September 2002.
[13] Voice over IP - Per Call Bandwidth Consumption, http://www.cisco.com/en/US/tech/tk652/tk698/technologies_tech_note09186a0080094ae2.shtml
[14] http://www.cisco.com/en/US/tech/tk652/tk698/technologies_tech_note09186a0080094ae2.shtml
[15] http://cnlab.kaist.ac.kr
Department of Computer Science, Karachi University
Karachi, Pakistan
Abstract

Over the years the World Wide Web has seen a major transformation, with dynamic content and interactivity being delivered through Web 2.0 and provision of meaning to Web content through the Semantic Web. Web 2.0 has given rise to special methods of eLearning; we believe that interactive multimedia and semantic technologies applied together can further enable effective reuse of such applications, thereby taking eLearning a step further. As proof of this idea we present IBook, an eLearning application that uses concepts from both the fields of Web 2.0 and the Semantic Web. It presents multimedia in a form that enhances the user's learning experience through the use of Web 2.0 and the Semantic Web.

Keywords: Web 2.0, Semantic Web, Multimedia, eLearning.

1. Introduction

With the proliferation of Web 2.0 services and applications there has been a major paradigm shift in the way we envision the World Wide Web [3, 4]. We have witnessed an evolution of the Web from the first generation to the third generation [1, 2], and at present we live somewhere between the age of second generation and third generation Web content. This age can be termed a transition stage between Web 2.0 [3, 4] and the envisioned Semantic Web [5], and in this transition phase there has been a realization of new concepts such as e-Science, e-Education, e-Learning, e-Commerce, e-Government etc.
The realization of these new technologies has given birth to new forms of multimedia on the World Wide Web, and this is in particular the case with eLearning [6], with many adaptive hypermedia learning applications being developed. This paper also presents one such application, IBook; what sets it apart from other similar works are its additional features of interactive multimedia content facilitating effective learning with semantic technologies, i.e. XML [7].
IBook is an application that takes an innovative approach to eLearning which lies in both domains: Web 2.0 and the Semantic Web. Linear text was challenged by the world of the Internet, which led to the creation of hypertext, but even that suffered some drawbacks, which led to the concept of hypermedia [9]. The students of today have done away with books and look to the Internet to support their learning. A widespread argument now exists among teachers, educators and psychologists that advanced comprehension is acquired through interacting with the content [8], and this is the fundamental motivation behind IBook. We feel that semantically connected data in multiple dimensions can bring a remarkable change in the learning curve and experience, and this is where IBook plays its role.
As is clear from the name, IBook is an interactive, multimedia-based book which provides the reader with additional forms of presentation for enhanced delivery of the book's contents. Moreover, the book not only follows its classical front view but also adds relevant video content as well as a voice-over feature to retain the reader's attention. Hence IBook is an advanced multimedia platform for eLearning. With IBook the educator can add flexibility and easy adaptation to new and changing user requirements through support for a reusable metadata structure.
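A reusable metadata structure of the kind just mentioned could be sketched as a small XML record per chapter. The element and attribute names below are purely illustrative — the paper does not publish IBook's actual schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical per-chapter metadata tying together the text, the
# voice-over audio, video and quiz resources for one IBook chapter.
chapter_xml = """
<chapter id="ch1" title="Introduction">
  <text src="ch1.html"/>
  <audio src="ch1.mp3" sync="ch1.timings"/>
  <video src="ch1_intro.mp4"/>
  <quiz src="ch1_quiz.xml"/>
</chapter>
"""

chapter = ET.fromstring(chapter_xml)
# Index every media resource attached to the chapter by its element tag.
media = {child.tag: child.get("src") for child in chapter}
```

Because the chapter record only references resources by URI, the same metadata can be re-rendered for different front-ends, which is the reuse argument made above.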
The remainder of this paper is organized as follows: Section II explains the necessary background with respect to the generations of Web content; Section III explains in detail the IBook features and functionalities with illustrations; Section IV presents the architecture and implementation details of the IBook framework with an overview of how semantic technologies are incorporated into it; and Section V concludes the paper with a discussion of possible future work.

2. Background

As mentioned in Section I, IBook is an application from the areas of Web 2.0 and the Semantic Web, and this section provides a brief overview of each of these areas.
Some researchers characterize the Web's evolution in terms of generations, with the first generation containing static HTML content [1, 2], which was and is still being replaced by dynamic, on-the-fly Web content, giving rise to the second generation of Web technologies and applications [4]. Second generation Web technology mainly focused on addressing the needs of humans; in contrast, third generation Web technology is more focused on making content that is machine-processable, with the help of formal ontologies for structuring the underlying data for the purpose of comprehensiveness and machine understanding. The Semantic Web is an extension of the current Web in which information is given well-defined meaning, enabling computers and people to work in co-operation.

2.3 Integration of Web 2.0 with the Semantic Web

Earlier, when O'Reilly Media and MediaLive hosted the first Web 2.0 conference in 2004 and the term Web 2.0 came into use, the inventor of the World Wide Web, Sir Tim Berners-Lee, discarded it as a buzzword or piece of jargon, but recently some researchers have presented a different viewpoint [11, 12, 13]. Researchers are now talking about a merger of the two ideas of Web 2.0 and the Semantic Web and are now upholding the belief that the two fields are complementary rather than competing, with goals in harmony and each bringing its own strength into the picture [11]. This is also the line of reasoning we follow in this paper: we advocate the integration of Web 2.0 technologies with Semantic Web ideas for effective methods of eLearning.
…navigating into the book for content and gives the user an extra level of interactivity which closely mimics the real-world book, as shown in the table of contents view in Figure 1. When the user clicks a particular chapter for viewing, he is presented with the view shown in Figure 2. Here the reader is not only able to read the chapter's contents but can also listen to them with the voice-over feature: as soon as the chapter opens, the text of the chapter is played, with the portion being played highlighted in yellow. The voice-over facility is what makes IBook particularly unique and sets it apart from other works in the eLearning domain: this is the first such work which gives the user an extra level of multimedia interactivity with voice-over capability, thereby drawing his attention towards the content of the book. The reader is also given the capability to stop or pause the audio at any point, thereby adding interactivity to the reading/listening process. Navigation features are also included within each chapter of IBook while the reader browses through the book. These quizzes can be user-defined, and how this is achieved is explained in detail in the next section.

Figure 2 IBook Chapter View
Head of MSc. Software Systems,
Karpagam University, Coimbatore, India
…hospitals, the use of mobile devices like PDAs and smart phones, the design of health care management systems, etc. [4]. The emerging RFID technology is rapidly becoming the standard for tracking inventory, identifying patients, and managing personnel in hospitals [7]. In hospitals patient safety is critically important; lives are at stake, and zero defects should be the established standard. At the same time, hospitals are pressured to reduce costs. Therefore, when developing strategic objectives, technologies that reduce operating expenses while providing increased patient safety must be thoroughly tested and evaluated. Radio frequency identification (RFID) is one technology that holds great promise.
In recent years, in almost every country in the world, substantial financial resources have been allocated to the health care sector. Technological development and modern medicine practices are amongst the outstanding factors triggering this shift. Achieving high operational efficiency in the health care sector is an essential goal for organizational performance evaluation. Efficiency used to be considered the primary indicator of hospital performance [1].
The goal of this paper is to show how RFID contributes to building an elegant hospital by optimizing business processes, reducing errors and improving patient safety. This section starts with a short introduction to RFID technology and defines some of its main concepts and standards. The second section describes some interesting hospital use cases that could benefit from RFID, and the third section outlines the cleaning methods and the health care system we developed. We also summarize the open problems that still have to be solved before RFID is fully adopted by the healthcare community.

…updating the paperwork at the bedside of the patient, it is not always accurate, because it is handwritten.

In thousands of hospitals across the world, blood transfusion is an everyday business, but one fraught with risks. This is because contaminated blood may be transfused to a healthy patient, or the patient may receive the wrong type of blood. Data from US hospitals show an alarming number of cases of medical negligence or mistakes, many of which are related to blood transfusion.

Many health professionals are concerned about the growing number of patients who are misidentified before, during or after medical treatment. Indeed, a patient identification error may lead to an improper dosage of medication, or to the wrong invasive procedure being performed. Other related patient identification errors could lead to inaccurate lab work and results reported for the wrong person, with effects such as misdiagnoses and serious medication errors [4].

2.2 Potential Benefits of RFID Technology

The RFID solution to the above-said problem is to embed a tag into the blood bag label itself. The paramedic who transfuses the blood can scan the bag before transfusing. He typically enters the patient ID number, and the patient also has a wristband RFID tag which identifies him uniquely. In case the wrong blood bag is scanned, the reader can throw up a warning and the patient is saved from wrong treatment [3].
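The wristband/blood-bag check just described can be sketched as follows. The record layout and field names are our assumptions for illustration; a real deployment would query the hospital's back-end database:

```python
def check_transfusion(wristband_tag, bag_tag, patients, bags):
    """Compare the patient identified by the scanned wristband RFID tag
    with the scanned blood-bag tag and return a warning on mismatch."""
    patient = patients[wristband_tag]
    bag = bags[bag_tag]
    if bag["assigned_to"] != wristband_tag:
        return "WARNING: bag is not assigned to this patient"
    if bag["blood_group"] != patient["blood_group"]:
        return "WARNING: blood group mismatch"
    return "OK: proceed with transfusion"

# Hypothetical records keyed by tag ID.
patients = {"W-1001": {"name": "A. Khan", "blood_group": "A+"}}
bags = {
    "B-9": {"assigned_to": "W-1001", "blood_group": "A+"},
    "B-7": {"assigned_to": "W-2002", "blood_group": "O-"},
}
```

The point of the RFID reader here is only to make both identifiers machine-readable at the bedside, so this comparison can happen before, rather than after, the transfusion starts.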
…recognized data cleaning approaches [8]. However, it does not perform well when the tag moves rapidly in and out of the reader's communication range, depending on the reading frequency and the velocity of tag movement. SMURF gives only an empirical value for its window parameter and does not tell how to calculate it [9]. To improve the algorithm's performance, the size of the sliding window is computed by adjusting this parameter. The simulation shows that the error rate is lower, though not completely removed.

4 Patient Management System

The important data (e.g., patient ID, name, age, location, drug allergies, blood group, drugs that the patient is on today) can be stored in the patients' back-end databases for processing. The databases containing patient data can also be linked through the Internet to other hospitals' databases [5]. The Patient Management System (PMS) administrator can issue an unused tag (wristband) to every patient at registration time. Healthcare professionals (e.g., doctors, consultants) can edit/update password-protected patients' medical records, for increased patient and data security, by clicking the Update Patient button. This PMS can be implemented in departments (e.g., medicine, surgery, obstetrics and gynecology, pediatrics) in both public and private hospitals for fast and accurate patient identification without human intervention. Using the HPMS, health care providers (e.g., hospitals) can achieve fast and accurate patient identification, improve patient safety by capturing basic data (such as the patient's unique ID, name, blood group, drug allergies, and the drugs that the patient is on today), prevent or reduce medical errors, increase efficiency and productivity, and achieve cost savings through wireless communication. The PMS also helps hospitals to build a better, more collaborative environment between different departments, such as the wards, medication, examination, and payment.

5 Conclusions and Future Work

Health care is an important sector that can obtain great benefits from the use of RFID technology. In this paper, we have analyzed the use of RFID in the health care sector and also described some interesting applications with promising perspectives. Although a number of great ideas and systems can be found in the literature, there are a number of issues that have not been analyzed yet [7]. We summarize some points that should be addressed in the near future:
1. When talking about pasting radio frequency tags on drug packages, there are concerns that exposure to electromagnetic energy could affect product quality.
2. RFID-based systems can fail for several reasons (e.g. RFID tags can be destroyed accidentally, or communications can be broken due to interference). There is a need for real-time fault-tolerant RFID systems able to deal with situations in which patients' lives could be in danger.
3. RFID components interact wirelessly; thus, attackers have plenty of opportunities to eavesdrop on communications and obtain private data of the patients [6]. These data can be used by the eavesdropper to blackmail patients, or by an insurance company to raise prices for its clients. Security and privacy in RFID technology is a very active research field whose challenge is to design scalable and cheap protocols to guarantee the privacy and security of RFID users.

References

[1] P. F. Drucker, The Essential Drucker: Selections from the Management Works of Peter F. Drucker, New York: HarperBusiness, 2001.
[2] Sudarshan S. Chawathe, Venkat Krishnamurthy, Sridhar Ramachandran, and Sanjay Sarma, "Managing RFID Data", in Proceedings of the 30th VLDB Conference, pp. 1189-1195, 2004.
[3] Belal Chowdhury and Rajiv Khosla, "RFID-based Hospital Real-time Patient Management System", in Proceedings of the 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007), pp. 363-368, 2007.
[4] J. Fisher and T. Monahan, "Tracking the social dimensions of RFID systems in hospitals", International Journal of Medical Informatics, Vol. 77, Issue 3, pp. 176-183, 2007.
[5] S. Shepard, RFID: Radio Frequency Identification, The McGraw-Hill Companies, Inc., USA, 2005.
[6] Agusti Solanas and Jordi Castellà-Roca, "RFID technology for the health care sector", CRISES Research Group, UNESCO Chair in Data Privacy, Dept. of Computer Science and Mathematics, Rovira i Virgili University, Tarragona, Catalonia, Spain.
[7] O. Shoewu and O. Badejo, "Radio Frequency Identification Technology: Development, Application, and Security Issues", The Pacific Journal of Science and Technology, Vol. 7, No. 2, November 2006.
[8] Ge Yu, "bspace: A Data Cleaning Approach for RFID Data Streams Based on Virtual Spatial…"
Electronics And Communication , Anna University , PSN College of Engineering & Technology,
Tirunelveli, Tamilnadu, India
Avionics, Anna University , PSN College of Engineering & Technology,
Tirunelveli, Tamilnadu, India
Abstract

This paper presents image compression using the 9/7 wavelet transform based on the lifting scheme. The design is simulated using the ISE simulator and implemented in an FPGA. The 9/7 wavelet transform performs well for the low frequency components. The FPGA implementation is chosen because of its partial reconfigurability. The project mainly aims at retrieving smooth images without any loss. This design may be used for both lossy and lossless compression.
Keywords: image compression, wavelet transform, implementation

…and the GIF format. The JPEG method is more often used for photographs, while the GIF method is commonly used for line art and other images in which geometric shapes are relatively simple.
Other techniques for image compression include the use of fractals and wavelets. These methods have not gained widespread acceptance for use on the Internet as of this writing. However, both methods offer promise because they achieve higher compression ratios than the JPEG or GIF methods for some types of images. Another new method that may in time replace the GIF format is the PNG format.
A field programmable gate array (FPGA) contains a some of the samples of the high pass component without
matrix of reconfigurable gate array logic circuitry that, noticing any significant changes in signal. Filters from the
when configured, is connected in a way that creates a filter bank are called "wavelets".
hardware implementation of a software application.
Increasingly sophisticated tools are enabling embedded The other perspective to the same theory is based on the
control system designers to more quickly create and more fact that some signals, such as audio or video signals often
easily adapt FPGA-based applications. Unlike processors, carry redundant information. For instance, looking at the
FPGAs use dedicated hardware for processing logic and digital picture reveals that neighboring pixels often differ
do not have an operating system. Because the processing very slightly. The idea is to find a mathematical relation
paths are parallel, different operations do not have to that connects neighboring data samples (pixels) and
compete for the same processing resources. That means reduces their number. Of course, inverse process is needed
speeds can be very fast, and multiple control loops can run to reconstruct the original.
on a single FPGA device at different rates. Also, the The wavelet transform (WT) has gained widespread
reconfigurability of FPGAs can provide designers with acceptance in signal processing and image compression.
almost limitless flexibility. In manufacturing and Because of their inherent multi-resolution nature, wavelet-
automation contexts, FPGAs are well-suited for use in coding schemes are especially suitable for applications
robotics and machine tool applications, as well as for fan, where scalability and tolerable degradation are important.
pump, compressor and conveyor control[2]. Recently the JPEG committee has released its new image
coding standard, JPEG-2000, which has been based upon
2. Proposed Methodology DWT. Wavelet transform decomposes a signal into a set
of basis functions. These basis functions are called
The smooth variations in images are called the low wavelets.
frequency components where the sharp variations are the Wavelets are obtained from a single prototype wavelet y(t)
high frequency components. The low frequency called mother wavelet by dilations and shifting:
components forms the base of an image where the high 1 t b
frequency components add upon them to refine the image. a ,b (t ) ( )
Hence the averages or the smooth variations demands a a
(1)
more importance than details[3]. Hence performing 9/7
wavelet transform for smooth images gives better results. where a is the scaling parameter and b is the shifting
Lifting scheme is a technique for constructing second parameter.
generation wavelet transform.
2.2 2-D for DWT
2.1 Discrete Wavelet Transform
The lifting scheme works in three steps:

a) Split step: the input is separated into even- and odd-indexed samples.
b) Predict step: this step predicts the odd elements from the even elements.
c) Update step: this replaces the even elements with an average.

Fig. 3 Waveletting
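The split/predict/update sequence above can be sketched with the simplest lifting pair, the Haar lifting step. The paper's design uses the 9/7 wavelet; Haar is an illustrative substitution chosen only to keep the sketch short.

```python
# Minimal sketch of a one-level lifting-scheme DWT (Haar lifting),
# illustrating the split / predict / update steps described above.
# The paper itself uses the 9/7 wavelet; Haar is used here for brevity.

def lifting_haar_forward(signal):
    """One lifting stage: split, predict, update."""
    # Split: separate even- and odd-indexed samples.
    even = signal[0::2]
    odd = signal[1::2]
    # Predict: predict each odd sample from its even neighbour;
    # the residual becomes the high-frequency (detail) coefficient.
    detail = [o - e for e, o in zip(even, odd)]
    # Update: replace each even sample with an average, giving the
    # low-frequency (approximation) coefficient.
    approx = [e + d / 2 for e, d in zip(even, detail)]
    return approx, detail

def lifting_haar_inverse(approx, detail):
    """Undo update, undo predict, merge: exact reconstruction."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend([e, o])
    return signal

x = [4, 6, 10, 12, 8, 6, 5, 5]
a, d = lifting_haar_forward(x)
assert lifting_haar_inverse(a, d) == x  # lossless by construction
```

The inverse simply runs the same two lifting steps with the signs flipped, which is why the lifting scheme reconstructs the input exactly, supporting the lossless mode the abstract mentions.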
2.2 2-D DWT

Once all the 512 rows are processed, the filters are applied in the Y direction. This completes the first stage of waveletting. While the conventional Mallat ordering scheme aggregates the coefficients into the 4 quadrants, our ordering scheme interleaves the coefficients in memory. The second stage of waveletting processes only the low frequency coefficients from the first stage; this corresponds to the upper left hand quadrant in the Mallat scheme. Thus, the second stage operates on rows and columns of length 256, while the third stage operates on rows and columns of length 128. The aggregation of coefficients along the 3 stages under Mallat ordering is shown in figure 4. The memory map with the interleaved ordering is shown in figure 5.

3. Block Diagram
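The staged processing just described (filter in the X direction, then the Y direction, and feed only the low-frequency quadrant to the next stage) can be sketched as follows; a Haar average/difference pair stands in for the paper's 9/7 filter bank as a simplifying assumption.

```python
import numpy as np

# Sketch of the three-stage 2-D decomposition described above: each stage
# filters the rows, then the columns, and only the low-frequency (LL)
# quadrant is passed to the next stage (Mallat-style pyramid). A Haar
# average/difference pair stands in for the actual 9/7 filter bank.

def haar_1d(rows):
    low = (rows[:, 0::2] + rows[:, 1::2]) / 2.0   # smooth variations
    high = (rows[:, 0::2] - rows[:, 1::2]) / 2.0  # sharp variations
    return np.hstack([low, high])

def dwt2_stage(img):
    return haar_1d(haar_1d(img).T).T  # X direction, then Y direction

image = np.random.rand(512, 512)
ll = image
for stage in range(3):
    out = dwt2_stage(ll)
    n = ll.shape[0] // 2
    print(f"stage {stage + 1}: LL quadrant is {n}x{n}")
    ll = out[:n, :n]   # upper-left quadrant feeds the next stage
```

Under Mallat ordering the LL quadrant is stored contiguously in the upper-left corner, which is exactly the slice `out[:n, :n]` above; the interleaved ordering the paper proposes changes only where those coefficients sit in memory, not their values.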
4. Results

Fig. 6 The original image

5. Conclusion
Real time signals are both time-limited (or space limited in
the case of images) and band-limited. Time-limited signals
can be efficiently represented by a basis of block functions
(Dirac delta functions for infinitesimal small blocks). But
block functions are not band-limited. Band limited signals
on the other hand can be efficiently represented by a
Fourier basis. But sines and cosines are not time-limited.
Wavelets are localized in both time (space) and frequency
(scale) domains. Hence it is easy to capture local features
in a signal. Another advantage of a wavelet basis is that it supports multi-resolution analysis. In the windowed Fourier transform, the effect of the window is to localize the signal being analyzed. Because a single window is used for all frequencies, the resolution of the analysis is the same at all points of the time-frequency plane.
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 453
Acknowledgments
References
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Delhi, India: Pearson Education, 2003.
[2] R. Robbins, "Advantages of FPGAs," Control Engineering, Vol. 57, No. 2, pp. 60-62, Feb. 2010.
[3] A. Sengupta, "Compressing still and moving images with wavelets," Multimedia Systems, Vol. 2, No. 3, 1994.
[4] A. Grzeszezak, M. K. Mandal, S. Panchanathan and T. Yeap, "VLSI implementation of discrete wavelet transform," IEEE Transactions on VLSI Systems, Vol. 4, No. 4, pp. 421-433, Dec. 1996.
[5] W. Sweldens, "The lifting scheme: A construction of second generation wavelets," SIAM J. Math. Anal., Vol. 29, No. 2, pp. 511-546, 1997.
1,2,3,4,5 Department of Computer Science and Engineering, Pondicherry Engineering College, Pondicherry, India
structures that are more personalized, user friendly and effective means of e-learning.

Research in the education field shows that it is difficult to find a general strategy of teaching when human differences are taken into account. In a traditional classroom, students are able to interact with each other, and their instructor is able to socially construct their knowledge. In technology-based learning, this social aspect of learning is significantly reduced: the e-learning interaction is a one-on-one relationship between the student and the instructional content. This problem could be overcome by the usage of a recent technological advancement, the development of agent based software. Agent based e-learning offers a potential solution to the problems of conventional learning. An agent can be used in e-learning applications in different contexts. Agent properties like autonomy, proactive and reactive behavior, and the capability to cooperate and communicate with other agents make them ideal for use in e-learning applications.

An agent in an e-learning application is situated in the learning environment and performs pedagogical tasks autonomously. Agent based intelligent systems (ABIS) have proved their worth in education in multiple ways. ABIS goes far beyond conventional training records management and reporting. Learner self-service, learning workflow, provision of online learning, collaborative learning and training resource management are some of the features of ABIS. They are basically used for content management and data persistence [2]. As an enrichment over ABIS, we propose to use agents for various other activities in the system, such as providing feedback to the educational analyst and e-learning administrator on the quality of a tutorial, offering a self-rating system for the e-learner, efficient dynamic content viewing, and maintaining an updated query answering system. This would help to better exploit agent properties in an e-learning environment and reduce the overhead of human intervention, providing an intelligent e-learning system for the end user.

2. Related Works

There is much research happening in the field of software agents which has given rise to ideas for more sophisticated e-learning. We present here some of the related work done by different research scholars in the areas of agent based e-learning, agent based architectures for distance learning, etc. This section helps us to identify the areas in which improvement can be made in the existing e-learning system.

In [1], a research note that provides a general introduction to e-learning is discussed. That paper examines the links between knowledge management and content management and discusses in detail the various tools necessary for both. It also deals in detail with the advantages of an e-learning system and presents a consolidated six-step guide towards implementing e-learning. Agent based intelligent systems have proved their worth in multiple ways. [2] introduced the application of an agent based intelligent system for enhancing e-learning; that paper reports on the conceptual structure evolved to define a development process for pedagogical agents. An agent based e-learning environment where users interact collectively and intelligently with the environment is discussed in [3], which proposes the employment of an agent based approach, where agents are a natural metaphor of human acts and the learning systems are generally complex.

An agent-oriented software engineering methodology, Tropos, is proposed for an e-learning system which incorporates various agents and gives a coarse grained analysis of the e-learning system [4]. The base agent model is enriched by beliefs, goals and plans, making the e-learning system more intelligent and flexible. [5] proposed a multi-agent system for e-learning which consists of heterogeneous types of functional agents that execute some functionalities of distance learning autonomously. Activities like perception, modeling, planning, coordination and task or plan execution are suggested in that paper. A theoretical consideration of a real multi-agent system along with a performance comparison is proposed in [6], which aims at full personalization of the e-learning process through an agent based e-learning system; agent-specific techniques are mainly used for estimating knowledge absorption, adjusting tasks to suit an individual, and optimizing the overall performance of gaining knowledge for each student.

[7] illustrates the advantages of customizing appropriate e-learning resources and fostering collaboration in e-learning environments. That paper proposes that intelligent agents in such a system would support retrieval of relevant learning materials, support instructional design and analyze data. Agents can be used to generate learning progress reports against predefined goals and can also document learning efficiency. [8] investigates how e-learning applications are designed and how software systems improve their performance. It lists several educational perspectives that have been implemented and the nine distinctive stages of implementation. It also proposes better software simulation for social interactions and better performance of
Agent technology appears to be a promising solution to the challenges of the modern environment. It appears as a high level of software abstraction and is a part of artificial intelligence. An agent can be defined as "an encapsulated computer system that is situated in some environment and that is capable of flexible, autonomous action in that environment in order to meet its design objectives". An agent is a process which operates in the background and performs activities when specific events occur [6]. The various properties of agents make them more suitable to environments where human intervention creates a great overhead. Agents are capable of relieving human intervention significantly and help in the proper functioning of the system. The various characteristics of agents are:

Autonomy: Autonomy corresponds to the independence of a party to act as it pleases. Autonomous agents have control both over their internal state and over their own behaviour.
Heterogeneity: Heterogeneity corresponds to the independence of the designer of a component to construct the component in any manner.
Proactive: A proactive agent is one that can act without any external prompts. It acts in anticipation of future goals.
Reactive: The agent responds based on the input it receives and according to the environment. It responds in a timely fashion to environmental change.
Communication: Communication can be defined as those interactions that preserve the autonomy of the parties concerned.
Dynamism: The agents are dynamic, as their reaction is dynamic and varies according to the environment.

Agents hide the complexity of different tasks and monitor events and procedures. These properties make agents ideal for e-learning applications [7]. Apart from data mining, knowledge management and selecting tutorials for the user, agents also help in collaboration within the system. User based agents significantly help reduce the administration duties of a course and allow a focus on responding to users' questions or preparing training materials. Agents have already been used in many areas of e-learning systems. Yet there remains a myriad of contexts where agents can be incorporated to make e-learning more efficient and fundamentally change the way education is delivered. In the following sections of the paper, we discuss how agents can be incorporated for various activities in an e-learning system and how they can be better utilized.

4. Proposed Work

In this section we propose an e-learning system based on the concept of agent oriented software. The following agents can be utilized in an e-learning environment to make the e-learning system efficient.

4.1 Personalization Agent

The perceiving capacity and the knowledge possessed vary from one person to another. In a static e-learning environment the tutorials or the resources do not vary and are not based on the capacity of the e-learner. For the user to understand the concepts clearly, the learning resources should be interactive, responsive and engaging, with
knowledge formation emphasized. The personalization agent used in an e-learning system would help users to rank themselves. Based on their ranking, the agent selects learning materials and retrieves them based on cognitive style, personal preferences and prior knowledge. The agent uses a number of techniques and characteristics to filter, retrieve and categorize documents according to the user's predefined criteria. The personalization agent to a great extent helps the user to save time by personalizing the available resources and tutorials based on the user's self evaluation.

4.2 Evaluation Agent

The evaluation agent plays a crucial role in the system by evaluating the student's performance after the tutorial session. It not only lets users know where they stand but also offers direct and indirect feedback on the efficiency of the tutorial to the tutor. The problems to be generated dynamically for the user evaluation tests are stored in a questionnaire database. The agent determines the learner's level of understanding from the problem statement and the learner's answers. The user's score, the difficulty level attempted, the duration taken to answer the questions, and the topics in which the test was taken are all stored in a database for further analysis by the e-learning instructor.

4.4 Feedback Agent

The ultimate goal of a system cannot be achieved without proper feedback. The effectiveness of any system depends greatly on the feedback timing and style. The feedback agent collects feedback and ratings of the tutorials from the user. Reliable feedback from users would enable improvement of the efficiency of the tutor and the quality of the resources used in learning practice. This information would help to determine the usefulness of a material for teaching specific topics and to update materials to improve their ranking by interacting with the user.

The agents interact with each other through message passing, which involves processing incoming messages, decoding them, and taking corresponding actions. The interaction of the agents in the system is shown in Fig. 4. The e-learner ranks himself based on his knowledge, and the personalization agent provides learning materials to the learner based on the criteria. The user makes use of the tutorials and, if any doubt arises, can report it. The query management system handles the question raised and responds to the query as early as possible in the most efficient way. The agent analyzes the user's performance and generates a questionnaire accordingly.
The dynamically generated problems are stored in the questionnaire database. We classify the problems stored in the knowledge base into four difficulty levels: easy, medium, difficult, very difficult. The evaluation agent determines which difficulty level of problem should be generated for the user. When the learner reaches a score of about 70% or more, the agent increases the difficulty level for the remaining problems. If the score is less than 40%, the agent retrieves easy questions from the knowledge base and also rates the user's understanding of the concepts as low. If the user takes a long time to answer questions on certain topics, there is a possibility that the user is either referring to other resources or has a relatively low understanding of the particular concept. So, in that case, more application oriented questions are retrieved by the agent in order to test the learner's capability. If the learner is able to answer the application oriented questions on a certain topic, then the theoretical questions on that topic can be skipped by the agent. Consider an OOPS learning session with four tutorials and four instructors. The agent monitors and collects details on the average number of hits per e-learner for a particular tutorial and the total number of users for that particular learning material, as in Table 1. Based on this data, the agent ranks the tutorials and provides feedback to the corresponding e-learning instructor; this is pictorially depicted in the graph of Fig. 4. The agent provides this feedback to the instructor of the particular course after considering the average number of hits for the resource and its usage.
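The threshold logic described above (raise the difficulty at a score of 70% or more, fall back to easy questions below 40%, and probe with application oriented questions when answers take long) can be sketched as follows; the level names, the slow-answer threshold and the question categories are illustrative assumptions added here.

```python
# Sketch of the evaluation agent's adaptive question selection described
# above. The 70% / 40% thresholds come from the paper; the level names,
# the slow-answer threshold and the question categories are assumptions
# added for illustration.

LEVELS = ["easy", "medium", "difficult", "very difficult"]

def next_question_profile(score_pct, current_level, avg_seconds_per_answer,
                          slow_threshold=90):
    """Return (difficulty level, question category) for the next questions."""
    if score_pct >= 70 and current_level < len(LEVELS) - 1:
        level = current_level + 1          # learner is doing well: step up
    elif score_pct < 40:
        level = 0                          # retrieve easy questions
    else:
        level = current_level
    # Long answer times suggest external referencing or weak understanding,
    # so probe with application-oriented questions instead of theory.
    category = "application" if avg_seconds_per_answer > slow_threshold else "theory"
    return LEVELS[level], category

print(next_question_profile(75, 1, 30))   # ('difficult', 'theory')
print(next_question_profile(35, 2, 120))  # ('easy', 'application')
```

In the full system the returned profile would drive the query against the questionnaire database, and the score, level and duration would be logged for the instructor as described in section 4.2.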
Table 1. Rating of learning materials

Learning material | E-learning instructor | Avg. no. of hits / E-learner | No. of E-learners used material | Rating
OOPS-1            | Instructor1           | 2                            | 50000                           | 1
OOPS-2            | Instructor2           | 4                            | 10000                           | 4
OOPS-3            | Instructor3           | 2                            | 25000                           | 3
OOPS-4            | Instructor4           | 3                            | 40000                           | 2
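A minimal sketch of the ranking step that produces the last column of Table 1. The criterion (descending number of e-learners who used the material) is inferred from the table; a fuller implementation would also weigh the average hits per e-learner, as the text suggests.

```python
# Sketch of the feedback agent's tutorial ranking, reproducing Table 1.
# The ranking criterion (descending number of e-learners that used the
# material) is inferred from the table; real usage would also weigh the
# average hits per e-learner.

tutorials = [
    # (material, instructor, avg hits / e-learner, e-learners who used it)
    ("OOPS-1", "Instructor1", 2, 50000),
    ("OOPS-2", "Instructor2", 4, 10000),
    ("OOPS-3", "Instructor3", 2, 25000),
    ("OOPS-4", "Instructor4", 3, 40000),
]

def rank_tutorials(rows):
    """Assign rating 1..n by descending usage; the agent reports this
    rating to each tutorial's instructor."""
    ordered = sorted(rows, key=lambda r: r[3], reverse=True)
    return {row[0]: rating for rating, row in enumerate(ordered, start=1)}

ratings = rank_tutorials(tutorials)
print(ratings)  # {'OOPS-1': 1, 'OOPS-4': 2, 'OOPS-3': 3, 'OOPS-2': 4}
```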
of link capacity {c_l | l ∈ E}. To explain linear-code multicast (LCM) with a parallel communication network, we present some terminology, definitions and assumptions.

Conventions: 1) In the MMT network, an edge, e.g. (1, 2) ∈ E, denotes that (1, 2) is a bi-directed edge [18], but this edge may act as unidirectional depending on the algorithm. 2) The information unit is taken as a symbol in the base field, i.e., 1 symbol in the base field can be transmitted on a channel every unit time [1].

Definitions: 1) The communication in the MMT network is interblock and intrablock [18]. LNC is implemented in blocks first and then in the complete network. 2) A LCM on a communication network (G, o(l)) is an assignment of a vector space v(i) to every node i and a vector v(i, j) to every edge (i, j) [1] such that v(o(l)) = Ω; v(i, j) ∈ v(i) for every edge (i, j); and for any collection of nonsource nodes d(l) in the network, the vectors assigned to the edges entering that collection span the spaces assigned to its nodes.

Assumptions: 1) Each source process X_i has an entropy rate of one bit per unit time for an independent source process, while larger rate sources are modeled as multiple sources. 2) Linearly correlated sources are modeled as linear combinations of independent source processes. 3) Each link l ∈ E is supposed to have a capacity c_l of one bit per unit time, for both independent and linearly correlated sources. 4) Both cyclic networks (networks with link delays because of information buffering at intermediate nodes, operated in a batched [2], burst [11], or pipelined [12] fashion) and acyclic networks (networks whose nodes are delay-free, i.e. zero-delay) are considered for the implementation of LNC on parallel networks, by analyzing the parallel network as cyclic or acyclic. 5) We use the terms processor and node interchangeably throughout the paper.

The network may be analyzed as acyclic or cyclic using the scalar algebraic network coding framework [13]. Let us consider the zero-delay case first, represented by the equation

    Y_j = Σ_i a_{i,j} X_i + Σ_l f_{l,j} Y_l.

The information is a sequence of length-u blocks or vectors of bits, which are treated as elements of a finite field F_q, q = 2^u. The information process Y_j transmitted on a link j is formed as a linear combination, in F_q, of link j's inputs, i.e., source processes X_i for which a(i) = o(j) and random processes Y_l for which d(l) = o(j). The ith output process Z_{β,i} at receiver node β is a linear combination of the information processes on its terminal links, represented as

    Z_{β,i} = Σ_l b_{β,i,l} Y_l.

Memory is needed at receiver (or source) nodes for link delays on a network for multicast, but a memoryless operation suffices at all other nodes [12]. The linear coding equations for unit delay links are

    Y_j(t+1) = Σ_i a_{i,j} X_i(t) + Σ_l f_{l,j} Y_l(t),
    Z_{β,i}(t+1) = Σ_l b_{β,i,l} Y_l(t),

where X_i(t), Y_j(t), Z_{β,i}(t) and the coefficients are the values of the variables at time t, and the b^{(u)} coefficients below represent the required memory. In terms of the delay variable D these equations are

    Y_j(D) = D Σ_i a_{i,j} X_i(D) + D Σ_l f_{l,j} Y_l(D),
    Z_{β,i}(D) = Σ_l b_{β,i,l}(D) Y_l(D),

where

    b_{β,i,l}(D) = Σ_{u=0}^{∞} D^{u+1} b^{(u)}_{β,i,l},
    X_i(D) = Σ_{t=0}^{∞} X_i(t) D^t,
    Y_j(D) = Σ_{t=0}^{∞} Y_j(t) D^t,  Y_j(0) = 0,
    Z_{β,i}(D) = Σ_{t=0}^{∞} Z_{β,i}(t) D^t,  Z_{β,i}(0) = 0.

The above coefficients can be collected into the r × |E| matrix A, with

    A_{i,j} = a_{i,j} in the acyclic delay-free case,
    A_{i,j} = D a_{i,j} in the cyclic case with delay,

the matrices B_β = (b_{β,i,l}), and the |E| × |E| matrix F, with

    F_{l,j} = f_{l,j} in the acyclic delay-free case,
    F_{l,j} = D f_{l,j} in the cyclic case with delay.

These coefficients can be used for the transmission in the parallel network; the matrices are formed for both the cyclic and the acyclic case.

Now, let us consider an example of the parallel network (MMT), in which processor P1 (a unique processor, without any incoming edge at that instant of time) sends two bits to nodes 2 and 3, as given in figure 2.

Fig. 2. A row of a block of MMT with n = 3, where n is the number of processors in the MMT architecture. The detailed MMT architecture is given in figure 3 for more clarity.

Fig. 3. 3×3 Multi-Mesh of Trees (MMT). (All interblock links are not shown. The indices (1,3,1,3), (2,3,1,3), (3,3,1,3) are the processor index values used to identify individual processors throughout the architecture.)
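The zero-delay linear combination Y_j = Σ a_{i,j} X_i + Σ f_{l,j} Y_l can be illustrated over the smallest base field F_2 (u = 1), where addition is XOR and multiplication is AND. The tiny two-source setup below is an illustrative assumption, not the MMT topology.

```python
# Sketch of a zero-delay linear combination over F2 (q = 2, u = 1), the
# smallest case of the field Fq = F_{2^u} used above. A coding node forms
# Y_j as a linear combination (XOR) of its inputs; any receiver holding
# two independent combinations can solve for both source bits. The
# two-source topology is an illustrative assumption, not the MMT network.

def combine(coeffs, inputs):
    """Y = sum of a_i * X_i over F2: multiply is AND, add is XOR."""
    y = 0
    for a, x in zip(coeffs, inputs):
        y ^= a & x
    return y

x1, x2 = 1, 0                      # two source bits sent by processor P1
y1 = combine([1, 0], [x1, x2])     # link carrying X1
y2 = combine([0, 1], [x1, x2])     # link carrying X2
y3 = combine([1, 1], [x1, x2])     # coded link carrying X1 XOR X2

# A receiver that sees y1 and y3 recovers X2 = Y1 XOR Y3, and vice versa.
assert (y1, y1 ^ y3) == (x1, x2)
assert (y2 ^ y3, y2) == (x1, x2)
```

The decoding step is exactly the inversion of the transfer matrix built from the a, f and b coefficients; over F_2 that inversion reduces to XOR-ing received combinations.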
Fig. 4. Comparison of MMT and MM on the basis of communication links, solution of polynomial equations, one-to-all and row & column broadcast.

The complexity varies, so for each step different algorithms are used. Figure 6 shows the first row in the first block of the network; the connectivity between the processors is based on the topological properties of MMT [18]. We have considered that each processor has a Working Array (WA) which consists of the processor index (Pn) and the information associated with that processor (In). The size of the working array is based on the size of the network used, i.e. for n = 8, the size of WA = 8.
row.

WA1 WA2 WA3 WA4 WA5 WA6 WA7 WA8

Fig. 6. Initial condition of processors containing WA (only one row of a block of 8 × 8 MMT is shown).

Algorithm 1. Step 1 of AAB
a. /* This operation is common between all processors of each row of each block.
b. Each node is represented by (α, β, i, j), where α, β are the block index and i, j are the node index (see figure M).
c. The transfer is conducted in order from higher to lower node index. */
1: Starting from each row of each block of the network, the processor with greater index value will transfer data to lower index processors linked according to the topological properties of the network.
2: repeat
3: Select nodes from each block of the network such that at each transfer the block is divided in two parts (e.g. if N = 40, the number of nodes in the blocks will also be 40 and the division will be index positions 1 to 20 and 21 to 40) and transfer the message to the remaining nodes, linked according to the topological properties of this network.
5: until all nodes have finished transmitting and forwarding.

Figure 7 (a) shows the position of data after completion of step 1 and figure 7 (b) shows the content of WA1 after step 1.

P1 P2 P3 P4 P5 P6 P7 P8
I1 I2 I3 I4 I5 I6 I7 I8

Fig. 7. (a) After Step 1 (b) Content of WA1 after Step 1.

2: repeat
3: until all nodes have received the information of the root processors.

After the completion of step 2, the position of data in a row is shown in figure 8. The root node of a row of all blocks of the network receives the complete information of that row as the content of WA1.

WA1 WA1 WA1 WA1 WA1 WA1 WA1 WA1
WA1 = WA(1,2,3,4,5,6,7,8)

Fig. 8. After Step 2.

Algorithm 3. Step 3 of AAB
a. /* This operation is common between all root processors of each column of each block.
b. The transfer is conducted in order from higher to lower node index. */
1: Starting from each column of each block of the network, the processor with greater index value will transfer data to lower index processors linked according to the topological properties of the network.
2: repeat
3: Select nodes from each block of the network such that at each transfer the block is divided in two parts (e.g. if N = 40, the number of nodes in the blocks will also be 40 and the division will be index positions 1 to 20 and 21 to 40) and transfer the message to the remaining nodes, linked according to the topological properties of this network.

Figure 9 shows steps 3 and 4, in which the communication is performed in each column of each block of the network. After the completion of step 4, each column of each block of the network contains the complete information of the respective column.
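The halving pattern used in steps 1 and 3 (each transfer splits a block in two and the higher-index half sends its data to the lower-index half) can be simulated with a flat list of working arrays; the list model is a simplification that ignores the actual MMT link topology.

```python
# Sketch of the halving communication pattern of steps 1 and 3 of AAB:
# at every iteration the block is split in two and each processor in the
# upper half transfers its accumulated data to a partner in the lower
# half, until the root holds the whole row. The flat list of working
# arrays is a simplification that ignores the real MMT link topology.

def halving_gather(working_arrays):
    """Gather all WAs into WA1 in log2(n) transfer rounds."""
    n = len(working_arrays)
    wa = [set(w) for w in working_arrays]
    rounds = 0
    half = n // 2
    while half >= 1:
        for low in range(half):
            wa[low] |= wa[low + half]   # upper half sends to lower half
        half //= 2
        rounds += 1
    return wa[0], rounds

root, rounds = halving_gather([{i} for i in range(1, 9)])
print(root, rounds)   # WA1 holds I1..I8 after 3 rounds for n = 8
```

The logarithmic round count is what gives the AAB steps their log n factors in the complexity expressions later in the paper.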
Algorithm 8. Step 8 of AAB (Interblock Communication)
/* The step is performed using the horizontal interblock links of this network, which transfer the information of all the blocks of the respective columns to the root processors of the respective block with processor index (i = n). */
1: Starting from each block of each column, the information is communicated to the root processors of the respective block in such a manner that the processor index is (i = n).
2: In one communication step this information is broadcasted to every root processor of the respective block of the respective column.

4. Implementing LNC on AAB using MMT

In this section we implement network coding for each step to make the communication faster and increase the rate of information transmitted from each node. We consider the network as delay-free (acyclic) and o(l) ≠ d(l). The algorithm results are analyzed later with n = 8 processors. For each step, independent and different algorithms are used (see section IV) and linear coding is implemented with each algorithm. According to algorithm 1, data from
all processors are transferred with n = 8 and count = 1, i.e. the processors with index at most n/2 = 4 receive from those with index greater than n/2, which means processors P1, P2, P3 and P4 will receive data from P5, P6, P7 and P8, as shown in figure 10. The combined data (from P1: 1⊕8 + 2⊕7 + 3⊕6 + 4⊕5 + 5⊕4 + 6⊕3 + 7⊕2) is then broadcast from P1 to all the processors of the respective row using intrablock link transfer; see figure 12.

Fig. 10. (a) Indexing of processors with respect to nodes in the figure. (b) Direction of flow of data in step 1 of the AAB algorithm on MMT; P1, P2, P3 and P4 are the processors receiving data and P5, P6, P7 and P8 are the sending processors. The dotted line distinguishes between the receiving and sending processors in the first iteration of step 1.
column in the order n/2 + 1, ..., n/2. Time complexity of step 6: n³ log n.

Step 7: Call the one-to-all algorithm [19] in the block to transfer the INFO of other blocks (of respective rows) in n³ log n time. At the end of this step, complete blocks of each row have the INFO of all the processors in that row. Time complexity of step 7: n³ log n.

Step 8: This step performs the interblock communication using horizontal link transfer, which transfers the INFO (of all the blocks of the respective column) to the root processors (of the respective block) with P_ID (i = n); this requires one communication step. Time complexity of step 8: 1 CS.

Step 9: Using step 1, transfer the INFO of all the processors with P_ID (i = n). Time complexity of step 9: n⁴ log n.

Step 10: Call the AAB algorithm in the block to transfer the INFO of other blocks of that column in the block.

Fig. 14. The data from each column root processor is broadcasted to other processors of the respective column in each block.

The algorithm starts with the execution of each step in the order defined (step 1 ... step 10); as the execution of each step starts, the involvement of the processors also increases to broadcast data. In parallel processing the algorithm starts with an active processor and involves other processors as it progresses [22]. Figure 15 illustrates the involvement of processors with the average percentage of iterations in each step.

Fig. 15. Involvement of processors at different steps of the algorithm (average iterations in each step vs. number of processors).
Based on the above result in figure 15, as the iterations As the values of i and j changes the number of
increases the involvement of processors also increases. connecting horizontal link also varies.
The algorithm with LCM-PA approach, utilizes the Definition 2 (Vertical intrablock links). The processors in
maximum number of processors compared to without column j of each block , are also used to form a
LCM-PA approach. So the utilization of processors in binary tree rooted at , , 1, , 1 . That is, for
parallel architectures is also increases while using linear 1 to /2 processor , , , is directly
coding. connected to the processors , , 2 , and
, , 2 1, , whenever they exist.
Proof. If this network is used for N number of processors
6. Conclusion and Future Work than this type of link exists. Suppose N = 4, then total
We have presented LCM-PA, a model of linear coding on parallel architecture, with an efficient implementation of our approach on the AAB algorithm on MMT, together with its comparative time complexity after implementation with LCM-PA. Our model is network independent and can be implemented on any parallel architecture under the common assumptions used in Section 2. Future work includes extensions to this approach and analysis of its complexity aspects by implementing it with other parallel algorithms (e.g. Multi-Sort [23]). In addition, to extend this approach with the LCM-PA model, it needs to be implemented with other parallel algorithms to make the vision of the research clearer.

Appendix

Here we provide the proofs of the theorems, definitions and terms used in the main text. The definitions used in this paper were defined by other authors, but for the readers' convenience they are elaborated with proofs in this section.

Definition 1 (Horizontal intrablock links). The processors in row k of each block (i, j) are connected to form a binary tree rooted at (i, j, k, 1). That is, for l = 1 to N/2, processor (i, j, k, l) is directly connected to the processors (i, j, k, 2l) and (i, j, k, 2l + 1), whenever they exist.

Proof. If this network is used for N processors, then this type of link exists. Suppose N = 4; then the total number of processors in the network is 256, divided into four rows and four columns of blocks, where each block consists of four rows and four columns of processors. Now, according to Definition 1, the processors of block (1, 1) are connected in order:
(1, 1, 1, 1)-(1, 1, 1, 2)-(1, 1, 1, 3);
(1, 1, 1, 2)-(1, 1, 1, 4); // as (1, 1, 1, 5) does not exist
(1, 1, 2, 1)-(1, 1, 2, 2)-(1, 1, 2, 3);
(1, 1, 2, 2)-(1, 1, 2, 4);
(1, 1, 3, 1)-(1, 1, 3, 2)-(1, 1, 3, 3);
(1, 1, 3, 2)-(1, 1, 3, 4);
(1, 1, 4, 1)-(1, 1, 4, 2)-(1, 1, 4, 3);
(1, 1, 4, 2)-(1, 1, 4, 4);

Similarly, for Definition 2 (the vertical intrablock links, the column-wise analogue of Definition 1), with N = 4 the total number of processors in the network is again 256, divided into four rows and four columns of blocks, where each block consists of four rows and four columns of processors. According to Definition 2, the processors of block (1, 1) are connected in order:
(1, 1, 1, 1)-(1, 1, 2, 1)-(1, 1, 3, 1);
(1, 1, 2, 1)-(1, 1, 4, 1); // as (1, 1, 5, 1) does not exist
(1, 1, 1, 2)-(1, 1, 2, 2)-(1, 1, 3, 2);
(1, 1, 2, 2)-(1, 1, 4, 2);
(1, 1, 1, 3)-(1, 1, 2, 3)-(1, 1, 3, 3);
(1, 1, 2, 3)-(1, 1, 4, 3);
(1, 1, 1, 4)-(1, 1, 2, 4)-(1, 1, 3, 4);
(1, 1, 2, 4)-(1, 1, 4, 4);
As the values of i and j change, the number of connecting links also varies.

Definition 3 (Horizontal interblock links). For 1 <= j, k <= N, the processor (i, j, k, 1) is directly connected to the processor (i, k, j, N). It can be noted that for j = k, these links connect two processors within the same block.

Proof. These are the links between the boundary or corner processors of different blocks. If this network is used for N processors, then this type of link exists. Suppose N = 4; according to Definition 3, the processors for i = 1 are connected in order:
(1, 1, 1, 1)-(1, 1, 1, 4); (1, 1, 2, 1)-(1, 2, 1, 4);
(1, 1, 3, 1)-(1, 3, 1, 4); (1, 1, 4, 1)-(1, 4, 1, 4);
(1, 2, 1, 1)-(1, 1, 2, 4); (1, 3, 1, 1)-(1, 1, 3, 4);
(1, 4, 1, 1)-(1, 1, 4, 4); (2, 1, 2, 1)-(2, 2, 1, 4);
As the values of i and j change, the number of connecting horizontal links also varies.

Definition 4 (Vertical interblock links). For 1 <= i, k <= N, the processor (i, j, 1, k) is directly connected to the processor (k, j, N, i). It can be noted that for i = k, these links connect two processors within the same block.

Proof. These are the links between the boundary or corner processors of different blocks. If this network is used for N processors, then this type of link exists. Suppose N = 4; according to Definition 4, the processors for j = 1 are connected in order:
(1, 1, 1, 1)-(1, 1, 4, 1); (1, 1, 1, 2)-(2, 1, 4, 1);
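The binary-tree link pattern of Definitions 1 and 2 can be checked mechanically. The following Python sketch is our own illustration (not part of the paper); it enumerates the row-wise intrablock links of one block for N = 4 and reproduces the kind of list given in the proof above:

```python
# Illustrative sketch (not from the paper): enumerate the horizontal
# intrablock links of Definition 1 for one block (i, j). Processor
# (i, j, k, l) sits in row k, column l of block (i, j); within each row
# the processors form a binary tree rooted at column 1, i.e. column l
# links to columns 2l and 2l + 1 whenever those columns exist.

def horizontal_intrablock_links(i, j, n):
    """Return the binary-tree links inside block (i, j) of size n x n."""
    links = []
    for k in range(1, n + 1):           # row inside the block
        for l in range(1, n // 2 + 1):  # tree parents
            for child in (2 * l, 2 * l + 1):
                if child <= n:          # skip children that do not exist
                    links.append(((i, j, k, l), (i, j, k, child)))
    return links

links = horizontal_intrablock_links(1, 1, 4)
# Each of the 4 rows contributes the links 1->2, 1->3 and 2->4
# (column 5 does not exist), i.e. 12 links in total for the block.
print(len(links))  # -> 12
```

Swapping the roles of the last two coordinates gives the vertical intrablock links of Definition 2 in the same way.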
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011 471
ISSN (Online): 1694-0814
www.IJCSI.org
Minimization of Call Blocking Probability by Using an Adaptive Heterogeneous Channel Allocation Scheme for Next Generation Wireless Handoff Systems

Debabrata Sarddar1, Arnab Raha1, Shubhajeet Chatterjee2, Ramesh Jana1, Shaik Sahil Babu1, Prabir Kr Naskar1, Utpal Biswas3, M.K. Naskar1

1. Department of Electronics and Telecommunication Engg, Jadavpur University, Kolkata 700032.
3. Department of Computer Science and Engg, University of Kalyani, Nadia, West Bengal, Pin- 741235.
Abstract
Nowadays IEEE 802.11 based wireless local area networks (WLANs) have been widely deployed for business and personal applications. The main issue regarding wireless network technology is handoff, or handover, management. The minimization of handoff failure due to call blocking is an important research issue, and over the last few years plenty of research has been done to reduce handoff failure. Here we also propose a method to minimize handoff failure, by using an adaptive heterogeneous channel allocation scheme.
Keywords: IEEE 802.11, Handoff failure, GPS (Global Positioning System), Channel allocation, Neighbor APs.

1. Introduction

For the last few years handoff has been a burning issue in wireless communication. Every base station has a limited number of channels, so a proper channel distribution is required to perform handoff successfully.

1.1 Handoff

When a MS moves out of reach of its current AP it must be reconnected to a new AP to continue its operation. The search for a new AP and the subsequent registration under it constitute the handoff process, which takes enough time (called handoff latency) to interfere with the proper functioning of many applications.

Figure 1. Handoff process

Three strategies have been proposed to detect the need for handoff [1]:
Mobile-controlled-handoff (MCHO): The mobile station (MS) continuously monitors the signals of the surrounding base stations (BSs) and initiates the handoff process when some handoff criteria are met.
Network-controlled-handoff (NCHO): The surrounding BSs measure the signal from the MS, and the network initiates the handoff process when some handoff criteria are met.
Mobile-assisted-handoff (MAHO): The network asks the MS to measure the signal from the surrounding BSs, and the network makes the handoff decision based on reports from the MS.

Handoff can be of many types:
Hard & soft handoff: Originally hard handoff was used, where a station must break its connection with the old AP before joining the new AP, thus resulting in large handoff delays. However, in soft handoff the old connection is maintained until a new one is established, thus significantly reducing packet loss.
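As a concrete illustration of the mobile-controlled strategy described above, the following sketch (our own, not from the paper; the hysteresis margin and the RSSI values are hypothetical) triggers a handoff only when some neighboring AP is stronger than the current one by a fixed margin, which avoids ping-pong handoffs near a cell boundary:

```python
# Illustrative MCHO-style trigger (hypothetical values, not from the
# paper): the MS monitors signal strengths and initiates handoff only
# when a neighbor AP exceeds the current AP by a hysteresis margin.

HYSTERESIS_DB = 5.0  # assumed handoff criterion


def should_handoff(current_rssi_dbm, neighbor_rssi_dbm):
    """Return the best neighbor AP to hand off to, or None to stay."""
    best_ap = max(neighbor_rssi_dbm, key=neighbor_rssi_dbm.get)
    if neighbor_rssi_dbm[best_ap] > current_rssi_dbm + HYSTERESIS_DB:
        return best_ap
    return None


print(should_handoff(-80.0, {"AP1": -72.0, "AP2": -85.0}))  # -> AP1
print(should_handoff(-70.0, {"AP1": -72.0, "AP2": -85.0}))  # -> None
```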
Figure 2. Hard handoff & soft handoff

Figure 4. Queue (a FIFO queue of incoming H-O and H-O + N-C calls)
H-O => channels reserved for hand-off only.
H-O + N-C => channels reserved for both hand-off and new calls generated within the hexagonal cell.

Now the total number of channels in a particular cell is divided among the different types of calls, such as new calls, hand-off calls, data calls etc. In our case we are mainly interested in new calls and hand-off calls. If we can devise an optimized and systematic way of dividing the channels into those reserved for hand-off only and those reserved for both new calls and hand-off, we can reduce the call blocking probability, or in another sense the hand-off failure.

Here we assume two kinds of arrival rates: a) the arrival rate of new calls, lambda_n-c, and b) the arrival rate of hand-off calls, lambda_h-o. Although the call termination rates also play an important role in determining the call blocking probabilities, and thereby the hand-off failure probability, in our case, for determination of the fractions of total channels devoted to hand-off only and ...

Figure 5. Cell Cluster

For the yellow, violet and orange colored cells, the values of lambda_n-c and lambda_h-o are not well-defined, i.e. they may vary widely with time, so no definite relation can be ascertained. In this case we assume the cell to be partitioned into some sub-divisions which have different relations between lambda_n-c and lambda_h-o, and hence the subareas covered by the hexagonal cells can be assumed to be heterogeneous. To date, most of the work in the literature is based upon homogeneous cells and the uniform nature of the subareas covered by those homogeneous cells. Our scheme proposes different channel allocation schemes for the different cases shown above. When lambda_n-c << lambda_h-o, it is evident that the channels allocated for (H-O + N-C), as denoted earlier, should be much greater ...
Let
CT = total number of channels;
CH-N = number of channels reserved for both hand-off and new calls generated within the cell;
CH = number of channels reserved only for hand-off;
WH-N = weightage on CH-N;
WH = weightage on CH.
Here we assume WH-N + WH = 1.

Determination of the values of WH-N and WH:
WH-N = lambda_n-c / (lambda_n-c + lambda_h-o)      (1)
WH = lambda_h-o / (lambda_n-c + lambda_h-o)      (2)
Equation (2) is not so significant in this case because, for instance, in the case lambda_n-c = 0 it does not really have any effect if we take
WH-N = WH = 1      (3)
as hand-off calls will be processed in any case. Thereby
CH-N = WH-N * CT = lambda_n-c / (lambda_n-c + lambda_h-o) * CT      (4)
CH = WH * CT = lambda_h-o / (lambda_n-c + lambda_h-o) * CT      (5)
which reaffirms our assumption that
CT = CH-N + CH      (6)

Now channel allocation can be as varied as the following four cases (shown in the figure as divisions of a cell's channels between CH and CH-N):
Case 1 represents lambda_n-c = lambda_h-o = 0;
Case 2 represents lambda_n-c >> lambda_h-o;
Case 3 represents lambda_n-c = lambda_h-o;
Case 4 represents lambda_n-c << lambda_h-o (although this case is not important).

4. Simulation Results

We simulate our proposed method by using the above conception. To justify the practicability of our method in real models, we built an artificial environment in which to apply it. At first we considered a case where the number of channels reserved for both hand-off and new calls is much greater than the number of channels reserved for hand-off calls (25%). The corresponding result is shown in Figure 6, where we can see that up to 25% of the channel allocation the handoff probability is maximum and no call dropping occurs.

Figure 6

Next, we consider a case where the number of channels reserved for both hand-off and new calls is equal to the number of channels reserved for hand-off calls (50%); the simulation result is shown below. Here we can see that up to 50% of the channel allocation there is no call dropping probability.
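Equations (1)-(5) amount to splitting a cell's channels in proportion to the two arrival rates. The following Python sketch is our own illustration of that partition (the rounding choice and the handling of the degenerate no-arrivals case are assumptions, not from the paper):

```python
# Sketch of the channel partition in Eqs. (1)-(6): the weightages split
# the CT channels of a cell between shared channels (CH_N, for both
# hand-off and new calls) and hand-off-only channels (CH) in proportion
# to the new-call and hand-off arrival rates.

def split_channels(c_total, lam_nc, lam_ho):
    """Return (CH_N, CH) given the new-call and hand-off arrival rates."""
    if lam_nc + lam_ho == 0:
        return c_total, 0                 # assumed degenerate case
    w_hn = lam_nc / (lam_nc + lam_ho)     # Eq. (1)
    ch_n = round(w_hn * c_total)          # Eq. (4), rounded to whole channels
    return ch_n, c_total - ch_n           # Eq. (6): CT = CH_N + CH

print(split_channels(100, lam_nc=2.0, lam_ho=6.0))  # -> (25, 75)
```

With hand-off arrivals three times more frequent than new calls, 75% of the channels end up reserved for hand-off only, matching the intent of the 25%/50%/75% simulation cases.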
Figure 7

At last, we consider the case where the number of channels reserved for both hand-off and new calls is much smaller than the number of channels reserved for hand-off calls (75%). Here also we can see that up to 75% of the channel allocation there is no call dropping probability.

Figure 8

5. Conclusion

Our proposed method aims at reducing handoff time by reducing the number of APs to be scanned, which is accomplished by fitting a trend equation to the motion of the MS. This in turn reduces the number of channels to be scanned, which markedly reduces the handoff failure, as is clear from the simulation presented in the above section. However, the proposed algorithm may prove erroneous if the motion of the MS is too random to be used for prediction purposes. Future work in this field may include research on more refined algorithms regarding channel allocation. The error estimation method may also be improved.

It is worth mentioning here that although the proposed work has been presented considering honeycomb structures, our algorithm would work in a similar manner for other cell structures and neighbor AP locations. Minor changes would be introduced depending on the network topology.

References
[1] Yi-Bing Lin and Imrich Chlamtac, Wireless and Mobile Network Architectures, pp. 17.
[2] Akyildiz, I. F., Xie, J., and Mohanty, S., "A survey on mobility management in next generation all-IP based wireless systems," IEEE Wireless Communications, vol. 11, no. 4, pp. 16-28, 2004.
[3] Stemm, M. and Katz, R. H., "Vertical handoffs in wireless overlay networks," ACM/Springer Journal of Mobile Networks and Applications (MONET), vol. 3, no. 4, pp. 335-350, 1998.
[4] Lin Y. B. and Chlamtac I., Wireless and Mobile Network Architecture, John Wiley and Sons Inc., 2001, pp. 60-65.
[5] Guerin R., "Queueing-Blocking System with Two Arrival Streams and Guard Channels," IEEE Transactions on Communications, 36:153-163, 1988.
[6] Zeng Q. A., Mukumoto K. and Fukuda A., "Performance Analysis of Mobile Cellular Radio System with Priority Reservation Handoff Procedure," IEEE VTC-94, Vol. 3, 1994, pp. 1829-1833.
[7] Zeng Q. A., Mukumoto K. and Fukuda A., "Performance Analysis of Mobile Cellular Radio System with Two-level Priority Reservation Procedure," IEICE Transactions on Communications, Vol. E80-B, No. 4, 1997, pp. 598-607.
[8] Jabbari B. and Tekinay S., "Handover and Channel Assignment in Mobile Cellular Networks," IEEE Communications Magazine, 30(11), 1991, pp. 42-46.
[9] Goodman D. J., "Trends in Cellular and Cordless Communication," IEEE Communications Magazine, Vol. 29, No. 6, 1991, pp. 31-40.
[10] Zeng Q. A. and Agrawal D. P., "Performance Analysis of a Handoff Scheme in Integrated Voice/Data Wireless Networks," Proceedings of IEEE VTC-2000, pp. 1986-1992.
[11] Pavlidou F. N., "Two-Dimensional Traffic Models for Cellular Mobile Systems," IEEE Transactions on Communications, Vol. 42, No. 2/3/4, 1994, pp. 1505-1511.
[12] Evans J. and Everitt D., "Effective Bandwidth Based Admission Control for Multiservice CDMA Cellular Networks," IEEE Trans. Vehicular Tech., 48(1), 1999, pp. 36-46.
[13] Choi S. and Shin K. G., "Predictive and Adaptive Bandwidth Reservation for Handoffs in QoS-Sensitive Cellular Networks," in ACM SIGCOMM'98 Proceedings, 1998, pp. 155-166.
[14] Levine D. A., Akyildiz I. F., and Naghshineh M., "A Resource Estimation and Call Admission Algorithm for Wireless Multimedia Networks using the Shadow Cluster Concept," IEEE/ACM Trans. on Networking, 5(1), 1997, pp. 525-537.
[15] Lu S. and Bharghavan V., "Adaptive Resource Management Algorithms for Indoor Mobile Computing Environments," in ACM SIGCOMM'96 Proceedings, pp. 231-242.
[16] Yuguang Fang and Yi Zhang, "Call Admission Control Schemes and Performance Analysis in Wireless Mobile Networks," IEEE Transactions on Vehicular Technology, Vol. 51, No. 2, pp. 371-382, March 2002.
2. Department of Computer Science and Engineering, St. Joseph's College of Engineering, Chennai, Tamilnadu 600119, India
(b) under varying mobility

Figure 1 Average throughput under various input conditions

The FSR topology maintains up-to-date information received from neighboring nodes. The topology information is exchanged between neighbors via unicast. Each node maintains a network topology map for distance calculations, and when the network size increases, the amount of periodic routing information could become large; however, the routing packets are not flooded. FSR captures pixels near the focal point with high detail, and the detail decreases as the distance from the focal point increases. When the mobility increases, the routes to remote destinations become less accurate. The route table size still grows linearly with network size [14]. Hence the throughput of FSR could here have been lower than that of AODV and ODMRP.

Similarly, for different mobility conditions too, the ODMRP routing protocol displays increased performance as compared to the other two. The ODMRP average throughput with node mobility is 5276.75 bytes per simulation time, as against AODV's 3024.00 and FSR's 298.75. The same reasons as stated for the improved performance of ODMRP under a differing number of nodes can be given here too. The same behavior was experienced in previous studies under similar conditions [12].

Figure 2 Packet delivery ratio under various input conditions

It can be observed that the PDR of the AODV routing protocol is higher than that of the ODMRP and Fisheye State routing protocols. The higher the PDR, the higher the number of legitimate packets delivered without any errors. This shows that AODV exhibits a better delivery system as compared with the other two. The reasons for the higher PDR of AODV can be attributed to its good performance in large networks with low traffic and low mobility. It discovers routes on demand, and effectively uses the available bandwidth. It is also highly scalable and minimizes broadcast and transmission latency. Its efficient algorithm provides a quick response to link breakage in active routes.

Moreover, the ability of a routing algorithm to cope with changes in routes is identified by varying the mobility. In this too, the PDR of the AODV protocol is higher as compared to the other two. The same reasons for the better PDR of AODV under a changing number of nodes can be given here too.

4.3 End-to-End Delay

The total latency between the source and destination experienced by a legitimate packet is given by the end-to-end delay. It is calculated by summing the processing, packet transmission, queuing and propagation delays. The speed of delivery is an important parameter in present-day competitive circumstances.
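The two metrics discussed above can be computed directly from trace counts. The sketch below is our own illustration (the function names and inputs are hypothetical, not from any particular simulator):

```python
# Illustrative computation of the two metrics discussed above
# (our own sketch; inputs would normally come from a simulator trace).

def packet_delivery_ratio(sent, received):
    """PDR: fraction of legitimate packets delivered without errors."""
    return received / sent if sent else 0.0


def average_end_to_end_delay(per_packet_delays):
    """Average of per-packet totals of processing, transmission,
    queuing and propagation delays (seconds)."""
    if not per_packet_delays:
        return 0.0
    return sum(per_packet_delays) / len(per_packet_delays)


print(packet_delivery_ratio(sent=1000, received=940))       # -> 0.94
print(average_end_to_end_delay([0.012, 0.020, 0.016]))      # ~0.016 s
```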
(a) under varying nodes
(b) under varying mobility

Figure 3 End-to-End Delay under various input conditions

5. Conclusions

The performance of various routing protocols, ODMRP, AODV and FSR, was evaluated in this study. The following conclusions were drawn.
- Both under a varying number of nodes and under differing values of mobility, average throughput is higher for the routing protocol ODMRP. The maximum throughput of ODMRP is 43% higher than the maximum of AODV and FSR under the varying-nodes condition.
- AODV has a higher ratio of legitimate packet delivery as compared with the other routing protocols evaluated, ODMRP and FSR. The maximum packet delivery of AODV is 38% higher than the maximum of ODMRP and FSR under the varying-nodes condition.
- ODMRP performs better in avoiding network congestion as compared to AODV and FSR.

References
[1] Nadjib Badache, Djamel Djenouri and Abdelouahid Derhab, "Mobility Impact on Mobile Ad hoc Routing Protocols," in ACS/IEEE International Conf. on AICCSA'03, July 2003.
[2] Ian D. Chakeres and Elizabeth M. Belding-Royer, "AODV Routing Protocol Implementation Design," International Conf. on Distributed Computing Systems Workshops (ICDCSW'04), IEEE, Vol. 7, 2004.
[3] Ravi Prakash, Andre Schiper and Mansoor Mohsin, "Reliable Multicast in Mobile Networks," Proc. of IEEE WCNC 2003.
[4] Weiliang Li and Jianjun Hao, "Research on the Improvement of Multicast Ad Hoc On-demand Distance Vector in MANETs," IEEE, Vol. 1, 2010.
[5] M. Gerla et al., "On-demand multicast routing protocol (ODMRP) for ad hoc networks," Internet draft, <draft-ietf-manet-odmrp-04.txt>, 2000.
[6] Shapour Joudi Begdillo, Mehdi Asadi and A. T. Haghighat, "Improving Packet Delivery Ratio in ODMRP with Route Discovery," International Journal of Computer Science and Network Security, Vol. 7, No. 12, Dec 2007.
[7] Gu Jian and Zhang Yi, "A Multi-Constrained Multicast Routing Algorithm based on Mobile Agent for Ad Hoc Networks," International Conference on Communications and Mobile Computing, IEEE, 2010.
[8] Thomas Kunz and Ed Cheng, "On-Demand Multicasting in Ad hoc Networks: Comparing AODV and ODMRP," Proc. of the 22nd IEEE International Conf. on Distributed Computing Systems (ICDCS'02), Vol. 2, 2002.
[9] Narendra Singh Yadav and R. P. Yadav, "The Effects of Speed on the Performance of Routing Protocols in Mobile Ad-hoc Networks," Int. Journal of Electronics, Circuits and Systems, Vol. 1, No. 2, pp. 79-84, 2009.
[10] S. Corson and J. Macker, "Mobile ad hoc networking (MANET): Routing protocol performance issues and evaluation considerations," Internet Draft, 1999.
[11] Guangyu Pei, Mario Gerla and Tsu-Wei Chen, "Fisheye State Routing in Mobile Ad Hoc Networks," Proc. of IEEE ICC'00, 2000.
[12] Yudhvir Singh, Yogesh Chaba, Monika Jain and Prabha Rani, "Performance Evaluation of On-Demand Multicasting Routing Protocols in Mobile Adhoc Networks," IEEE International Conf. on Recent Trends in Information, Telecom and Computing, 2010.
[13] Samir R. Das, Charles E. Perkins and Elizabeth M. Royer, "Performance Comparison of Two On-demand Routing Protocols for Ad Hoc Networks," IEEE INFOCOM 2000.
[14] Mario Gerla, Xiaoyan Hong and Guangyu Pei, "Fisheye State Routing Protocol (FSR) for Ad Hoc Networks," Internet draft, <draft-ietf-manet-fsr-03.txt>, 2002.
[15] Mehran Abolhasan and Tadeusz Wysocki, "Displacement-based Route Update Strategies for Proactive Routing Protocols"
2. Principal, Cummins College of Engineering for Women, Pune University, Pune, India.
Some of the common constraints used for matching in stereo correspondence are as explained below:

Epipolar constraint: Corresponding points must lie on corresponding epipolar lines.
Continuity constraint: Disparity tends to vary slowly across a surface.
Uniqueness constraint: A point in one image should have at most one corresponding match in the other image.
Ordering constraint: The order of features along epipolar lines is the same.

Even though the general problem of finding correspondences between images involves a search within the whole image, once a pair of stereo images is rectified so that the epipolar lines are horizontal scan lines, a pair of corresponding edges in the right and left images need be searched for only within the same horizontal scanlines. Thus we have used rectified images as inputs to our algorithm.

II. Color Information for Matching

There are many motivations behind using color information in stereo correspondence. Firstly, chromatic information is precisely obtained from the CCD sensors of digital cameras. Secondly, recent developments in this area have proved that chromatic information plays an important role in human stereopsis. Thirdly, it is obvious that a red pixel cannot match a green or blue pixel even if their intensities are the same. Thus color information will potentially improve the performance of the matching algorithm.

The color space used here is RGB and the metric used is MSE. For color images we use the MSE, defined as:

MSE_color(x, y, d) = ...      (1)

As explained in the introduction, the proposed algorithm is based on the assumption that in an image there is generally a background with objects placed on it. Thus it is obvious that the column discontinuities are more numerous than the row discontinuities. Based on this explanation we have modified the program such that the search zone for a pixel match depends on the disparity of its neighbor in the row just above, except that if there exists a column discontinuity, the algorithm searches the complete search zone for the perfect match. This is where the image gradient comes into the picture: the column discontinuity is detected by computing the gradient in the column direction.
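A minimal sketch of this search-zone idea follows. It is our own illustration, not the authors' implementation: the per-pixel color MSE, the search range, the zone radius and the gradient threshold are all assumptions (Eq. (1) is not fully legible in the source, so a single-pixel RGB MSE is used instead of a windowed one):

```python
import numpy as np

# Illustrative inter-row dependency matching (our own sketch). For each
# pixel, the disparity search is restricted to a small zone around the
# disparity found for the neighbor in the row just above, unless the
# column-direction gradient signals a discontinuity, in which case the
# full search zone is used.

def mse_color(left, right, x, y, d):
    """Single-pixel MSE over the R, G, B channels for candidate disparity d."""
    diff = left[y, x].astype(float) - right[y, x - d].astype(float)
    return float(np.mean(diff ** 2))

def disparity_map(left, right, max_d=16, radius=2, grad_thresh=30.0):
    h, w, _ = left.shape
    disp = np.zeros((h, w), dtype=int)
    # Column-direction gradient, averaged over channels (assumed detector).
    grad = np.abs(np.diff(left.astype(float), axis=0)).mean(axis=2)
    for y in range(h):
        for x in range(max_d, w):
            if y == 0 or grad[y - 1, x] > grad_thresh:
                zone = range(0, max_d + 1)            # full search zone
            else:
                prev = disp[y - 1, x]                 # neighbor in row above
                zone = range(max(0, prev - radius),
                             min(max_d, prev + radius) + 1)
            disp[y, x] = min(zone,
                             key=lambda d: mse_color(left, right, x, y, d))
    return disp
```

On a synthetic pair where the right image is the left image shifted by a constant disparity, this sketch recovers that disparity at almost every searched pixel while visiting only a fraction of the candidates per pixel, which is the source of the claimed speed-up.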
a) The right image of the stereo pair
b) Output of the traditional local area based correspondence algorithm for grayscale images
c) Output of the traditional local area based correspondence algorithm for color images
d) Output of the inter-row dependency algorithm
methods: the local area correspondence algorithm for grey and for color images, and our proposed inter-row dependency algorithm, respectively, for a set of 5 images.

Table I
Images     Grey      Color     Inter-row
Barn2      90.68%    97.02%    90.13%
Poster     89.70%    96.37%    88.48%
Venus      88.38%    96.44%    88.47%
Tsukuba    89.79%    93.93%    85.72%
Sawtooth   92.76%    96.98%    91.65%

The average search time for a 383 x 434 image in the case of the local area based correspondence algorithm is 220 seconds, while in the case of our inter-row dependency algorithm it is 30 seconds. From Table I, the percentage of matched pixels of our algorithm is almost the same as that of the traditional algorithm for grayscale images. From these factors we can conclude that our algorithm achieves a good trade-off between accuracy and search time.

References:
[1] Mohammed Rziza, Ahmed Tamtaoui, Luce Morin and Driss Aboutajdine, "Estimation and Segmentation of a Dense Disparity Map for 3D Reconstruction," IEEE Transactions, 2000.
[2] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms," International Journal of Computer Vision, 47(1/2/3):7-42, April-June 2002.
[3] Jinhai Cai, "Fast Stereo Matching: Coarser to Finer with Selective Updating," Proceedings of Image and Vision Computing New Zealand 2007, pp. 266-270, Hamilton, New Zealand, December 2007.
[4] Hajar Sadeghi, Payman Moallem and S. Amirhassan Monadjemi, "Feature Based Dense Stereo Matching using Dynamic Programming and Color," International Journal of Information and Mathematical Sciences, 2008.
[5] Jiang Ze-tao, Zheng Bi-na, Wu Min and Chen Zhong-xiang, "A 3D Reconstruction Method Based on Images Dense Stereo Matching," IEEE Proceedings of the International Conference on Genetic and Evolutionary Computing, 2009.
[6] Lu Yang, Rongben Wang, Pingshu Ge and Fengping Cao, "Research on Area-Matching Algorithm Based on Feature-Matching Constraints," IEEE Proceedings of the 2009 Fifth International Conference on Natural Computation.
2. Professor, Computer Engineering Department, M. M. Engineering College, M. M. University, Ambala, India - 133207.
Abstract
Mobile Ad hoc networks (MANETs) are self-created and self-organized by a collection of mobile nodes, interconnected by multi-hop wireless paths in a strictly peer-to-peer fashion. The scalability of a routing protocol is its ability to support a continuous increase in network parameters (such as mobility rate, traffic rate and network size) without degrading network performance. The goal of QoS provisioning is to achieve a more deterministic network behavior, so that information carried by the network can be better delivered and network resources can be better utilized. In this paper, we analyze the impact of scalability on various QoS parameters for MANET routing protocols: one proactive protocol (DSDV) and two prominent on-demand source-initiated routing protocols. The performance metrics comprise QoS parameters such as packet delivery ratio, end-to-end delay, routing overhead, throughput and jitter. The effect of scalability on these QoS parameters is analyzed by varying the number of nodes, packet size, time interval between packets and mobility rates.
Keywords: MANETs, Scalability, QoS, Routing Protocols.

1. Introduction

Mobile Ad hoc networks (MANETs) are self-created and self-organized by a collection of mobile nodes, interconnected by multi-hop wireless paths in a strictly peer-to-peer fashion [1]. The increase in multimedia and military application traffic has led to extensive research focused on achieving QoS guarantees in current networks. The goal of QoS provisioning is to achieve a more deterministic network behavior, so that information carried by the network can be better delivered and network resources can be better utilized. The QoS parameters differ from application to application; e.g., in the case of multimedia applications, bandwidth, delay jitter and delay are the key QoS parameters [2]. After receiving a QoS service request, the main challenge is routing with scalable performance when deploying large-scale MANETs. Scalability can refer to the capability of a system to increase total throughput under an increased load [3]. Many protocols have been proposed, but few comparisons have been made with respect to scalability. The routing protocols Dynamic Source Routing (DSR), Ad hoc On-demand Distance Vector (AODV) and Temporally Ordered Routing Algorithm (TORA) have been analyzed theoretically and through simulation using the Optimized Network Engineering Tools (OPNET), varying node density and number of nodes [4]. The effect of network scalability on Genetic Algorithm based Zone Routing Protocols under a varying number of nodes is analyzed in [5]. In [6], simulations were conducted to investigate the scalability of the DSR, AODV and LAR routing protocols using a prediction-based link availability model. The modified DSR (MDSR) proposed in [7] has less overhead and delay compared to conventional DSR, irrespective of network size. In [8] a simulation-based comparative study of AODV, DSR, TORA and DSDV was reported, highlighting that DSR and AODV achieved good performance at all mobility speeds, whereas DSDV and TORA perform poorly under high speeds and high load conditions respectively. In [9] it was shown that the proactive protocols have the best end-to-end delay and packet delivery fraction, but at the cost of a higher routing load. In [10] three routing protocols were evaluated in city traffic scenarios, and it was shown that AODV outperforms both DSR and the proactive protocol FSR. In [11] a simulation study of AODV, DSR and OLSR showed that AODV and DSR outperform OLSR at higher speeds and lower numbers of traffic streams, and that OLSR generates the lowest routing load. In [12] a more limited study was conducted, favoring DSR in terms of packet delivery fraction and routing overhead, whereas OLSR shows the lowest end-to-end delay at lower network loads. In [13] a simulation-based performance comparison of DSDV, AODV and DSR is
done on the basis of packet delivery ratio, throughput, end-to-end delay and routing overhead, by varying packet size, time interval between packets and mobility of nodes, on 25 nodes using NS-2.34. In [14] the authors performed a realistic comparison between two MANET protocols, AODV (a reactive protocol) and DSDV (a proactive protocol). It is found that the performance of the AODV protocol is better than that of the DSDV protocol in terms of PDF, average end-to-end delay, packet loss and routing overhead, both for a fixed and for a varying number of nodes, which helps in improving the scalability of MANETs. In [15] the authors evaluated the scalability of on-demand ad hoc routing protocols with up to 10,000 nodes. To improve the performance of on-demand protocols in large networks, five modification combinations were separately incorporated into an on-demand protocol, and their respective performance was studied. It was shown that the use of local repair is beneficial in increasing the number of data packets that reach their destinations. Expanding ring search and query localization techniques seem to further reduce the amount of control overhead generated by the protocol, by limiting the number of nodes affected by route discoveries, although the performance improvements of the modifications have only been demonstrated with the AODV protocol. In [16] the authors proposed an effective and scalable AODV (called AODV-ES) for Wireless Ad hoc Sensor Networks (WASN), using a third-party reply model, an n-hop local ring and time-to-live based local recovery. The goal of that work is to reduce the time delay for delivery of the data packets and the routing overhead, and to improve the data packet delivery ratio. The resulting algorithm, AODV-ES, is then simulated with NS-2 under the Linux operating system. The performance of the routing protocol is evaluated under various mobility rates, and the proposed routing protocol is found to be better than AODV. Moreover, as noted in [17], most current routing protocols assume homogeneous networking conditions where all nodes have the same capabilities and resources. Although homogeneous networks are easy to model and analyze, they exhibit poor scalability compared with heterogeneous networks that consist of different nodes with different resources. The authors study simulations of DSR, AODV, LAR1, FSR and WRP in homogeneous and heterogeneous networks. The results showed that while all these protocols perform reasonably well in homogeneous networking conditions, their performance suffers significantly over heterogeneous networks.

In this paper, the impact of scalability on QoS parameters ... metrics; in Section 4, simulation results and analysis are discussed, and Section 5 concludes the paper.

2. Overview of Routing Protocols

Routing protocols for MANETs have been classified, according to their strategies for discovering and maintaining routes, into three classes: proactive, reactive and hybrid [18].

Destination-Sequenced Distance Vector (DSDV): DSDV is a table-driven routing [9] scheme for MANETs. The Destination-Sequenced Distance-Vector (DSDV) Routing Algorithm is based on the idea of the classical Bellman-Ford Routing Algorithm, with certain improvements. Every mobile station maintains a routing table that lists all available destinations, the number of hops to reach each destination and the sequence number assigned by the destination node. The sequence number is used to distinguish stale routes from new ones and thus avoid the formation of loops.

Dynamic Source Routing (DSR): DSR is an on-demand protocol designed to restrict the bandwidth consumed by control packets in ad hoc wireless networks by eliminating the periodic table-update messages required in the table-driven approach [19]. The major difference between this and other on-demand routing protocols is that it is beacon-less and hence does not require the periodic hello packet (beacon) transmissions which are used by a node to inform its neighbors of its presence. The basic approach of this protocol (and all other on-demand routing protocols) during the route construction phase is to establish a route by flooding Route Request packets in the network. The destination node, on receiving a Route Request packet, responds by sending a Route Reply packet back to the source, which carries the route traversed by the received Route Request packet.

Ad hoc On-demand Distance Vector (AODV): The AODV routing protocol is also based upon distance vectors, and uses destination sequence numbers to determine the freshness of routes. AODV minimizes the number of broadcasts by creating routes on demand, as opposed to DSDV, which maintains the list of all the routes. To find a path to the destination, the source broadcasts a route request packet. The neighbors in turn broadcast the packet to their neighbors until it reaches an intermediate node that has recent route information about the destination, or until it reaches the destination. A node discards a route request packet that it has already seen. The route request packet uses sequence numbers to ensure that the routes are loop free and to make sure that if the intermediate nodes reply
such as packet delivery ratio, end to end delay, routing to route requests, they reply with the latest information
overhead, throughput and jitter has been analyzed by only.
varying number of nodes, packet size, time interval
between packets & mobility rates. The rest of paper is
3. QoS Based Performance Metrics
organized as follow. In section 2, gives an overview of The performance metrics includes the following QoS
routing protocols, section 3 describe the performance parameters such as PDR (Packet Delivery Ratio),
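The on-demand route discovery shared by DSR and AODV can be sketched in a few lines. The following Python fragment is an illustrative flood of route requests over a toy topology; the `discover_route` helper, node names and graph are hypothetical, not from the paper:

```python
# Hypothetical sketch of on-demand route discovery: a Route Request (RREQ)
# is flooded hop by hop, nodes discard RREQs they have already seen, and the
# destination answers with a Route Reply carrying the traversed route.
from collections import deque

def discover_route(graph, source, dest):
    """Breadth-first flood of a route request; each node rebroadcasts once.

    graph: dict mapping node -> iterable of neighbor nodes.
    Returns the route recorded by the first RREQ to reach `dest`, or None.
    """
    seen = {source}                      # nodes discard RREQs already seen
    queue = deque([[source]])            # each entry is the path the RREQ took
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dest:                 # destination answers with a Route Reply
            return path                  # carrying the traversed route
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

topology = {"S": ["A", "B"], "A": ["S", "C"], "B": ["S", "C"],
            "C": ["A", "B", "D"], "D": ["C"]}
print(discover_route(topology, "S", "D"))  # a shortest RREQ path
```

A real AODV implementation would additionally carry destination sequence numbers in each request and let an intermediate node with a fresh enough route answer directly; the sketch keeps only the flooding, duplicate-discard and route-reply ideas.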
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 489
Packet Delivery Ratio (PDR): the ratio of the data packets delivered to the destinations to those generated by the CBR sources. This metric characterizes both the completeness and correctness of the routing protocol, as well as its reliability.

PDR = ( sum_{i=1..n} CBRrece / sum_{i=1..n} CBRsent ) x 100
Average End-to-End Delay: the average time taken by a data packet to travel from the source node to the destination node. It is the ratio of the total delay to the number of packets received.

Avg_End_to_End_Delay = sum_{i=1..n} (CBRrecetime - CBRsenttime) / sum_{i=1..n} CBRrece

Throughput: the ratio of the total number of delivered (received) data packets to the total duration of the simulation.

Throughput = sum_{i=1..n} CBRrece / simulation time

Routing Load: the ratio of the total number of routing (RTR) packets transmitted to the total number of received data packets at the destination.

Routing_Load = RTRPacket / CBRrece

Jitter: the standard deviation of the packet delay.

Simulation parameters: number of nodes 25, 50, 75, 100; simulation time 10 sec; queue length 50.

(Figure 2: Packets Received vs. Simulation Time (sec), number of nodes=25, packet size=500 bytes, interval=0.15 sec, Mobility=1000.)

In scenario 01, Figure 2 shows that the number of packets received in AODV and DSR is higher than in DSDV. The results in Table 3 show that PDR, throughput and end-to-end delay are the same in AODV and DSR and better than in DSDV. Routing load is minimum in AODV. Jitter is lower in DSDV than in AODV and DSR, but its throughput and PDR are also very low.

Table 3 (Performance Matrix, number of nodes=25, packet size=500 bytes, interval=0.15 sec, Mobility=1000) and Table 4 (packet size=500 bytes, interval=0.15 sec, Mobility=1000) list Packets Sent/Received, PDR, End-to-End Delay, Throughput, Routing Load and Jitter (sec) for AODV, DSDV and DSR; the DSR row of Table 4 reads 60/51 85.00 5.83 5.66 7.60 176.09.
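As a rough illustration of how the metrics defined above can be computed from simulation traces, the sketch below assumes per-packet CBR send/receive timestamps are already extracted; the `qos_metrics` helper and its field names are hypothetical (NS-2 trace parsing is omitted):

```python
# Illustrative computation of the QoS metrics defined above, assuming we have
# per-packet CBR send/receive timestamps (field names are hypothetical).
def qos_metrics(sent_times, recv_times, rtr_packets, sim_time):
    """sent_times/recv_times: dicts packet_id -> timestamp (sec)."""
    delivered = [p for p in sent_times if p in recv_times]
    pdr = 100.0 * len(delivered) / len(sent_times)               # Packet Delivery Ratio (%)
    delays = [recv_times[p] - sent_times[p] for p in delivered]
    avg_delay = sum(delays) / len(delays)                        # average end-to-end delay
    throughput = len(delivered) / sim_time                       # delivered packets per second
    routing_load = rtr_packets / len(delivered)                  # RTR packets per delivered packet
    jitter = (sum((d - avg_delay) ** 2 for d in delays) / len(delays)) ** 0.5  # std. dev. of delay
    return pdr, avg_delay, throughput, routing_load, jitter

sent = {1: 0.0, 2: 1.0, 3: 2.0, 4: 3.0}
recv = {1: 0.5, 2: 1.5, 3: 2.7}                                  # packet 4 was lost
pdr, delay, thr, load, jit = qos_metrics(sent, recv, rtr_packets=30, sim_time=10.0)
print(pdr, round(delay, 3), thr)  # 75.0 0.567 0.3
```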
AODV achieves minimum routing load and jitter compared with DSR. We have also analyzed that in DSDV, jitter and end-to-end delay are low compared with AODV and DSR, but its throughput and number of received packets are also low. The performance of AODV is best, as four QoS parameters out of six have favourable results, as indicated in Table 4.

Table 10 (Performance Matrix, number of nodes=100, packet size=500 bytes, interval=0.015 sec, Mobility=1000); columns: Packets Sent/Received, PDR, End-to-End Delay, Throughput, Routing Load, Jitter:
AODV 600/208 34.66 4.64 23.11 8.45 89.00
DSDV 600/64 10.66 2.09 7.11 9.37 14.35
DSR 600/113 18.83 1.97 12.55 14.94 45.8
An adjacent performance matrix (Packets Sent/Received, PDR, End-to-End Delay, Throughput, Routing Load, Jitter) lists:
AODV 60/12 20.00 1.85 1.33 7.08 141.34
DSDV 60/7 11.66 2.08 0.77 8.57 106.66
DSR 60/13 21.66 1.99 1.44 23.15 156.70

(Figure 13: Packets Received vs. Simulation Time (sec), number of nodes=100, packet size=1000 bytes.)
(Figure 14: Packets Received vs. Simulation Time (sec), number of nodes=25, packet size=1000 bytes, interval=0.015 sec, Mobility=1000.)
(Figure 17: Packets Received vs. Simulation Time (sec), number of nodes=100, packet size=1000 bytes, interval=0.015 sec, Mobility=1000.)

Table 12 (Performance Matrix, number of nodes=50, packet size=1000 bytes, interval=0.15 sec, Mobility=1000) includes the rows 60/59 98.33 5.76 6.55 7.61 176.60 and DSDV 60/7 11.66 2.06 0.77 8.57 107.03, with a DSR row 60/12 20.00 2.41 1.33 60.08 712.08.
AODV receives more packets compared to DSR and DSDV when the number of nodes is scaled from 25 through 50 and 75 to 100. AODV also has the highest PDR and throughput, with minimum routing load and jitter relative to DSR. We have also analyzed that in DSDV, jitter and end-to-end delay are low compared with AODV and DSR.

(Figure 21: Packets Received vs. Simulation Time (sec), number of nodes=100, packet size=500 bytes, interval=0.15 sec, Mobility=2000.)
Table 26 (interval=0.15 sec, Mobility=2000); columns: Packets Sent/Received, PDR, End-to-End Delay, Throughput, Routing Load, Jitter:
AODV 600/230 38.33 4.30 25.55 6.23 80.04
DSDV 600/62 10.33 2.07 6.88 9.67 14.33
DSR 600/104 17.33 1.77 11.55 17.09 80.04

Table 30 (same columns):
AODV 60/58 96.66 5.61 6.44 5.36 150.84
DSDV 60/7 11.66 2.06 0.77 8.57 107.03
DSR 60/22 36.66 5.43 2.44 20.90 618.23

(Figure 25: Packets Received vs. Simulation Time (sec), number of nodes=100, packet size=500 bytes.)
These results are shown in Table 25 and Table 26. In scenario 07, Figure 26 shows that when the number of nodes is 25, the number of packets received in AODV and DSR is equal; when the number of nodes is scaled from 50 through 75 to 100, the number of received packets drops and the performance of DSR degrades. The overall results are indicated in Table 29 and Table 30.

Table 27 (Performance Matrix, number of nodes=25, packet size=1000 bytes); columns: Packets Sent/Received, PDR, End-to-End Delay, Throughput, Routing Load, Jitter:
AODV 60/11 18.33 1.76 1.22 20.36 122.7
DSDV 60/7 11.66 2.08 0.77 8.57 106.6
DSR 60/11 18.33 1.76 1.22 9.18 122.7

Table 28 (Performance Matrix, number of nodes=50, packet size=1000 bytes; same columns):
AODV 60/58 96.66 5.57 6.44 5.81 151.02
DSDV 60/6 10.00 2.13 0.66 10.00 100.02
DSR 60/36 60.00 6.75 4.00 17.75 595.09
References

2 Radar Department, M.T.C. College, Cairo, Egypt
3 Electronics and Communication Engineering Department, Helwan University, Cairo, Egypt
4 Electronics and Communication Engineering Department, Helwan University, Cairo, Egypt
measurements to be considered. In the data association process, the gating technique [1] used in tracking a maneuvering target in clutter is essential to make the subsequent algorithm efficient, but it suffers from problems since the gate size itself determines the number of valid included measurements. Another problem arises in the case of tracking multiple targets: data association becomes more difficult because one measurement can be validated by multiple tracks, in addition to a track validating multiple measurements as in the single-target case. To solve these problems, alternative approaches known as nearest neighbor data association (NNDA) [2-5], probabilistic data association (PDA) [6,7], joint probabilistic data association (JPDA) [7,8], multiple hypothesis tracking (MHT) [9], etc. have been used to track multiple targets by evaluating the measurement-to-track association probabilities with different methods to find the state estimate [10-12]. NNDA, which depends only on choosing the nearest valid measurement to the predicted target position, has been widely used in real systems because of its low calculation cost, but it readily mis-tracks in dense cluttered environments. PDA, JPDA and MHT need prior knowledge, and some of them have a large calculation burden [13-16]. We propose here an extended algorithm applied to conventional NNDA to be able to track multiple targets in a dense clutter environment. The proposed algorithm is more accurate in choosing the true measurement originated from the target, with a lower probability of error and less sensitivity to false-alarm targets in the gate region than the NNDA algorithm. Depending on the basic principle of the moving target indicator (MTI) filter used in radar signal processing [16-20], which removes fixed targets and targets moving with low velocity whose moving distance is lower than a specified threshold value, the proposed algorithm reduces the number of candidate measurements in the gate by an MTI filtering method that compares the moving-distance measure for each measurement in the current gate at the update step with all previous measurements in the same gate at the prediction step, and then discards any measurement in the current gate that moves a distance less than the threshold value. Thus, decreasing the number of candidate measurements in the current gate decreases the probability of error in the data association process. The main key to detecting a moving or fixed false target is the innovation parameter, which measures the moving distance between the current measurement and the predicted target position. By calculating this parameter for all measurements in the current gate and comparing with the scanned previous measurements in the same gate, the optimum innovation of the candidate measurement is obtained. This is called the optimum innovation data association (OI-DA) method, which is combined with the NNDA algorithm to apply the proposed algorithm to multi-target tracking in the presence of various clutter densities. Simulation results showed better performance when compared to the two conventional NNKF and JPDA algorithms.

2. Background

2.1 Kalman Filter Theory

Based on Kalman filter estimation [21], we list the filter model. The dynamic state and measurement model of target t can be represented as follows:

x_t(k) = A_t(k-1) x_t(k-1) + w_t(k-1),  t = 1, 2, ..., T   (1)

z_t(k) = H_t(k) x_t(k) + v_t(k),  t = 1, 2, ..., T   (2)

where x_t(k-1) is the n x 1 target state vector. This state can include the position and velocity of the target in space, x = (x, y, x', y'). The initial target state x_t(0), for t = 1, 2, ..., T, is assumed to be Gaussian with mean m_0^t and known covariance matrix P_0^t, where the unobserved signal (hidden states) {x_t(k) : k in N}, x_t(k) in X, is modeled as a Markov process with transition probability p(x_t(k) | x_t(k-1)) and initial distribution p(x_t(0)) = N(x_t(0); m_0^t, P_0^t). z_t(k) is the m x 1 measurement vector, A_t(k-1) denotes the state transition matrix, H_t(k) denotes the measurement matrix, and w_t(k-1) and v_t(k) are mutually independent white Gaussian noises with zero mean and covariance matrices Q(k-1) and R(k), respectively.

The innovation mean (residual error) of measurement z_i(k) is given by

V_t^i(k) = z_i(k) - z^_t(k)   (3)

where

z^_t(k) = H_t(k) m-_t(k)   (4)

and the predicted state mean and covariance are defined as m-_t(k) = A_t(k) m_t(k-1) and

P-_t(k) = A_t(k) P_t(k-1) A_t(k)^T + Q   (5)

Then, we can update the state.
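The predict/update cycle behind Eqs. (1)-(5) can be sketched as follows. The constant-velocity matrices and noise levels below are illustrative assumptions for a state x = (x, y, vx, vy), not values from the paper:

```python
# A minimal sketch of the Kalman predict/update cycle of Eqs. (1)-(5).
import numpy as np

def kf_predict(m, P, A, Q):
    """Predicted mean and covariance: m- = A m, P- = A P A^T + Q (Eq. 5)."""
    return A @ m, A @ P @ A.T + Q

def kf_update(m_pred, P_pred, z, H, R):
    """Update with measurement z; the innovation v = z - H m- drives the correction."""
    v = z - H @ m_pred                      # innovation (residual error), Eqs. (3)-(4)
    S = H @ P_pred @ H.T + R                # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
    return m_pred + K @ v, P_pred - K @ S @ K.T

dt = 1.0
A = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1.0]])
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0.0]])       # position-only measurement
Q, R = 0.01 * np.eye(4), 0.5 * np.eye(2)
m, P = np.zeros(4), np.eye(4)
m_pred, P_pred = kf_predict(m, P, A, Q)
m, P = kf_update(m_pred, P_pred, np.array([1.0, 0.5]), H, R)
print(np.round(m[:2], 3))  # updated position pulled toward the measurement
```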
A measurement z_i(k) is validated if it satisfies

[z_i(k) - H_t(k) m-_t(k)]^T S_t(k)^(-1) [z_i(k) - H_t(k) m-_t(k)] <= gamma   (10)

where gamma denotes the correlation gate. If there is only one such measurement, it can be used for the track update directly; otherwise, if there is more than one measurement, we need to calculate the equivalent measurement.

2.2 Nearest Neighbor Kalman Filter

The NNKF is theoretically the simplest single-scan recursive tracking algorithm. It consists of a discrete-time Kalman filter (KF) together with a measurement selection rule. The NNKF takes the KF state estimate x(k-1 | k-1) and its error covariance P(k-1 | k-1) at time k-1 and linearly predicts them to time k. The prediction is then used to determine a validation gate in the measurement space, based on the measurement prediction z^_t(k | k-1) and its covariance S(k). When more than one measurement z_i(k) falls inside the gate, the one closest to the prediction is used to update the filter. The metric used is the chi-squared distance:

D_i^2 = V_t^i(k)^T S_t(k)^(-1) V_t^i(k) = [z_i(k) - z^_t(k)]^T S_t(k)^(-1) [z_i(k) - z^_t(k)]   (11)

The resulting assignment problem may be solved by algorithms based on shortest augmenting paths [24]. The algorithm yields associations that enable tracks to be updated with their assigned measurements; tracks not receiving a measurement are predicted but not updated.

3. Optimum Innovation Data Association

The NNKF suffers when tracking in a dense clutter environment, and its performance is degraded with many lost tracks; accordingly, a new suboptimal algorithm, optimum innovation data association (OI-DA), is introduced to increase the tracking performance and to be able to track maneuvering targets in heavy clutter. The main idea is based on distinguishing between the clutter measurements in the gate of the predicted target and the measurement originated from the moving target, using two successive scans. The measurements at time k-1 that lie in the gate of the predicted target position (predicted to time k) are processed by the following method together with the measurements at time k that lie in the same gate, to obtain the optimum innovation corresponding to the distance metric between the true target measurement and the predicted target.
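The NNKF gating-and-selection rule of Eqs. (10)-(11) can be sketched as below; the gate threshold value and the example measurements are assumed for illustration:

```python
# Sketch of the NNKF measurement selection rule (Eq. 11): among measurements
# falling inside the chi-squared validation gate, pick the one nearest to the
# predicted measurement. The gate threshold `gamma` is an assumed parameter.
import numpy as np

def nearest_in_gate(z_pred, S, measurements, gamma=9.21):
    """Return the gated measurement minimizing D^2 = v^T S^-1 v, or None if gate empty."""
    S_inv = np.linalg.inv(S)
    best, best_d2 = None, gamma
    for z in measurements:
        v = z - z_pred                      # innovation for this candidate
        d2 = float(v @ S_inv @ v)           # chi-squared (Mahalanobis) distance
        if d2 <= best_d2:
            best, best_d2 = z, d2
    return best

z_pred = np.array([0.0, 0.0])
S = np.eye(2)
cands = [np.array([3.5, 0.0]), np.array([0.4, 0.3]), np.array([1.0, -1.0])]
print(nearest_in_gate(z_pred, S, cands))  # closest validated measurement: [0.4 0.3]
```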
(A figure here sketches the gate region around the predicted target, containing candidate points z_1(k), ..., z_5(k) at time k and z_1(k-1), ..., z_5(k-1) at time k-1.)

In the prediction step, consider the candidate points detected in the t-th gate G_t(k-1) of the predicted position z^_t(k), whose elements are a subset of the set Z(k-1), where i = 1 to mi (the number of detected points in gate G_t(k-1) at time k-1), and let Z_t(k-1) be the set of all valid points z_i(k-1) that satisfy the distance measure condition

[z_i(k-1) - z^_t(k)]^T S_t(k | k-1)^(-1) [z_i(k-1) - z^_t(k)] <= gamma   (12)

for each target t, where gamma is a threshold value that determines the gate size; i is initialized to 1 and increased by i = i + 1 after each valid point is detected, up to the last of the mi detected points.

In the updating step, let Z(k) = {z_1(k), ..., z_j(k), ..., z_mj(k)} be the set of points in the 2-D Euclidean space at time k, where j = 1 to mj (the number of detected points in the t-th gate at time k), and let Z_t(k) be the set of all valid points z_j(k) that satisfy the distance measure condition

[z_j(k) - z^_t(k)]^T S_t(k | k-1)^(-1) [z_j(k) - z^_t(k)] <= gamma   (13)

The innovation means are calculated for every point j in the gate:

vx_j(k) = zx_j(k) - HZx(k)
vy_j(k) = zy_j(k) - HZy(k),  j = 1, 2, ..., mj   (14)

Each point j in G_t(k) has a nearest point i in G_t(k-1), found by calculating the minimum absolute difference values (dvx_j, dvy_j) and their indices (vxI_j, vyI_j) between the calculated innovation means for all points i at each point j:

dvx_j = min_{i=1,...,mi} | vx_j(k) - vx_i(k-1) |
dvy_j = min_{i=1,...,mi} | vy_j(k) - vy_i(k-1) |   (15)

vxI_j = arg min_{i=1,...,mi} | vx_j(k) - vx_i(k-1) |
vyI_j = arg min_{i=1,...,mi} | vy_j(k) - vy_i(k-1) |   (16)
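A minimal sketch of the innovation matching in Eqs. (14)-(16), with hypothetical innovation values (the `match_innovations` helper is illustrative, not the paper's code):

```python
# Sketch of Eqs. (15)-(16): for each gated point j at time k, find the
# previous-scan point i whose innovation component is closest, separately
# in x and y, returning the minimum differences and their argmin indices.
def match_innovations(vx_k, vy_k, vx_prev, vy_prev):
    """Return per-point (dvx_j, dvy_j, vxI_j, vyI_j) as in Eqs. (15)-(16)."""
    results = []
    for vxj, vyj in zip(vx_k, vy_k):
        dx = [abs(vxj - vxi) for vxi in vx_prev]     # |vx_j(k) - vx_i(k-1)|
        dy = [abs(vyj - vyi) for vyi in vy_prev]     # |vy_j(k) - vy_i(k-1)|
        results.append((min(dx), min(dy),
                        dx.index(min(dx)), dy.index(min(dy))))  # argmins
    return results

# Two current-gate points matched against three previous-scan points.
out = match_innovations([0.2, 1.5], [0.1, -0.8], [0.25, 1.0, 2.0], [0.0, -1.0, 0.5])
print(out)
```

When the x-index and y-index agree (vxI_j = vyI_j), the point is treated as a consistently moving candidate; disagreeing indices flag a likely false target, as used in the case analysis below.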
From the candidate points under consideration we detect how many points mI represent clutter points (i.e. the corresponding measurements j are not valid and are excluded from the data association process) and how many points mV represent target points (i.e. the corresponding measurements j are valid, and one of them has the optimum index found by the data association process). The data association process takes the optimum innovation mean (vx_opt, vy_opt) directly when the number of detected points mV is one, which is the normal case when the target exists and the remaining points represent clutter (invalid points):

vx_opt = vx_j(k)
vy_opt = vy_j(k)   (17)

Another case in which the data association process takes the optimum innovation mean (vx_opt, vy_opt) directly, without entering the innovation-mean calculation model, is when the target exists with no clutter, i.e. the calculated number of detected points mj in G_t(k) is one:

vx_opt = vx_j(k)
vy_opt = vy_j(k),  where j = 1   (18)

Two special cases may occur according to the scenario in the application assignment.

The first case: the gate contains more than one moving target and mV > 1 as a result of the data association process. The optimum innovation mean (vx_opt, vy_opt) is calculated using the NNDA as follows:

j* = arg min_{j=1,...,mV} ( vx_j(k)^2 + vy_j(k)^2 )   (19)

vx_opt = vx_j*(k)
vy_opt = vy_j*(k)   (20)

The second case: all candidates are flagged as clutter (mV = 0, mI = mj). The optimum innovation mean (vx_opt, vy_opt) is calculated by selecting the measurement that has the maximum change in distance under the condition (dvx_j, dvy_j):

j* = arg max_{j=1,...,mI} ( dvx_j^2 + dvy_j^2 )   (21)

vx_opt = vx_j*(k)
vy_opt = vy_j*(k)   (22)

Lastly, the target may not be detected in the gate (missed) and all measurements are considered to be false targets. In this case, the updated target is assigned to the predicted target position and no innovation mean value is required, i.e.

vx_opt = 0
vy_opt = 0   (23)

Finally, we obtain the optimum innovation mean related to the true selected target with a decreased probability of error, which is used to update the target to the correct position. Reducing the number of valid points in the t-th gate by marking false measurements as invalid (i.e. not including them in the data association process) increases the probability of choosing the true measurement originated from the target and improves the data association process.

4. Implementation of Optimum Innovation Data Association (OI-DA) using the Kalman Filter

We propose an algorithm which depends on the history of observations for one scan and uses an innovation-mean calculation with a fixed threshold to obtain the optimum innovation mean related to the association pairing between the chosen measurement and the track (predicted target), which is used in the update of the target state estimate.
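The case analysis of Eqs. (17)-(23) reduces to a small selection function. The sketch below is illustrative: the `optimum_innovation` helper and its tuple inputs are hypothetical, and the mV = 0 branch is simplified to maximize over stored innovation pairs:

```python
# Hedged sketch of the case analysis in Eqs. (17)-(23): how the optimum
# innovation (vx_opt, vy_opt) is chosen from the mV valid (moving-target)
# candidates, or zeroed when the gate holds no usable measurement.
def optimum_innovation(valid, invalid):
    """valid/invalid: lists of (vx_j, vy_j) innovation-mean pairs."""
    if len(valid) == 1:                      # Eqs. (17)-(18): single true target
        return valid[0]
    if len(valid) > 1:                       # Eqs. (19)-(20): nearest neighbor of valid set
        return min(valid, key=lambda v: v[0] ** 2 + v[1] ** 2)
    if invalid:                              # Eqs. (21)-(22): mV = 0, pick the candidate
        return max(invalid, key=lambda v: v[0] ** 2 + v[1] ** 2)  # with max distance change
    return (0.0, 0.0)                        # Eq. (23): missed detection, keep prediction

print(optimum_innovation([(0.3, 0.4), (1.0, 1.0)], []))   # -> (0.3, 0.4)
print(optimum_innovation([], []))                          # -> (0.0, 0.0)
```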
In conventional data association approaches with a fixed threshold, all observations lying inside the reconstructed gate are considered in the association. The gate may contain a large number of observations due to heavy clutter, leading to an increased probability of error in associating target-originated measurements. In our proposed algorithm, moving target indicator (MTI) filtering is used to decrease the number of observations in the gate by dividing the observations into valid ones, representing moving targets, and invalid ones, representing fixed or false targets; only the valid observations are considered in the association. The proposed OI-DA using the Kalman filter is presented in Algorithm 1.

Algorithm 1: OI-DA with the Kalman filter
1. For each target t, do the prediction step:
   m-_t(k) = A_t(k) m_t(k-1)
   P-_t(k) = A_t(k) P_t(k-1) A_t(k)^T + Q
2. Calculate the optimum innovation mean V_opt(k) by the OI-DA procedure described in Algorithm 2.
3. Do the update step:
   S_t(k) = H_t(k) P-_t(k) H_t(k)^T + R(k)
   K_t(k) = P-_t(k) H_t(k)^T S_t(k)^(-1)
   m_t(k) = m-_t(k) + K_t(k) V_opt(k)
   P_t(k) = P-_t(k) - K_t(k) S_t(k) K_t(k)^T
4. end for

Algorithm 2: Calculate V_opt(k) by OI-DA
1. Find the validated region for measurements at time k-1: Z_t(k-1) = { z_i(k-1) }, i = 1, ..., mi, by accepting only those measurements that lie inside gate t:
   [z_i(k-1) - H_t(k) m-_t(k)]^T S_t(k)^(-1) [z_i(k-1) - H_t(k) m-_t(k)] <= gamma
2. Find the validated region for measurements at time k: Z_t(k) = { z_j(k) }, j = 1, ..., mj, by accepting only those measurements that lie inside gate t:
   [z_j(k) - H_t(k) m-_t(k)]^T S_t(k)^(-1) [z_j(k) - H_t(k) m-_t(k)] <= gamma
   where S_t(k) = H_t(k) P-_t(k) H_t(k)^T + R.
3. Calculate the innovation means for all measurements lying inside gate t at times k-1 and k, respectively:
   vx_i(k-1) = zx_i(k-1) - HZx(k); vy_i(k-1) = zy_i(k-1) - HZy(k), i = 1, 2, ..., mi
   vx_j(k) = zx_j(k) - HZx(k); vy_j(k) = zy_j(k) - HZy(k), j = 1, 2, ..., mj
4. Calculate the minimum absolute differences and their indices:
   dvx_j = min_{i=1,...,mi} | vx_j(k) - vx_i(k-1) |; dvy_j = min_{i=1,...,mi} | vy_j(k) - vy_i(k-1) |
   vxI_j = arg min_{i=1,...,mi} | vx_j(k) - vx_i(k-1) |; vyI_j = arg min_{i=1,...,mi} | vy_j(k) - vy_i(k-1) |
5. Count the mI invalid measurements (false targets), for which vxI_j is not equal to vyI_j, and the mV valid measurements (true moving targets), for which vxI_j = vyI_j.
   - Calculate the optimum innovation v_opt = (vx_opt, vy_opt) directly in case (mV = 1, j = index(mV)) or (mj = 1, j = 1):
     vx_opt = vx_j(k); vy_opt = vy_j(k)
   - Choose the NN of the mV valid measurements as the optimum innovation v_opt = (vx_opt, vy_opt) in case (mV > 1, j = index(mV)):
     j* = arg min_{j=1,...,mV} ( vx_j(k)^2 + vy_j(k)^2 )
     vx_opt = vx_j*(k); vy_opt = vy_j*(k)
   - Choose as the optimum innovation v_opt = (vx_opt, vy_opt) the measurement that has the maximum change in distance under the condition (dvx_j, dvy_j), in case mV = 0, mI = mj, j = index(mI):
     j* = arg max_{j=1,...,mI} ( dvx_j^2 + dvy_j^2 )
     vx_opt = vx_j*(k); vy_opt = vy_j*(k)
   - Otherwise, the optimum innovation is set to vx_opt = 0, vy_opt = 0.
6. Conclusions

From the results obtained in the simulations for multi-target tracking, it can be seen that at low clutter density (high SNR), all the tracking algorithms (NNKF, JPDAF and OI-DA) are able to track the targets. However, at
and the moving true targets are considered valid during the data association process. The OI-DA algorithm overcomes the NNKF problem of losing track of targets in dense clutter environments and has the advantage of low computational cost over JPDAF. By using this new approach, we can obtain smaller validated measurement regions while improving the performance of the data association process, which has been shown to give targets the ability to continue being tracked in dense clutter.

(Fig. 3: X- and Y-trajectories showing successful tracking of maneuvering multi-targets (3 targets; + symbols mark tracked target positions and solid lines the true target paths) moving in low clutter, using the 3 approaches: (a) NNKF, (b) JPDAF, (c) OI-DA.)

(Fig. 4: The state of tracking 3 targets moving in different clutter densities using the 3 approaches: NNKF in (a), (b); JPDAF in (c), (d); and OI-DA in (e), (f). Images (a), (c), (e) show tracking in medium clutter and images (b), (d), (f) show tracking in dense clutter.)
(Fig. 5: X- and Y-trajectories showing the state of tracking 3 targets in medium clutter (+ symbols refer to tracked target positions and solid lines to true target paths) using the 3 approaches: (a) NNKF and (b) JPDAF lose track, while (c) OI-DA maintains the tracks.)

(Fig. 6: X- and Y-trajectories showing the state of tracking 3 targets in dense clutter (+ symbols and solid lines refer to tracked target positions and true target paths, respectively) using the 3 approaches: (a) NNKF and (b) JPDAF lose track, while (c) OI-DA maintains the tracks.)
(Fig. 7: The root mean square error (RMSE) for each of the 3 targets separately over the frame number (each frame takes 4 sec / one scan) for the 3 approaches: (a) low clutter, (b) medium clutter and (c) dense clutter. From (b) and (c), the RMSE remains minimal for the proposed OI-DA, with less sensitivity to dense clutter.)
References

2 Dept. of Applied Physics & Electronic Engineering, Rajshahi University, Rajshahi, 6205, Bangladesh
Setting the traffic flows in this manner aims at greater interference impact when sessions overlap. The source node and the destination node of each traffic flow are chosen randomly using the cbrgen.tcl script. Another reason could be that, with the QAODV routing protocol, the number of transmitted routing packets is larger than the number of routing packets transmitted with the AODV routing protocol. In the QAODV routing protocol, all nodes use Hello messages to exchange
information with their neighbors. Routing packets, including Hello messages, have higher priority and are always transmitted first, while data packets are queued at the nodes. With the AODV routing protocol, when the traffic in the network is low, no matter which route a traffic flow chooses, the chosen route can provide enough data rate most of the time. As a result, the end-to-end delay with the AODV routing protocol is not high and can be lower than with the QAODV routing protocol at low data rates. If more simulation time could be taken for each data rate, comparatively more accurate results could be found. For these reasons, the end-to-end delay of QAODV is higher than that of AODV at low data rates. The average end-to-end delay of QAODV is always below 240 ms, whereas the end-to-end delay of AODV increases badly when the data rate of each traffic flow increases from 600 kbps to 1200 kbps. This shows that networks with the QAODV routing protocol can provide lower end-to-end delay for traffic flows than AODV, since QAODV always tries to find a route with a satisfying data rate. During the transmission, the QoS of the traffic is monitored by the QAODV routing protocol; once the QoS is not satisfied as promised, the traffic is stopped. All in all, with the QAODV routing protocol the average end-to-end delay stays low even when the load on the network increases to very high levels, which is not true for the AODV routing protocol. This performance is very significant for real-time traffic transmissions.

With the AODV routing protocol, routing packets are only sent during the route searching and maintenance periods, without exchanging Hello messages. The Hello messages are needed in the QAODV routing protocol in order to exchange the precisely consumed data-rate information of the nodes sharing the same channel. It is hard to explain why the routing overload increases badly when the data rate increases from 1500 kbps to 1800 kbps.

4.1.3 Packet Delivery Ratio

From Figure 5 we see that, whether we use the QAODV routing protocol or the AODV routing protocol, the packet delivery ratio decreases with the increase of the data rate of the traffic flows.

(Figure 5: Packet Delivery Ratio of AODV and QAODV vs. data rate (50, 300, 600, 900, 1200, 1500, 1800 kbps).)

4.2.1 Average end-to-end delay

As shown in Figure 6, with the increase of the maximum moving speed, the average end-to-end delay does not increase much in QAODV compared with the AODV routing protocol; this means the protocol is quite suitable for scenarios with different moving speeds.

(Figure 6: Average end-to-end delay of AODV and QAODV with different max. moving speeds (1, 5, 10, 15, 20 m/s).)

(Fig. 7: Normalized routing load with different max. moving speeds.)
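The per-hop admission idea described here can be sketched as follows; the `route_admits` helper, node names and rates are hypothetical illustrations, since the actual QAODV implementation learns consumed channel bandwidth through Hello-message exchanges:

```python
# Illustrative sketch (not from the paper) of the QoS admission idea behind
# QAODV: a route is accepted only if every hop can still supply the requested
# data rate, as reported by Hello messages from channel-sharing neighbors.
def route_admits(route, free_rate, requested_kbps):
    """free_rate: dict node -> spare data rate (kbps) reported via Hello messages."""
    return all(free_rate.get(node, 0) >= requested_kbps for node in route)

free = {"A": 900, "B": 1400, "C": 700}
print(route_admits(["A", "B", "C"], free, 600))   # True: all hops have >= 600 kbps spare
print(route_admits(["A", "B", "C"], free, 1200))  # False: node C cannot satisfy 1200 kbps
```

Under this rule a flow that cannot be admitted on any route is rejected up front, which is one way to keep the end-to-end delay of admitted flows low at the cost of extra routing traffic.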
improves the performance at the expense of sending more routing packets on the network. These packets are used to exchange the network information to help assure QoS.

4.2.4 Packet Delivery Ratio

As shown in Fig. 8, with a low maximum moving speed the packet delivery ratio of QAODV is higher than that of AODV, but as the moving speed increases its performance drops below AODV's. When the maximum moving speed reaches 20 m/s, almost half of the packets are dropped in QAODV. The reason why more packets are dropped in QAODV, and how they are dropped, has been explained in the previous part of this section.

Fig. 8 Packet delivery ratio with different Max. moving speeds.

5. Conclusion

In this research, we described the importance of QoS routing in mobile ad hoc networks, the challenges we met, and the approach we took. We discussed in detail our idea of adding QoS support to the AODV protocol. After observing the simulation and analyzing the data, we found that packets experience lower end-to-end delay with a QoS-based routing protocol when the traffic on the network is high. This low end-to-end delay is meaningful for real-time transmissions. When the traffic is relatively high, not all the routes found by the AODV routing protocol have enough free data rate to send packets while ensuring a low end-to-end delay for each packet. As a result, the QAODV protocol works well and shows its benefits when the traffic on the network is relatively high. People working in the area of ad hoc networks with the aim of improving QoS for ad hoc networks can benefit from this QAODV protocol.

References
[1] Shane Bracher, "A Mechanism for Enhancing Internet Connectivity for Mobile Ad Hoc Networks", Proceedings of the Second Australian Undergraduate Students Computing Conference, 2004.
[2] Ronan de Renesse, Mona Ghassemian, Vasilis Friderikos, A. Hamid Aghvami, "Adaptive Admission Control for Ad Hoc and Sensor Networks Providing Quality of Service", Technical Report, Center for Telecommunications Research, King's College London, UK, May 2005.
[3] H. Badis and K. Al Agha, "Quality of Service for Ad hoc Optimized Link State Routing Protocol (QOLSR)", IETF-63 Meeting, Internet Engineering Task Force, draft-badis-manet-qolsr-02.txt, Vancouver, Canada, November 2005. Draft IETF.
[4] NS manual, available at: http://www.isi.edu/nsnam/ns/ns-documentation.html.
[5] Mario Joa-Ng, "Routing Protocol and Medium Access Protocol for Mobile Ad Hoc Networks", Ph.D. Thesis (Electrical Engineering), Polytechnic University, Hong Kong, January 1999.
[6] R. Ramanathan and M. Steenstrup, "Hierarchically organized multihop mobile wireless networks for quality-of-service support", ACM/Baltzer Mobile Networks and Applications, 3(1):101-119, 1998.
[7] Z. J. Haas and S. Tabrizi, "On some challenges and design choices in ad hoc communications", Proceedings of IEEE MILCOM'98, 1998.
[8] S. Murthy and J. J. Garcia-Luna-Aceves, "An efficient routing protocol for wireless networks", ACM Mobile Networks and Applications Journal, Special Issue on Routing in Mobile Communication Networks, 1(2):183-197, 1996.
[9] C. E. Perkins, E. M. Royer, and S. R. Das, "Multicast operation of the ad hoc on-demand distance vector routing", Proceedings of Mobicom'99, pages 207-218, 1999.
[10] D. B. Johnson and D. A. Maltz, "The dynamic source routing protocol for mobile ad hoc networks", in Tomasz Imielinski and Hank Korth, editors, Mobile Computing, chapter 5, pages 153-181, Kluwer Academic Publishers, 1999.
[11] V. D. Park and M. S. Corson, "A Highly Adaptive Distributed Routing Algorithm for Mobile Wireless Networks", Proceedings of INFOCOM, pp. 1405-1413, 1997.
[12] Z. Haas, "A new routing protocol for the reconfigurable wireless networks", in Proc. of the IEEE Int. Conf. on Universal Personal Communications, 1997.
[13] P. Bose, P. Morin, I. Stojmenovic and J. Urrutia, "Routing with guaranteed delivery in ad hoc wireless networks", ACM DIALM 1999, 48-55; ACM/Kluwer Wireless Networks, 7(6):609-616, November 2001.
[14] Palaniappan Annamalai, "Comparative Performance Study of Standardized Ad-Hoc Routing Protocols and OSPF-MCDS", Virginia Polytechnic Institute and State University, October 2005.
[15] L. Xue, M. S. Leeson and R. J. Green, "Internet Connection Protocol for Ad Hoc Wireless Networks", Communications & Signal Processing Group, School of Engineering, University of Warwick, Coventry CV4 7AL, 2004.
[16] Yuan Sun, Elizabeth M. Belding-Royer, "Internet Connectivity for Ad hoc Mobile Networks",
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1, May 2011
ISSN (Online): 1694-0814
www.IJCSI.org 514
Authors
Tapan Kumar Godder received the Bachelor's, Master's and M.Phil. degrees in Applied Physics & Electronics from Rajshahi University, Rajshahi, in 1994, 1995 and 2007, respectively. He is currently an Associate Professor in the Department of ICE, Islamic University, Kushtia-7003, Bangladesh. He has twenty-one published papers in international and national journals. His areas of interest include internetworking, AI & mobile communication.
ABSTRACT: The revolution in the web world led to increasing users' needs, demands and expectations. Over time, those needs developed, starting from ordinary static pages, moving on to fully dynamic ones, and reaching the need for services and applications to be available on the web. Those demands changed the perspective of today's web to what is called a cloud of computing, which aims mainly to provide applications as services for the user. As time went by, applications alone were not enough; users needed their applications and data to be available anytime, anywhere. For these reasons, traditional operating system functionality needed to be provided as a service that integrates several applications together with users' data.
In this paper we present the detailed description, implementation and evaluation of SEWOS [1], a semantically enhanced web operating system that provides the look and feel of, and mimics, traditional desktop applications using the desktop metaphor.

Keywords: Web Operating System, Semantic, Ontology, Service Oriented Architecture.

1. INTRODUCTION
The World Wide Web has become a major delivery platform for a variety of complex and sophisticated applications in several domains. In this context, researchers investigated the ability to extend traditional web-based applications' functionality to enable users to interact with applications in much the same way as they do with desktop applications. Web operating systems were developed to provide users with an environment that closely resembles the traditional desktop environment through a web browser. They represent an advance in web utilities as they aim to provide better operational environments by moving the user's working environment into a web site, including managing his/her files and installing his applications. A web operating system can be defined as a virtual desktop on the web, accessible via a browser, with an interface designed to look like a traditional operating system with multiple integrated built-in applications that allow the user to easily manage and organize his data from any location [2]. A web operating system provides users with traditional operating system applications as services available for the user to access transparently, without any prior knowledge about where the service is available, its cost or constraints [3]. In a web operating system, applications, data files, configurations, settings and access privileges reside remotely over the network as services accessed by a web browser, which is used for input and display purposes [4].

As previously stated, the web operating system, despite its novelty, has drawn attention and many attempts have been made. WOS [3-9] was the first known web-based operating system that provided a platform enabling users to benefit from the computational potential of the web. WOS provided users with plenty of tools through a virtual desktop, using the notion of distributed computing by replicating its services between multiple interacting nodes to handle user requests. WOS consists of three major components: a graphical user interface, a resource control unit which processes user requests, and a remote resource control unit which manages requests passed from other nodes.
The interest in web operating systems and their applications in academic communities resulted in VNet, which was developed at the University of Houston and is considered an access point to campus resources. VNet included a variety of services that support students, such as desktop, admin management, contact management, file management, calendar and scheduling, and report generation services [10].
Based on the earlier work of WOS, WEBRES was developed. WEBRES investigated the aspects of resource sharing that were not addressed in WOS and presented the notion of a resource set, which makes resources persistent rather than bound to a specific user [11].
G.H.O.S.T (http://g.ho.st/vc.html), EyeOS (www.eyeos.com) and DesktopTwo (www.desktoptwo.com) are examples of systems built on the trends of web operating systems. They mimic the look, feel and functionality of the desktop environment of an operating system. Moreover, they present a variety of applications such as file management, address book, calendar and text editing. Implementing such an application requires considering users' requirements in all phases, as the final evaluation requires user participation and intervention. This paper is organized as follows: the next section presents the SEWOS general architecture. In Section 3, the implementation of SEWOS and its applications is provided. Section 4 presents the evaluation of the proposed system. Our conclusion and future work are presented in Section 5.
menu with his recent file list, his events, his favorite resources and applications. Besides the start menu, the user can start any application directly using the application icon on his desktop. Moreover, the user can start and work with multiple applications at the same time. Options to manage workspace preferences are also available and accessible through the personalized desktop.

The implementation of the SEWOS home page and personalized desktop is shown in Fig. 2. The system's home page, shown in Fig. 8.a, contains:
1- Welcome message/Log out: the system identifies the user and displays a welcome message, as an application of the aforementioned salutation personalization function. The system also gives the user the ability to log off at any time during navigation.
2- Personalized work space: this includes the user's personalized background, calendar and clock. The user can choose whether or not to display the clock and calendar, and can choose his own background using the preferences dialog.
3- Personalized start menu: the user's start menu includes four tabs, as follows:
   1- Recent tab: contains the user's personalized book-marking, displaying a set of resources that were accessed during the user's last visit.
   2- Events tab: contains a list of the user's events that are associated with today's date.
   3- Favorite Files tab: contains a list of ranked files that are favorably accessed by the user during that time of the day.
   4- Favorite Applications tab: contains a list of SEWOS applications that are accessed by the user during this time of the day.
4- User calendar and analog clock: these two tools are added to the user's work space and can be hidden or shown based on user preferences.

The next section gives a detailed description of SEWOS's embedded applications, interface description, etc.
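The time-of-day ranking behind the Favorite Files and Favorite Applications tabs can be approximated by a per-slot frequency count. This is a hypothetical sketch: the function name and the 6-hour slotting are assumptions, and SEWOS's actual semantic ranking is not specified in this excerpt.

```python
from collections import Counter

def favorite_files(access_log, hour, top=5):
    """access_log: list of (filename, hour_of_access) pairs.
    Returns the files most often opened in the same 6-hour slot
    (night / morning / afternoon / evening) as the given hour."""
    slot = hour // 6
    counts = Counter(name for name, h in access_log if h // 6 == slot)
    return [name for name, _ in counts.most_common(top)]
```

For example, a lookup at 10:00 would only consider accesses made between 06:00 and 11:59, so a report opened every morning outranks one opened nightly.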
4- Function Buttons: the file manager has capabilities to create a new folder, upload/download, and delete a resource.

As previously stated, this manager's functionality is incomplete unless there exists a way for the user to restore his deleted files. This calls for a personalized recycle bin, which we consider a main part of the SEWOS file system. This is described in the next section.

3.1.1 SEWOS RECYCLE BIN

The SEWOS recycle bin completes the functionality of the underlying file system by acting as intermediate storage space for the user's resources before they are permanently deleted from the system. The recycle bin includes options either to restore a deleted resource or to delete it permanently from the system.

3.2 SEWOS TEXT EDITOR

The SEWOS text editor enables creating, viewing, editing, formatting, annotating, printing and saving text files. The application interface, shown in Fig. 4, contains:
1) Clipboard section: includes buttons that provide the basic copy, cut and paste functions.
2) Font section: includes buttons that provide the main formatting options, such as changing the font, font size, color and alignment of the selected text.
3) Insert section: includes the basic options to insert pictures, tables and hyperlinks within text.
4) File operations section: includes buttons that enable:
   - Creating a new document.
   - Opening an existing document with extensions (.txt, .sav and .docx).
   - Saving the user's documents to the user's space with the extension (.sav).
   - Print preview of the user's document.
   - Printing the user's document on the user's printer.
   - Displaying the XML code behind document authoring.

3.3 SEWOS WEB BROWSER AND SEARCH APPLICATION

Navigating the web is one of the main activities of almost every computer user; that is why SEWOS includes this application. The application's interface closely resembles the basic interface of a web browser, with an address bar to enter the required URL and a Go button to navigate directly to it. The application also includes an interface to our developed personalized semantic search engine (PSSE) via a search button. The web browser and search application are both shown in Fig. 5 (a, b).
3.6 SEWOS GAMING

Gaming and entertainment gain importance to the user during his breaks and leisure time. SEWOS has a built-in gaming application for the sake of the user's entertainment. This application is shown in Fig. 8.
4.1 QUESTIONNAIRE

(Figure: per-application usability ratings; the category axis labels were garbled in extraction and the chart data has been removed.)

Twenty-five experienced users responded to the questionnaire assessing the overall usability of the system. The questionnaire consists of forty-four usability questions to which the respondent was to
Abstract
Information hiding is a technique of hiding secret data using redundant cover data such as images, audio, movies, documents, etc. In this paper, a new technique of hiding secret data using LSB insertion is proposed, using the RGB channels of the cover image for hiding segmented data. One of the three channels becomes the index to the other two channels. Firstly, the secret data are segmented into an Even segment and an Odd segment. Then, four bits of each segment are hidden separately inside the two channels, depending on the number of "1"s inside the index channel. If the number is even, four bits of the Even segment will be hidden; if it is odd, four bits of the Odd segment will be hidden. The opposite process retrieves the secret data from the image by reading the bits of the index channel and checking the number of "1"s to extract the Even segment and Odd segment, and finally recombining the two segments to recover the secret data. Experimental results show that the proposed method can provide high data security with acceptable stego-images.
Keywords: Steganography, Data hiding, Data segmenting, Index channel

1. Introduction

The information security requirement became more important, especially after the spread of Internet applications [1]. Owners of sensitive documents and files must protect themselves from unwanted spying, copying, theft and false representation. This problem has been addressed using a technique named with the Greek word steganography, which means hiding information [2]. Steganography is the art and science of hiding information. The data-hiding system design challenge is to develop a scheme that can embed as many message bits as possible while preserving three properties: imperceptibility, robustness, and security [4]. In addition, proposing an effective method for image hiding has been an important topic in recent years [5],[6]. There have been many techniques for hiding information or messages in images in such a way that the alterations made to the image are perceptually indiscernible. Common approaches include [7]:
(i) Least significant bit (LSB) insertion
(ii) Masking and filtering
(iii) Transform techniques

Information hiding is an emerging research area which encompasses applications such as copyright protection for digital media, watermarking, fingerprinting, and steganography [8]. All these applications of information hiding are quite diverse [8] and many encoding methods have been proposed. A reversible image hiding scheme based on histogram shifting for medical images was proposed in [5]. An image-in-image hiding scheme based on dirty-paper coding, robust to JPEG and additive white Gaussian noise (AWGN) attacks, was proposed in [9]. Chen et al. [10] used a vector quantization method, but the method required a set of look-up tables; moreover, the decoded images were slightly distorted from the original images. Wang et al. [11] proposed a least significant bit technique to hide information. The technique could improve the visual quality of cover images, but the reconstruction process involved very complicated calculations. Chang et al. [12] proposed two kinds of hiding techniques that secured better visual quality; however, the information capacity of these hiding techniques was low. Yang and Lin [13] used a basal-bit orientation method to hide images; the method had large hiding capacity and good visual quality of the secret image. In this paper we propose a new method of segmenting and hiding secret data in a BMP color image by segmenting the data into two segments, i.e. an Even segment and an Odd segment. Those two segments of characters are then hidden separately and randomly inside the cover image. By using random pixels to insert the secret data, and modifying those data, the method avoids detection by comparison of the modified image with the original image [3]. Two channels were used for hiding data in the 24-bit BMP image, and the third channel was used as the index channel for the hidden data.
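The parity-indexed hiding idea can be sketched in Python. This is a simplified, hypothetical reading of the scheme: it assumes the Even/Odd segments are the bytes at even/odd positions of the secret, takes R as the index channel, hides two bits in each of G and B per pixel, and omits the paper's randomized pixel selection.

```python
def bits_of(data):
    """MSB-first bit list of a byte string."""
    return [(byte >> i) & 1 for byte in data for i in range(7, -1, -1)]

def bytes_of(bits):
    """Inverse of bits_of for whole bytes."""
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )

def embed(pixels, secret):
    """pixels: list of (R, G, B) tuples; R acts as the index channel."""
    even_bits = bits_of(secret[0::2])   # Even segment: bytes at even positions
    odd_bits = bits_of(secret[1::2])    # Odd segment: bytes at odd positions
    ei = oi = 0
    stego = []
    for r, g, b in pixels:
        # Parity of the "1" bits in the index channel picks the segment.
        if bin(r).count("1") % 2 == 0:
            chunk, ei = even_bits[ei:ei + 4], ei + 4
        else:
            chunk, oi = odd_bits[oi:oi + 4], oi + 4
        chunk = (chunk + [0, 0, 0, 0])[:4]          # zero-pad past segment end
        g = (g & 0b11111100) | (chunk[0] << 1) | chunk[1]
        b = (b & 0b11111100) | (chunk[2] << 1) | chunk[3]
        stego.append((r, g, b))
    return stego

def extract(stego, secret_len):
    """Mirror of embed: re-read the two low bits of G and B per pixel."""
    even_bits, odd_bits = [], []
    for r, g, b in stego:
        chunk = [(g >> 1) & 1, g & 1, (b >> 1) & 1, b & 1]
        (even_bits if bin(r).count("1") % 2 == 0 else odd_bits).extend(chunk)
    even, odd = bytes_of(even_bits), bytes_of(odd_bits)
    # Interleave the two segments back into the original byte order.
    return bytes(even[i // 2] if i % 2 == 0 else odd[i // 2]
                 for i in range(secret_len))
```

Note the index channel is never modified, so the extractor can recompute the same parity decisions that the embedder made.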
Fig. 3b: Stego image (Mosul City) with text size 1420
characters
Fig. 2 Recovery and Recombine process flowchart
Fig. 4b: Stego image (Bird) length 760x570 with text length
2300
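The MSE and PSNR to which the symbol list below refers are presumably the standard definitions; the equations themselves were lost in extraction, so this reconstruction (consistent with the symbols M, N, f, g and L defined below) is offered for reference:

```latex
\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\bigl(f_{ij}-g_{ij}\bigr)^{2},
\qquad
\mathrm{PSNR} = 10\,\log_{10}\!\frac{L^{2}}{\mathrm{MSE}}\ \mathrm{dB}
```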
where
M, N are the row and column numbers of the cover image,
fij is the pixel value from the cover image,
gij is the pixel value from the stego-image, and
L is the peak signal level of the cover image (for 8-bit gray-scale images, L = 255).

Table 2 shows the values of PSNR and MSE for different sizes of images. Referring to Table 2, the column labeled SHDRIC is our proposed method.

Table 2: (PSNR and MSE) of four sample images

From the table, it was noted that an increase in the text caused an increase in the MSE and a decrease in the PSNR. However, the PSNR values are improved over those of simple LSB, so it becomes difficult to discover the hidden text within the image.

5. Conclusion

The suitability of steganography as a tool to conceal highly sensitive data has been discussed using a new method of randomizing the secret data. The method is based on two levels of security, where the data are segmented into even and odd segments before the two segments are hidden separately and randomly inside the image. This suggests that an image containing encrypted data can be transmitted anywhere across the world in a completely secure form. The method can also be used in other applications, such as image watermarking. It can be concluded that randomizing and hiding the secret data provide a double layer of protection.

References
[1] J. Anitha and S. Immanuel Alex Pandian, "A Color Image Digital Watermarking Scheme Based on SOFM", International Journal of Computer Science Issues, Vol. 7, Issue 5, September 2010, pp. 302-309.
[2] Mayank Srivastava, Mohd. Qasim Rafiq and Rajesh Kumar Tiwari, "A Robust and Secure Methodology for Network Communications", International Journal of Computer Science Issues, Vol. 7, Issue 5, September 2010, pp. 135-141.
[3] Geeta S. Navale, Swati S. Joshi and Aaradhana A. Deshmukh, "M-Banking Security - a futuristic improved security approach", International Journal of Computer Science Issues, Vol. 7, Issue 1, January 2010, pp. 68-71.
[4] I. Cox, M. Miller, and J. Bloom, Digital Watermarking, Academic Press, 2002.
[5] P. Tsai et al., "Reversible image hiding scheme using predictive coding and histogram shifting", Signal Processing, vol. 89, pp. 1129-1143, 2009.
[6] H. Sajedi and M. Jamzad, "Cover Selection Steganography Method Based on Similarity of Image Blocks", in Proc. of the 8th IEEE Int. Conference on Computer and Information Technology, 2008.
[7] N. F. Johnson and S. Jajodia, "Exploring Steganography: Seeing the Unseen", IEEE Computer, pp. 26-34, 1998.
[8] R. A. Isbell, "Steganography: Hidden Menace or Hidden Savior", Steganography White Paper, 10 May 2002.
[9] K. Solanki, N. Jacobsen, U. Madhow, B. S. Manjunath, and S. Chandrasekaran, "Robust image-adaptive data hiding using erasure and error correction", IEEE Trans. Image Processing, vol. 13, pp. 1627-1639, Dec. 2004.
[10] T. S. Chen, C. C. Chang, and M. S. Hwang, "A virtual image cryptosystem based on vector quantization", IEEE Trans. on Image Processing, 7 (1998) 1485.
[11] R. Z. Wang, C. F. Lin, and J. C. Lin, "Image hiding by LSB substitution and genetic algorithm", Pattern Recognition, 34 (2001) 671.
[12] C. C. Chang, J. C. Chung, and Y. P. Lau, "Hiding data in multitone images for data communications", IEE Proc. Vision, Image and Signal Processing, 151 (2004) 137.
[13] C. Y. Yang and J. C. Lin, "Image hiding by base-oriented algorithm", Optical Engineering, 45 (2006), Paper No. 117001.
[14] Peter Wayner, Disappearing Cryptography - Information Hiding: Steganography & Watermarking, Second Edition, San Francisco, California, U.S.A.: Elsevier Science, 2002, ISBN 1558607692.
[15] Neil F. Johnson and Sushil Jajodia, "Exploring Steganography: Seeing the Unseen", IEEE Computer Practices, 1998.
[16] Ross Anderson, Roger Needham, Adi Shamir, "The Steganographic File System", 2nd Information Hiding Workshop, 1998.
Emad T. Khalaf
Graduated in Computer Information Systems and Informatics Engineering and worked as a technician in an Internet services company for more than nine years. He has experience as a trainer for various computer courses. His research interests include network technology and security. He is currently pursuing an MSc degree in the area of computer network security.
Norrozila Sulaiman
Graduated from Sheffield Hallam University with a BSc (Hons) in Computer Studies in 1994. She worked with the Employment Service in the UK as a network support assistant and was involved in research on Novell NetWare. After graduating, she worked as a research officer at the Artificial Intelligence System and Development Laboratory and was involved in joint collaboration projects between the governments of Malaysia and Japan for about 5 years. She completed her MSc degree in Information Technology, with research on the Wireless Application Protocol (WAP). She obtained her PhD degree in mobile communication and networks from Newcastle University in the UK. Currently, she is a senior lecturer at the Faculty of Computer Systems and Software Engineering, University Malaysia Pahang. Her main research interests include heterogeneous networks, mobile communication networks and information security.
2. MAEER's MIT College of Engineering, Pune-38, Maharashtra, India.
Abstract
Data-acquisition involves gathering signals from measurement sources and digitizing them for storage, analysis and presentation on a PC. Analysis and prediction are very necessary in today's market for the accurate utilization of the funds at hand. For analysis, there has to be a proper system wherein the required data is first acquired from the source. This data then needs to be analysed using an analysis model. Currently there are many analysis models available in the market. These models are based on the past behaviour of the stocks. However, there is no model which predicts the future behaviour of the stocks. For this reason, a model is developed which not only analyses the stocks but also predicts their future behaviour based on past conduct.
Keywords: Data-acquisition, share-market analysis, share-market predictions.

1. INTRODUCTION

Data-acquisition systems are in great demand in industry and consumer applications. Data-acquisition systems are defined as any instrument or computer that acquires data from sensors via amplifiers, multiplexers, and any necessary analog-to-digital converters, or from the internet. The system then returns the data to a central location for further processing. An acquisition unit is designed to collect data in its simplest form from the internet.

Nowadays, data-acquisition systems are used more and more, as they provide precise accuracy. They also remove the overhead of constant monitoring: a single person can monitor the entire system and interact with it if required. These systems enable the user to analyse the acquired data and produce the required predictions. Data acquisition would be a very tedious or even virtually impossible task if these systems were not in place. They have allowed us to make data sharing, data analysis and data collection more accurate, reliable and fool-proof.

Share-market analysis is an important part of market analysis and indicates how well a firm is doing in the market place compared to its competitors. Analysis helps the share broker to carefully study the behaviour of the stocks and utilize his funds in a more judicious way. Analysis of stocks takes into consideration the past behaviour of the particular stock, and the analysis is shown to the user in the form of graphs. These graphs can be represented in a number of ways depending on the preferences of the users.

In this paper, a share-market analysis and prediction model is proposed. The model is established on a reliable data-acquisition system which acquires data from the internet. This data is then analysed using the analysis module, after which the prediction module performs its calculations; the resulting predictions are recorded in table format and are reflected on the graphs.

2. RELATED WORK

There are data-acquisition and control devices that can substitute for a supervisor in a multi-site job operation: a single person can monitor and even interact with the ongoing work from a single base station. An acquisition unit designed to collect data in its simplest form is detailed in [1]. Data collection via a wireless internet-based measurement architecture for air quality monitoring is discussed in [2]. Some applications adding remote accessibility are detailed in [3] and [4], which are built to collect and send data through a modem to a server. Some applications have integrated systems for data acquisition; one such system is used in [5].

There are a number of analysis models available. These models provide analysis as desired by the user. One such model is discussed in [6]. This is stock market software which supports multiple countries' stock markets (11 countries at this moment). It provides real-time stock info, a stock indicator editor, a stock indicator scanner, portfolio management and market chit-chat
features. One more such model is shown in [7]. It provides a free web-based stock price analysis module. The easy-to-use interface incorporates fundamental analysis to calculate: fair value stock price; comparative stock value; profit target sell price; stop-loss sell price; price-earnings ratio (PE) for fair value and buy prices; stock return on investment %; and provides access to technical analysis charts to evaluate stock movements and buy/sell signals.
3. PROPOSED SOFTWARE

In the proposed software, real-time data from the share market is taken from the internet. This data is then processed and analysed. After analysis, predictions for each stock are calculated using formulas. Thus the proposed software is divided into three main modules, viz. (3.1) Data-acquisition, (3.2) Data-analysis and (3.3) Prediction Model.

3.1 Data-acquisition
The data for the stocks in the market is acquired from the internet. This data comes in DBF file form. A snapshot of this file is shown in Fig. 1.

Fig. 2. Sample file showing the highest, lowest and close prices of stocks for a particular day

3.2 Data-analysis
This module starts its work once the data-acquisition process has finished. Data-analysis reports can be made and shown to the users in a number of ways. In the proposed software, reports for the weekly, monthly and yearly highest, lowest and average prices are shown to the user in Excel sheets. The user can also directly see the graphs of all these values. A sample report file is shown in Fig. 3. A sample graph is also shown in Fig. 4.
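The weekly highest/lowest/average aggregation performed by the data-analysis module can be sketched as a simple grouping by ISO week. This is a hypothetical Python illustration (the function name and grouping granularity are assumptions; the described software writes its reports to Excel sheets):

```python
from collections import defaultdict
from datetime import date

def weekly_report(records):
    """records: iterable of (day, close_price) pairs.
    Groups prices by ISO (year, week) and returns, per week,
    the (highest, lowest, average) closing price."""
    buckets = defaultdict(list)
    for day, close in records:
        year, week, _ = day.isocalendar()
        buckets[(year, week)].append(close)
    return {
        key: (max(prices), min(prices), sum(prices) / len(prices))
        for key, prices in buckets.items()
    }
```

Monthly and yearly reports follow the same pattern with a coarser grouping key, e.g. `(day.year, day.month)` or `day.year`.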
4. EXPERIMENTS & RESULTS
5. CONCLUSION

As seen in Fig. 13 and Fig. 14, those graphs show the analysis of a particular stock, but they only take into consideration the past behaviour of the stocks; they show no predictions. The graph shown in Fig. 15, in contrast, shows the past behaviour as well as a prediction of the future of the stock.

Fig. 15. Prediction graph - 1

REFERENCES
[1] K. Jacker and J. Mckinney, "TkDAS - A data acquisition system using RTLinux, COMEDI, and Tcl/Tk", in Proc. Third Real-Time Linux Workshop, 2001. [Online]. Available from The Real Time Linux Foundation: http://www.realtimelinuxfoundation.org/events/rtlws-2001/papers.html
[2] A. Sang, H. Lin, and C. E. Y. Z. Goua, "Wireless Internet-based measurement architecture for air quality monitoring", in Proc. 21st IEEE IMTC, May 18-20, 2004, vol. 3, pp. 1901-1906.
[3] W. Kattanek, A. Schreiber, and M. Götze, "A flexible and cost-effective open system platform for smart wireless communication devices", in Proc. ISCE, 2002.
[4] J. E. Marca, C. R. Rindt, and M. G. Mcnally, "The tracer data collection system: implementation and operational experience", Inst. Transp. Studies, Univ. California, Irvine, CA, UCI-ITS-AS-WP-02-2, 2002.
[5] M. A. Al-Taee, O. B. Khader, and N. A. Al-Saber, "Remote monitoring of vehicle diagnostics and location using a smart box with Global Positioning System and General Packet Radio Service", in Proc. IEEE/ACS AICCSA, May 13-16, 2007, pp. 385-388.
[6] JStock - Stock Market Software 1.02, http://www.topshareware.com/download.aspx?id=67171&p=&url=https%3a%2f%2ffanyv88.com%3a443%2fhttp%2fdownloads.sourceforge.net%2fjstock%2fjstock-1.0.2-setup.exe%3fbig_mirror%3d0
[7] Stock Price Analysis 1, http://www.topshareware.com/download.aspx?id=77845&p=&url=https%3a%2f%2ffanyv88.com%3a443%2fhttp%2fwww.stockpriceanalysis.com%2fspa.exe
2. Department of Electronics and Communication Engg., Institute of Engineering & Management College, Salt Lake, Kolkata-700091.
3. Department of Computer Science and Engg., University of Kalyani, Nadia, West Bengal, Pin-741235.
Abstract
Due to the rapid growth of IEEE 802.11 based Wireless Local Area Networks (WLANs), handoff has become a burning issue. A mobile station (MS) requires handoff when it travels out of the coverage area of its current access point (AP) and tries to associate with another AP. But handoff delays present a serious barrier to such services being made available to mobile platforms. Throughout the last few years there has been plenty of research aimed at reducing the handoff delay incurred at the various levels of wireless communication. In this paper we propose a method using GPS (Global Positioning System) to determine the positions of the MS at different instants of time and then, by fitting a trend equation to the motion of the MS, to determine the potential AP(s) towards which the MS has the maximum probability of travelling in the future. This will result in a reduction of the number of APs to be scanned, and handoff latency will be reduced to a great extent.
Keywords: IEEE 802.11, Handoff latency, GPS (Global Positioning System), Regression, Neighbor APs.

1. Introduction

IEEE 802.11 based wireless local area networks (WLANs) are widely used for domestic and official purposes due to the flexibility of wireless access. However, WLANs are restricted in their diameters to a campus, a building or even a single room. Due to the limited coverage areas of different APs, an MS has to experience handoff from one AP to another frequently.

1.1 Handoff

When an MS moves out of reach of its current AP it must be reconnected to a new AP to continue its operation. The search for a new AP and subsequent registration under it constitute the handoff process, which takes enough time (called handoff latency) to interfere with the proper functioning of many applications.

Figure 1. Handoff process

For successful implementation of seamless Voice over IP communications, the handoff latency should not exceed 50 ms. It has been observed that in practical situations handoff takes approximately 200-300 ms, to which scanning delay contributes almost 90%. This is not acceptable, and thus the handoff latency should be minimized.

Three strategies have been proposed to detect the need for handoff [1]:
1) Mobile-controlled handoff (MCHO): The mobile station (MS) continuously monitors the signals of the surrounding base stations (BSs) and initiates the handoff process when some handoff criteria are met.
2) Network-controlled handoff (NCHO): The surrounding BSs measure the signal from the MS and the network initiates the handoff process when some handoff criteria are met.
3) Mobile-assisted handoff (MAHO): The network asks the MS to measure the signal from the surrounding BSs. The network makes the handoff decision based on reports from the MS.

Handoff can be of many types:
Hard Handoff: In this process the radio link with the old AP is broken before the connection with the new AP is established. This in turn results
other fields. A GPS receiver is able to calculate its position by precisely timing the signals sent by the GPS satellites. The receiver uses the received messages from the satellites to determine the transit time of each message and calculates the distance to each satellite. These distances are then utilized to compute the position of the receiver. For normal operation a minimum of four satellites is necessary. Using the messages received from the satellites, the receiver is able to calculate the times sent and the satellite positions corresponding to these points.
Each MS is equipped with a GPS receiver which is used to determine the positions of the MS at different instants of time. This provides knowledge of the MS's movement with 1 to 2 meter precision.

IAPP. Moreover, these processes involve channel scanning of all neighboring APs and do not consider the position or velocity of the MS to select potential APs. Hence these methods are more power consuming and less effective at reducing handoff.

3. Proposed Works

Here we propose a method depending upon statistical regression to minimize the handoff delay. We will select the potential APs towards which the MS has the maximum probability of travelling when it moves out of the coverage area of its present AP. Thus we will minimize handoff delay by scanning only the potential APs for available channels. We implement our method with the help of GPS. We present the method in the following four sections:
a0 = [Σyi · Σti² − Σti · Σtiyi] / [n · Σti² − (Σti)²]    ...(3)
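Equation (3) is the usual normal-equation solution for the intercept a0 of a least-squares line y = a0 + a1·t fitted to the sampled positions. A minimal sketch in Java (class and method names are ours, not from the paper):

```java
// Least-squares fit of a linear trend y = a0 + a1*t to position samples,
// using the same normal-equation form as Eq. (3).
public class TrendFit {

    // Returns {a0, a1} for the fitted line y = a0 + a1*t.
    public static double[] fit(double[] t, double[] y) {
        int n = t.length;
        double st = 0, sy = 0, stt = 0, sty = 0;
        for (int i = 0; i < n; i++) {
            st  += t[i];         // sum of t_i
            sy  += y[i];         // sum of y_i
            stt += t[i] * t[i];  // sum of t_i^2
            sty += t[i] * y[i];  // sum of t_i * y_i
        }
        double denom = n * stt - st * st;          // n*sum(t^2) - (sum t)^2
        double a0 = (sy * stt - st * sty) / denom; // intercept, Eq. (3)
        double a1 = (n * sty - st * sy) / denom;   // slope
        return new double[] { a0, a1 };
    }
}
```

For samples lying exactly on y = 2 + 3t the fit returns a0 = 2 and a1 = 3.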
Figure 8
Figure 9
Figure 10

Hence the parameters computed were x′ = 71.089 m, y′ = 56.943 m and θ = 38.695°. The deviations of the predicted positions of the MS from the actual positions obtained by GPS were plotted for both axes.

Figure 11
Figure 12

Hence, Δx′ = 6.086 m and Δy′ = 11.579 m.
Thus, θmin = tan⁻¹[(y′ − Δy′)/(x′ + Δx′)] = 30.447° and θmax = tan⁻¹[(y′ + Δy′)/(x′ − Δx′)] = 46.510°.
This indicates that the MS is moving towards AP2, as the expected range of angle lies between 0 and 60 degrees.

We made 100 such sample runs by varying parameters like the mobility range and velocity of the MS, the cell coverage area, etc. In 89% of the cases one potential AP was selected, while in 9% of cases two potential APs were selected. The remaining 2% constituted cases where the potential APs selected by the proposed algorithm resulted in association failure, leading to a full scanning of the channels of the other APs (approx. 30-40 ms). Taking the round trip time (rtt) as 3 ms, the average handoff latency measured was 6.563 ms, which is a drastic improvement in comparison to earlier proposed methods. The graph of this simulation is plotted in Fig. 12, which shows the handoff delay time in ms on the Y-axis for each experiment, shown on the X-axis.
The success of our simulation clearly depicts the applicability of our proposed algorithm.
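The angular window toward the predicted position follows directly from the computed parameters; a small numerical check in Java (helper names are ours) reproduces the values reported above:

```java
// Expected angular window toward the predicted position (x', y'),
// widened by the axis deviations (dx, dy):
//   theta_min = atan((y' - dy) / (x' + dx))
//   theta_max = atan((y' + dy) / (x' - dx))
public class AngleWindow {

    public static double thetaMinDeg(double x, double y, double dx, double dy) {
        return Math.toDegrees(Math.atan((y - dy) / (x + dx)));
    }

    public static double thetaMaxDeg(double x, double y, double dx, double dy) {
        return Math.toDegrees(Math.atan((y + dy) / (x - dx)));
    }
}
```

With x′ = 71.089 m, y′ = 56.943 m, Δx′ = 6.086 m and Δy′ = 11.579 m this gives approximately 30.45° and 46.51°, matching the simulation.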
5. Conclusion

Our proposed method aims at reducing handoff time by reducing the number of APs to be scanned, which is accomplished by fitting a trend equation to the motion of the MS. This in turn reduces the number of channels to be scanned, which greatly reduces the handoff delay, as is clear from the simulation presented in the above section. In the proposed algorithm a linear trend equation has been fitted because it is the most common and trustworthy fit. However, higher order polynomials may also be used for fitting, and the best fit may be chosen by comparing the norms of the residuals.
However, the proposed algorithm may prove erroneous if the motion of the MS is too random to be used for prediction purposes. Future work in this field may include research on more refined algorithms for curve fitting and prediction. The error estimation method may also be improved. It is worth mentioning here that although the proposed work has been presented considering honeycomb structures, our algorithm would work in a similar manner for other cell structures and neighbor AP locations. Minor changes would be introduced depending on the network topology.

References
[1] Yi-Bing Lin and Imrich Chlamtac, Wireless and Mobile Network Architectures, pp. 17.
[2] I. F. Akyildiz, J. Xie, and S. Mohanty, "A survey on mobility management in next generation all-IP based wireless systems," IEEE Wireless Communications, vol. 11, no. 4, pp. 16-28, 2004.
[3] M. Stemm and R. H. Katz, "Vertical handoffs in wireless overlay networks," ACM/Springer Journal of Mobile Networks and Applications (MONET), vol. 3, no. 4, pp. 335-350, 1998.
[4] J. Pesola and S. Pokanen, "Location-aided Handover in Heterogeneous Wireless Networks," in Proceedings of Mobile Location Workshop, May 2003.
[5] M. Shin, A. Mishra, and W. Arbaugh, "Improving the Latency of 802.11 Hand-offs using Neighbor Graphs," in Proc. ACM MobiSys 2004, pp. 70-83, June 2004.
[6] S. Shin, A. Forte, A. Rawat, and H. Schulzrinne, "Reducing MAC Layer Handoff Latency in IEEE 802.11 Wireless LANs," in Proc. ACM MobiWac 2004, pp. 19-26, October 2004.
[7] H.-S. Kim et al., "Selective channel scanning for fast handoff in wireless LAN using neighbor graph," International Technical Conference on Circuits/Systems, Computers and Communications, LNCS, Springer, Vol. 3260, pp. 194-203, 2004.
[8] S. Park and Y. Choi, "Fast inter-AP handoff using predictive authentication scheme in a public wireless LAN," Networks 2002 (Joint ICN 2002 and ICWLHN 2002), August 2002.
[9] Chien-Chao Tseng, K.-H. Chi, M.-D. Hsieh and H.-H. Chang, "Location-based Fast Handoff for 802.11 Networks," IEEE Communication Letters, Vol. 9, No. 4, pp. 304-306, April 2005.
[10] Yogesh Ashok Powar and Varsha Apte, "Improving the IEEE 802.11 MAC Layer Handoff Latency to Support Multimedia Traffic," in Proc. IEEE Wireless Communications and Networking Conference (WCNC 2009), pp. 1-6, April 2009.

Debabrata Sarddar is currently pursuing his PhD at Jadavpur University. He completed his M.Tech in Computer Science & Engineering from DAVV, Indore in 2006, and his B.Tech in Computer Science & Engineering from Regional Engineering College, Durgapur in 2001. His research interests include wireless and mobile systems.

Shubhajeet Chatterjee is presently pursuing a B.Tech degree in Electronics and Communication Engg. at Institute of Engg. & Management College, under West Bengal University of Technology. His research interests include wireless sensor networks and wireless communication systems.

Ramesh Jana is presently pursuing an M.Tech (2nd year) in Electronics and Telecommunication Engg. at Jadavpur University. His research interests include wireless sensor networks, fuzzy logic and wireless communication systems.

Hari Narayan Khan is presently pursuing an M.Tech (final year) in Computer Technology at Jadavpur University. He completed his B.Tech in Electronics & Communication Engineering in 2006 from Institute of Technology & Marine Engineering under West Bengal University of Technology. His research interests include wireless and mobile systems.

Shaik Sahil Babu is pursuing a Ph.D in the Department of Electronics and Telecommunication Engineering under the supervision of Prof. M. K. Naskar at Jadavpur University, Kolkata. He did his Bachelor of Engineering in Electronics and Telecommunication Engineering from Muffakham Jah College of Engineering and Technology, Osmania University, Hyderabad, and his Master of Engineering in Computer Science and Engineering from Thapar Institute of Engineering and Technology, Patiala, in collaboration with the National Institute of Technical Teachers Training and Research, Chandigarh.

Utpal Biswas received his B.E., M.E. and PhD degrees in Computer Science and Engineering from Jadavpur University, India in 1993, 2001 and 2008 respectively. He served as a faculty member at NIT, Durgapur, India in the Department of Computer Science and Engineering from 1994 to 2001. Currently, he is working as an associate professor in the Department of Computer Science and Engineering, University of Kalyani, West Bengal, India. He is a co-author of about 35
3. Department of Information Technology, Jadavpur University, Kolkata, West Bengal, India
Abstract
Visual Cryptography is a special type of encryption technique that obscures image-based secret information which can be decrypted by the Human Visual System (HVS). This cryptographic system encrypts the secret image by dividing it into n number of shares, and decryption is done by superimposing a certain number (k) of shares or more. Simple visual cryptography is insecure because the decryption is done by the human visual system: the secret information can be retrieved by anyone who gets at least k number of shares. Watermarking is a technique to put a signature of the owner within the creation.
In this current work we have proposed a Visual Cryptographic Scheme for color images where the divided shares are enveloped in other images using invisible digital watermarking. The shares are generated using Random Numbers.
Keywords: Visual Cryptography, Digital Watermarking, Random Number.

1. Introduction

Visual cryptography is a cryptographic technique where visual information (image, text, etc.) gets encrypted in such a way that the decryption can be performed by the human visual system without the aid of computers [1].
Like other multimedia components, an image is sensed by humans. A pixel is the smallest unit constructing a digital image. Each pixel of a 32 bit digital color image is divided into four parts, namely Alpha, Red, Green and Blue, each with 8 bits. The Alpha part represents the degree of transparency. A 32 bit sample pixel is represented in the following figure [2][3].

11100111 11011001 11111101 00111110
Alpha    Red      Green    Blue

Fig 1: Structure of a 32 bit pixel

The human visual system acts as an OR function. Two transparent objects stacked together produce a transparent object, but if any one of them is changed to non-transparent, the final object will be seen as non-transparent. In a k-n secret sharing visual cryptography scheme an image is divided into n number of shares such that a minimum of k number of shares is sufficient to reconstruct the image. The division is done by a Random Number generator [4].
This type of visual cryptography technique is insecure as the reconstruction is done by a simple OR operation. To add more security to this scheme we have proposed a technique called digital enveloping. This is nothing but an extended invisible digital watermarking technique. Using this technique, the divided shares produced by k-n secret sharing visual cryptography are embedded into envelope images by LSB replacement [5]. The color change of the envelope images is not sensed by the human eye [6]. (More than 16.7 million, i.e. 2^24, different colors are produced by the RGB color model, but the human eye can discriminate only a few of them.) This technique is known as invisible digital watermarking, as the human eye cannot identify the change between the envelope image and the enveloped image (produced after LSB replacement) [7].
In the decryption process, k number of embedded envelope images are taken and the LSBs are retrieved from each of them, followed by an OR operation to generate the original image.
In this paper, Section 2 describes the overall process of operation, Section 3 describes the k-n secret sharing Visual Cryptography scheme on the image, Section 4 describes the enveloping process using invisible digital watermarking, Section 5 describes the decryption process, Section 6 describes the experimental results, and Section 7 draws the conclusion.
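The layout of Fig 1 maps directly onto the bit positions of a packed 32-bit integer; the packing direction appears in the paper's Appendix, and the unpacking helpers below are our own sketch of the inverse:

```java
// Splits a packed 32-bit ARGB pixel into its four 8-bit parts
// (bits 31-24 alpha, 23-16 red, 15-8 green, 7-0 blue).
public class PixelParts {
    public static int alpha(int p) { return (p >>> 24) & 0xff; }
    public static int red(int p)   { return (p >>> 16) & 0xff; }
    public static int green(int p) { return (p >>> 8) & 0xff; }
    public static int blue(int p)  { return p & 0xff; }
}
```

For the sample pixel of Fig 1 (0xE7D9FD3E) the parts are 11100111, 11011001, 11111101 and 00111110.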
2. Overall Process

Step I: The source image is divided into n number of shares using the k-n secret sharing visual cryptography scheme, such that k number of shares is sufficient to reconstruct the encrypted image.
Step II: Each of the n shares generated in Step I is embedded into one of n different envelope images using LSB replacement.
Step III: k number of the enveloped images generated in Step II are taken and, by retrieving the LSBs and applying an OR operation, the original image is produced.
The process is described by Figure 2.

3. k-n Secret Sharing Visual Cryptography Scheme

An image is taken as input. The number of shares the image is to be divided into (n) and the number of shares needed to reconstruct the image (k) are also taken as input from the user. The division is done by the following algorithm.

Step I: Take an image IMG as input and calculate its width (w) and height (h).
Step II: Take the number of shares (n) and the minimum number of shares (k) needed to reconstruct the image, where k must be less than or equal to n. Calculate RECONS = (n-k)+1.
Step III: Create a three dimensional array IMG_SHARE[n][w*h][32] to store the pixels of the n shares. The k-n secret sharing visual cryptographic division is done by the following process.

for i = 0 to (w*h-1)
{
    Scan each pixel value of IMG and convert it into a 32 bit binary string, let PIX_ST.
    for j = 0 to 31
    {
        if (PIX_ST.charAt(j) == '1')
        {
            call Random_Place(n, RECONS)
            for k = 0 to (RECONS-1)
            {
                Set IMG_SHARE[RAND[k]][i][j] = 1
            }
        }
    }
}
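The division rule can be sketched compactly for a single '1' bit: it is copied into RECONS = n-k+1 distinct shares chosen at random, so any k of the n shares must contain it, since k + (n-k+1) = n+1 > n. A hedged Java sketch (names are ours, not the paper's pseudocode):

```java
import java.util.Random;

// Distributes one '1' bit into RECONS = n-k+1 distinct shares out of n.
// OR-ing any k shares then always recovers the bit (pigeonhole argument).
public class BitShare {

    // share[s] == true means share s carries the '1' bit.
    public static boolean[] divide(int n, int k, Random rnd) {
        int recons = n - k + 1;
        boolean[] share = new boolean[n];
        int placed = 0;
        while (placed < recons) {
            int r = rnd.nextInt(n);  // candidate share index
            if (!share[r]) {         // keep the chosen indices distinct
                share[r] = true;
                placed++;
            }
        }
        return share;
    }

    // Reconstruction: OR of the shares whose indices are given.
    public static boolean reconstruct(boolean[] share, int[] idx) {
        boolean bit = false;
        for (int i : idx) bit = bit | share[i];
        return bit;
    }
}
```

For n = 4 and k = 2 the bit lands in 3 of the 4 shares, so every pair of shares recovers it.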
Fig 2: Secret sharing with digital enveloping — each share produced by k-n secret sharing visual cryptography is embedded into an envelope image by LSB replacement (Share 1 + Envelope 1 gives Enveloped Image 1, ..., Share n + Envelope n gives Enveloped Image n); in the decryption process, k of the enveloped images are combined to recover the original image.
Step IV: Create a one dimensional array IMG_CONS[n] to store the constructed pixels of each of the n shares by the following process.

for k1 = 0 to (n-1)
{
    for k2 = 0 to (w*h-1)
    {
        String value = ""
        for k3 = 0 to 31
        {
            value = value + IMG_SHARE[k1][k2][k3]
        }
        Construct the alpha, red, green and blue parts of each pixel by taking consecutive 8 bit substrings starting from 0. Construct a pixel from these parts and store it into IMG_CONS[k1] [4].
    }
    Generate an image from IMG_CONS[k1]¹ [8].
}

subroutine int Random_Place(n, RECONS)
{
    Create an array RAND[RECONS] to store the generated random numbers.
    for i = 0 to (RECONS-1)
    {
        Generate a random number within n, let rand_int [9].
        if (rand_int is not already in RAND)
            RAND[i] = rand_int
        else
            repeat the generation for this i
    }
    return RAND
}

4. Enveloping Using Invisible Digital Watermarking

In this step the divided shares of the original image are enveloped within other images. Least Significant Bit (LSB) replacement digital watermarking is used for this enveloping process. It has already been discussed that a 32 bit digital image pixel is divided into four parts, namely alpha, red, green and blue, each with 8 bits. Experiments show that if the last two bits of each of these parts are changed, the changed color effect is not sensed by the human eye [6]. This process is known as invisible digital watermarking [7]. For embedding the 32 bits of a pixel of a divided share, 4 pixels of the envelope image are necessary. This means that to envelope a share with resolution w x h, we need an envelope image with w x h x 4 pixels. Here we have taken each envelope of size 4w x h.
The following figure describes the replacement process. For replacing the 8 bit alpha part, one pixel of the envelope is needed. In the same way the red, green and blue parts are enveloped in three other pixels of the envelope image. The enveloping is done using the following algorithm.

Step I: Take the number of shares (n) as input. For share = 0 to n-1, follow Step II to Step IV.
Step II: Take the name of the share, let SHARE_NO (NO is from 0 to n-1), and the name of the envelope, let ENVELOPE_NO (NO is from 0 to n-1), as input. Let the width and height of each share be w and h. The width of the envelope must be 4 times that of SHARE_NO.
Step III: Create an array ORG of size w*h*32 to store the binary pixel values of SHARE_NO using the loop

for i = 0 to (w*h-1)
{
    Scan each pixel value of the image and convert it into a 32 bit binary string, let PIX.
    for j = 0 to 31
    {
        ORG[i*32+j] = PIX.charAt(j)
    }
}

Create an array ENV of size 4*w*h*32 to store the binary pixel values of ENVELOPE_NO using the previous loop, but with i running from 0 to (4*w*h-1).
Step IV: Take a marker M = -1. Using the following process, SHARE_NO is embedded within ENVELOPE_NO.

for i = 0 to (4*w*h-1)
{
    ENV[i*32+6]  = ORG[++M];
    ENV[i*32+7]  = ORG[++M];
    ENV[i*32+14] = ORG[++M];
    ENV[i*32+15] = ORG[++M];
    ENV[i*32+22] = ORG[++M];
    ENV[i*32+23] = ORG[++M];
    ENV[i*32+30] = ORG[++M];
    ENV[i*32+31] = ORG[++M];
}

Construct the alpha, red, green and blue parts of each pixel by taking consecutive 8 bit substrings starting from 0. Construct a pixel from these parts and store it into a one dimensional array, let IMG_CONS, of size 4*w*h [4].
Generate an image from IMG_CONS[]¹.

Fig 3: Enveloping Process
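Under the layout above, each envelope pixel absorbs one byte of share data: two bits go into the low end of each of its four 8-bit parts, so no part changes by more than 3 levels. A hedged Java sketch of this replacement (class and method names are ours):

```java
// Embeds one byte of share data into the two LSBs of the alpha, red,
// green and blue parts of a 32-bit envelope pixel, and extracts it back.
public class LsbEnvelope {

    public static int embed(int envPixel, int shareByte) {
        int out = envPixel;
        for (int part = 0; part < 4; part++) {
            int shift = 24 - 8 * part;                   // alpha, red, green, blue
            int twoBits = (shareByte >>> (6 - 2 * part)) & 0x3;
            out = (out & ~(0x3 << shift)) | (twoBits << shift);
        }
        return out;
    }

    public static int extract(int envPixel) {
        int b = 0;
        for (int part = 0; part < 4; part++) {
            int shift = 24 - 8 * part;
            b = (b << 2) | ((envPixel >>> shift) & 0x3); // reassemble the byte
        }
        return b;
    }
}
```

A round trip extract(embed(p, s)) returns s for any byte s, while each colour part of p moves by at most 3 levels, which is why the change is invisible to the eye.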
5. Decryption Process

In this step at least k number of enveloped images are taken as input. From each of these images, for each pixel, the last two bits of the alpha, red, green and blue parts are retrieved and an OR operation is performed to generate the original image. It has already been discussed that the human visual system acts as an OR function; for a computer generated process, the OR function can be used in place of stacking k of the n enveloped images.

The decryption process is performed by the following algorithm.
Step I: Input the number of enveloped images to be taken (k), and the height (h) and width (w) of each image.
Step II: Create a two dimensional array STORE[k][w*h*32] to store the pixel values of the k enveloped images. Create a one dimensional array FINAL[(w/4)*h*32] to store the final pixel values of the image, which is produced by performing a bitwise OR operation on the retrieved LSBs of each enveloped image:

    ...
    c5 = c5 | STORE[SH_NO][i*32+23];
    c6 = c6 | STORE[SH_NO][i*32+30];
    c7 = c7 | STORE[SH_NO][i*32+31];
}

FINAL[++M] = c0;
FINAL[++M] = c1;
FINAL[++M] = c2;
FINAL[++M] = c3;
FINAL[++M] = c4;
FINAL[++M] = c5;
FINAL[++M] = c6;
FINAL[++M] = c7;
}

Create a one dimensional array IMG_CONS[] of size (w/4)*h to store the constructed pixels. Construct the alpha, red, green and blue parts of each pixel by taking consecutive 8 bit substrings from FINAL[] starting from 0. Construct a pixel from these parts and store it into IMG_CONS[(w/4)*h]. Generate an image from IMG_CONS[].
Fig 5: Encrypted shares (2img.png, 3img.png)

Enveloping using watermarking: share 0img.png with Envelope0.png produces Final0.png; share 1img.png with Envelope1.png produces Final1.png; share 2img.png with Envelope2.png produces Final2.png;
share 3img.png with Envelope3.png produces Final3.png.

Decryption process:
Number of enveloped images taken: 3
Names of the images: Final0.png, Final2.png, Final3.png
(LSB retrieval with OR operation)

Fig 7: Decryption Process
7. Conclusion

The decryption part of visual cryptography is based on the OR operation, so if a person gets a sufficient number (k) of shares, the image can be easily decrypted. In this current work, an enveloping technique is proposed in which the secret shares produced by the well known k-n secret sharing visual cryptography scheme are enveloped within apparently innocent covers of digital pictures using LSB replacement digital watermarking. This adds security to the visual cryptography technique against illicit attack, as it befools the hacker's eye.

The division of an image into n number of shares is done by using a random number generator, which is a new technique not available till date. This technique needs much less mathematical calculation compared with other existing techniques of visual cryptography on color images [10][11][12][13]: it only checks for a '1' at each bit position and divides that '1' into (n-k+1) shares using random numbers. A comparison is made between the proposed scheme and some other schemes to prove the novelty of the scheme.

Table 1: Comparison of the proposed scheme with other processes

Other Processes | Proposed Scheme
1. The k-n secret sharing process is complex [10][11][12]. | 1. The k-n secret sharing process is simple, as random numbers are used.
2. The shares are sent through different communication channels, which is a security concern [10][11][12][13]. | 2. The shares are enveloped into apparently innocent covers of digital pictures and can be sent through the same or different communication channels. Invisible digital watermarking befools the hacker.

References:
[1] M. Naor and A. Shamir, "Visual cryptography," Advances in Cryptology - Eurocrypt '94, 1995, pp. 1-12.
[2] P. Ranjan, Principles of Multimedia, Tata McGraw Hill, 2006.
[3] John F. Koegel Buford, Multimedia Systems, Addison Wesley, 2000.
[4] Kandar Shyamalendu, Maiti Arnab, "K-N Secret Sharing Visual Cryptography Scheme For Color Image Using Random Number," International Journal of Engineering Science and Technology, Vol. 3, No. 3, 2011, pp. 1851-1857.
[5] Naskar P., Chaudhuri A., Chaudhuri Atal, "Image Secret Sharing using a Novel Secret Sharing Technique with Steganography," IEEE CASCOM, Jadavpur University, 2010, pp. 62-65.
[6] Hartung F., Kutter M., "Multimedia Watermarking Techniques," IEEE, 1999.
[7] S. Craver, N. Memon, B. L. Yeo, and M. M. Yeung, "Resolving Rightful Ownerships with Invisible Watermarking Techniques: Limitations, Attacks and Implications," IEEE Journal on Selected Areas in Communications, Vol. 16, No. 4, May 1998, pp. 573-586.
[8] Schildt H., The Complete Reference Java 2, Fifth Ed., TMH, pp. 799-839.
[9] Krishnamoorthy R., Prabhu S., Internet & Java Programming, New Age International, pp. 234.
[10] F. Liu, C. K. Wu, X. J. Lin, "Colour visual cryptography schemes," IET Information Security, July 2008.
[11] Kang InKoo et al., "Color Extended Visual Cryptography using Error Diffusion," IEEE, 2010.
[12] SaiChandana B., Anuradha S., "A New Visual Cryptography Scheme for Color Images," International Journal of Engineering Science and Technology, Vol. 2 (6), 2010.
[13] Li Bai, "A Reliable (k,n) Image Secret Sharing Scheme," IEEE, 2006.

Appendix:
¹ The Java language implementation is:

int c = 0;
int a = (Integer.parseInt(value.substring(0, 8), 2)) & 0xff;
int r = (Integer.parseInt(value.substring(8, 16), 2)) & 0xff;
int g = (Integer.parseInt(value.substring(16, 24), 2)) & 0xff;
int b = (Integer.parseInt(value.substring(24, 32), 2)) & 0xff;
img_cons[c++] = (a << 24) | (r << 16) | (g << 8) | b;
2. Professor of Computer Science & Engg., Geethanjali College of Engineering & Technology, Hyderabad, India
3. Professor of Computer Science & Engg., JNTUH, Hyderabad, India
4. Professor of Computer Science & Engg., University of Hyderabad, Hyderabad, India
Abstract
for a stable network. Also, breakdown of certain heavily. Multipath routing scheme has more
links results in routing decisions to be made again.
advantages than unipath routing on the aspect
2. Limited battery / energy factor: Mobile nodes of fault-tolerance, routing reliability and
are battery driven. Therefore, the energy resources
for such networks are limited. Also, the battery network load balance [3]. To improve the
power of a mobile node depletes not only due to quality of MANET routing, multipath routing
data transmission but also because of interference
from the neighboring nodes. Thus, a node looses its has attracted more and more research
energy at a specific rate even if it is not transferring attentions.
any data packet. Hence the lifetime of a network
largely depends on the energy levels of its nodes.
Higher the energy level, higher is the link stability
and hence, network lifetime. Also lower is the
routing cost.
First we find the Hamming distance by using the path matrix. In the Hamming distance matrix the number of paths is taken as the rows and columns, i.e.,

Here all the values are the same. From this, P1 and P2 are already selected, so the remaining paths are P3, P4, P5 and P6. By default we select the third path, P3.
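Representing each path as a binary vector over the links, the Hamming distance is simply the number of positions at which two paths differ; a minimal Java sketch (the paper's path matrix is not reproduced here, so the vectors below are illustrative):

```java
// Hamming distance between two equal-length binary path vectors:
// the count of positions at which they differ.
public class Hamming {
    public static int distance(int[] p, int[] q) {
        int d = 0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] != q[i]) d++;  // differing link entry
        }
        return d;
    }
}
```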
Here the remaining paths are P4 and P6. Both have the same value, i.e., 6. By default we select the fifth path as P4; the remaining path, P6, is the sixth path.
The sequence of the multiple paths is P2, P1, P3, P5, P4 and P6.

6. Algorithm

References:
[1] Osamah S. Badarneh and Michel Kadoch, "Multicast Routing Protocols in Mobile Ad Hoc Networks: A Comparative Survey and Taxonomy," Hindawi Publishing Corporation, EURASIP Journal on Wireless Communications and Networking, Volume 2009, Article ID 764047, 42 pages.
[2] Shuchita Upadhyaya and Charu Gandhi, "Node Disjoint Multipath Routing Considering Link and Node Stability Protocol: A Characteristic Evaluation," International Journal of Computer Science Issues, Vol. 7, Issue 1, No. 2, January 2010.
2. MGCGV, Chitrakoot, Satna (M.P.), India
Abstract

In a Wireless Local Area Network, data transfer from one node to another takes place through the air in the form of radio waves. There is no physical medium for transferring the data as in a traditional LAN. Because of this susceptible nature, a WLAN can open the door to intruders and attackers that can come from any direction. Security is the most important element in a WLAN. MAC address filtering is one of the methods for securing a WLAN, but it is also vulnerable. In this paper we demonstrate how hackers exploit a WLAN vulnerability (identity theft of a legitimate user) to access the Wireless Local Area Network.

Keywords: WLAN, MAC address, Access Point, WNIC, Wi-Fi

Introduction

Wi-Fi technology has played a very significant role in the IT revolution and continues to do so. After two decades it is very popular among the IT fraternity. Many companies, educational institutions and airports, as well as domestic users, make use of the WLAN facility. Security is an important factor of a Wireless Local Area Network because of its nature. D-Link and Linksys provide WLAN security with the help of the MAC address [1] and the WEP key. It is noted that MAC address filtering is the gateway for hackers to enter and access the facility of a Wireless Local Area Network.

Material and methods

The research was carried out to reveal WLAN Security: Active Attack on WLAN Secure Network
(Identity theft). The work was conducted at the Department of Information Technology, Jagran Institute of Management. The materials used and the procedures employed are as follows. We designed a scenario after understanding the theory of WLAN security with the help of MAC address filtering. We used the Colasoft MAC Scanner 2.2 Pro Demo. The hardware used included an HCL Desktop, Toshiba Laptops, an AP (D-Link 2100 Series Access Point) and a wireless card (D-Link DWA 510).

Cracker system equipped with Colasoft MAC Scanner 2.2 Pro Demo

Click on the Advanced tab, click on Filter, and enter the MAC address of the legitimate user. To find the MAC address, click on Start, click on Run, type cmd and
again type getmac; this command will display the MAC address of the WNIC.

C:\>getmac
1C-AF-F7-0C-CC-8C   \Device\Tcpip_{12361AAF-5538-4489-87B4-C9BB984E1299}

Now we try to connect to the Target_Access_Point wireless network. After that, the system is not connected to the wireless LAN.
Here you can see the highlighted MAC address. This is the identity of the authorized user, namely FAHAD. Now we change the identity with the help of the following process:
Click on Start, go to Control Panel, double click on Network Connections, right click on Wireless Network Connection, click on Configure, and click on Advanced.
Jaya Verma, Student, M.Tech IT (2nd year), G.G.S.I.P.U., Delhi
Sudeepa Roy, Student, M.Tech CS (1st year), Amity University, Noida
Abstract - The power of the WWW comes not simply from static HTML pages, which can be very attractive; the important first step into the WWW is the ability to support those pages with powerful software, especially when interfacing to databases. The combination of attractive screen displays, exceptionally easy to use controls and navigational aids, and powerful underlying software has opened up the potential for people everywhere to tap into the vast global information resources of the Internet [1]. There is a lot of data on the Web, some in databases, and some in files or other data sources. The databases may be semi-structured, or they may be relational, object, or multimedia databases. These databases have to be mined so that useful information is extracted.
While we could use many of the data mining techniques to mine the Web databases, the challenge is to locate the databases on the Web. Furthermore, the databases may not be in the format that we need for mining the data. We may need mediators to mediate between the data miners and the databases on the Web. This paper presents the important concepts of the databases on the Web and how these databases have to be mined to extract patterns and trends.

Keywords - Data Mining, Web Usage Mining, Document Object Model, KDD dataset

I. INTRODUCTION

Data mining is slowly evolving from the simple discovery of frequent patterns and regularities in large data sets toward interactive, user-oriented, on-demand decision support. Since the data to be mined is usually located in a database, there is a promising idea of integrating data mining methods into Database Management Systems (DBMS) [6]. Data mining is the process of posing queries and extracting patterns, often previously unknown, from large quantities of data using pattern matching or other reasoning techniques.

II. CHALLENGES FOR KNOWLEDGE DISCOVERY

Data mining, also referred to as database mining or knowledge discovery in databases (KDD), is a research area that aims at the discovery of useful information from large datasets. Data mining [9] uses statistical analysis and inference to extract interesting trends and events, create useful reports, support decision making, etc. It exploits the massive amounts of data to achieve business, operational or scientific goals. However, based on the following observations, the web also poses great challenges for effective resource and knowledge discovery.
- The web seems to be too huge for effective data warehousing and data mining. The size of the web is in the order of hundreds of terabytes and is still growing rapidly. Many organizations and societies place most of their publicly accessible information on the web. It is barely possible to set up a data warehouse to replicate, store, or integrate all of the data on the web.
- The complexity of web pages is greater than that of any traditional text document collection. Web pages lack a unifying structure. They contain far more authoring style and content variations than any set of books or other traditional text-based documents. The web is considered a huge digital library; however, the tremendous number of documents in this library is not arranged according to any particular sorted order. There is no index by category, nor by title, author, cover page, table of contents and so on.
- The web is a highly dynamic information source. Not only does the web grow rapidly, but its information is also constantly updated. News, stock markets, weather, airports, shopping, company advertisements and numerous other web pages are updated regularly on the web.
- The web serves a broad diversity of user communities. The internet currently connects more than 100 million workstations, and its user community is still rapidly expanding. Most users may not have good knowledge of the structure of the information network and may not be aware of the heavy cost of a particular search.
- Only a small portion of the information on the web is truly relevant or useful. It is said that 99% of the web information is useless to 99% of web users. Although this may not seem obvious, it is true that a particular person is generally interested in only a tiny portion of the web, while the rest of the web contains information that is uninteresting to the user and may swamp the desired results.

These challenges have prompted research into efficient and effective discovery and use of resources on the internet. There are many index-based Web search engines. These search the web, index web pages, and build and store huge keyword-based indices that help locate sets of web pages containing certain keywords [7]. However, a simple keyword-based search engine suffers from several deficiencies. First, a topic of any breadth can easily contain hundreds of thousands of documents. This can lead to a huge number of document entries returned by a search engine, many of which are only marginally relevant to the topic or may contain materials of poor quality. Second, many documents that are highly relevant to a topic may not contain keywords defining them. This is referred to as the polysemy problem. For example, the keyword Oracle may refer to the Oracle programming language, an island in Mauritius, or brewed coffee. So a keyword-based search may not find even the most popular web search engines like Google, Yahoo!, or AltaVista if these services do not claim to be search engines on their web pages.

Since a keyword-based web search engine is not sufficient for web discovery, Web mining should be implemented on top of it. Compared with keyword-based Web search, Web mining is a more challenging task that searches for web structures, ranks the importance of web contents, discovers the regularity and dynamics of web contents, and mines Web access patterns. Web mining can substantially enhance the power of web search and resolve many ambiguities and subtleties raised in keyword-based web search. Web mining tasks can be classified into three categories:

Web content mining
Web structure mining
Web usage mining

III. WEB CONTENT MINING

The concept of web content mining is far wider than searching for any specific term, keyword extraction, or simple statistics of words and phrases in documents. For example, a tool that performs web content mining can summarize a web page so as to avoid the complete reading of a document and save time and energy. Basically there are two models to implement web content mining. The first model is known as the local knowledgebase model. According to this model, the abstract characterizations of several web pages are stored locally. Details of these characterizations vary on different systems [8]. For example, suppose there are three categories of web sites: games, educational and others. References to several web sites relating to these categories are stored in a database. When extracting information, first the category is selected and then a search is performed within the web sites referred to in this category. A query language enables you to query the database consisting of information about various categories at several levels of abstraction. As a result of the query, the system using this model for web content mining may have to request web pages from the web that match the query. The concept of artificial intelligence is heavily used to build and manage the knowledgebase consisting of information on various classes of web sites. The second approach is known as the agent-based model. This approach also applies artificial intelligence systems, known as web agents, that can perform a search on behalf of a particular user for discovering and organizing documents on the web.

IV. WEB USAGE MINING

Web usage mining is the concept of web mining that helps automatically discover user access patterns. For example, suppose four products of a company are sold through the company's web site. Web usage mining analyses the behavior of the customers [8], that is, which product is most popular, which is less popular, which city has the maximum number of customers, and so on.

V. WEB STRUCTURE MINING

Web structure mining denotes analysis of the link structure of the web, and is used for identifying more preferable documents. For example, document A in web site X has a link to document B in web site Y [11]. According to the Web structure mining concept, document B is important to web site X and contains valuable information. Hyperlink Induced Topic Search (HITS) is a common algorithm for knowledge discovery in the web.

VI. MINING THE WEB PAGE LAYOUT STRUCTURE

Compared with traditional plain text, a web page has more structure. Web pages are also regarded as semi-structured data. The basic structure of a web page is its DOM [3] (Document Object Model) structure. The DOM structure of a web page is a tree structure where every HTML tag in the page corresponds to a node in the DOM tree. The web page can be segmented by some predefined structural tags. Useful tags include <P> (paragraph), <TABLE> (table), <UL> (list), <H1>~<H6> (heading), etc. Thus the DOM structure can be used to facilitate information extraction. Figure 1 illustrates an HTML DOM tree example [2]:
Separators denote the vertical or horizontal lines in a web page that visually cross no blocks. Based on the separators, the semantic tree of the web page is constructed. A web page can be represented as a set of blocks (leaf nodes of the semantic tree). Compared with the DOM-based methods, the segments obtained by VIPS are more semantically aggregated.
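The DOM-based segmentation idea described above can be sketched with Python's standard html.parser. This is an illustrative toy, not part of the paper: the class name and the exact tag set are my own choices, mirroring the predefined structural tags listed in Section VI:

```python
from html.parser import HTMLParser

class BlockSegmenter(HTMLParser):
    """Collect text segments keyed by predefined structural tags."""
    BLOCK_TAGS = {"p", "table", "ul", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open block tags
        self.segments = []   # [tag, text] pairs, one per block

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.stack.append(tag)
            self.segments.append([tag, ""])

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        # Attach text to the innermost open structural block, if any
        if self.stack and data.strip():
            self.segments[-1][1] += data.strip()

seg = BlockSegmenter()
seg.feed("<html><body><h1>Title</h1><p>First block.</p>"
         "<ul><li>item</li></ul></body></html>")
```

After `feed`, `seg.segments` holds one (tag, text) entry per structural block, which is exactly the kind of unit a DOM-based extractor would operate on.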
where f is a function that assigns to every block b in page p an importance value. Specifically, the bigger fp(b) is, the more important the block b is. Function f is empirically defined below:

fp(b) = a x (the size of block b) / (the distance between the center of b and the center of the screen)    (1.8)

where a is a normalization factor that makes the sum of fp(b) equal to 1, that is,

Sum over b in p of fp(b) = 1

Note that fp(b) can also be viewed as the probability that the user is focused on the block b when viewing the page p. Some more sophisticated definitions of f can be formulated by considering the background color, fonts, and so on. Also, f can be learned from some prelabeled data (the importance value of the blocks can be defined by people) as a regression problem by using learning algorithms such as support vector machines and neural networks. Based on the block-to-page and page-to-block relations, a new Web page graph that incorporates the block importance information can be defined as

WP = XZ,    (1.9)

where X is a k x n page-to-block matrix, and Z is an n x k block-to-page matrix. Thus WP is a k x k page-to-page matrix.

VIII. CONCLUSION

This paper has presented the details of the tasks that are necessary for performing Web Usage Mining, the application of data mining and knowledge discovery techniques to WWW server access logs [5]. The World Wide Web serves as a huge, widely distributed, global information service center for news, advertisements, consumer information, financial management, education, government, e-commerce, and many other services. It also contains a rich and dynamic collection of hyperlink

REFERENCES
[2] HTML DOM Tutorial, http://www.w3schools.com/htmldom/default.asp
[3] http://www.cs.cornell.edu/home/kleinber/ieee99-web.pdf
[4] Traversing an HTML table with DOM interfaces, https://developer.mozilla.org/en/traversing_an_html_table_with_javascript_and_dom_interfaces
[5] Web Usage Mining, http://maya.cs.depaul.edu/~mobasher/papers/webminer-kais.pdf
[6] Data Mining Within DBMS Functionality, Maciej Zakrzewicz, Poznan University.
[7] Data Mining: Concepts and Techniques, Jiawei Han and Micheline Kamber, http://www.cs.uiuc.edu/~hanj/bk2/
[8] Data Mining, Yashwant Kanetkar.
[9] Databases on web, www.ism-ournal.com/ITToday/Mining_Databases.pdf
[10] Seamless Integration of DM with DBMS and Applications, Hongjun Lu.
[11] Mining the World Wide Web: Methods, Applications, and Perspectives.
[12] Wiki links, http://en.wikipedia.org/wiki/Web_mining
Abstract
In the IEEE 802.11 MAC layer protocol, the basic access method is the Distributed Coordination Function (DCF), which is based on CSMA/CA. In this paper, we investigate the performance of IEEE 802.11 DCF in the non-saturation condition. We assume that there is a fixed number n of competing stations and that the packet arrival process to a station is a Poisson process. We model IEEE 802.11 DCF in non-saturation mode by a 3-dimensional Markov chain and derive the stationary distribution of the Markov chain by applying the matrix analytic method. We obtain the probability generating functions of the packet service time and access delay, and the throughput.
Keywords: DCF, Access delay, throughput.

1. Introduction

In recent years, Wireless Local Area Networks have brought much interest to telecommunication systems. The IEEE 802.11 standards define medium access control protocols. We may classify the arrival pattern of packets to a station into two modes: saturation mode and non-saturation mode. Saturation mode means that stations always have packets to transmit. Non-saturation mode means that stations sometimes have no packets to transmit. Most analytical models proposed so far for the IEEE 802.11 DCF focus on saturation performance. Unfortunately, the saturation assumption is unlikely to be valid in most real IEEE 802.11 networks. We note that most works ignore the effect of the queue at the MAC layer. There have not been many analytic works in the non-saturation mode, due mainly to the analytic complexity of the models; hence the need for an analytic treatment of the performance of IEEE 802.11 in non-saturation mode.

2. Overview of Medium Access Layer

Nowadays, the IEEE 802.11 WLAN technology offers the largest deployed wireless access to the Internet. This technology specifies both the Medium Access Control (MAC) and Physical (PHY) layers [1]. The PHY layer selects the correct modulation scheme given the channel conditions and provides the necessary bandwidth, whereas the MAC layer decides in a distributed manner how the offered bandwidth is shared among all stations (STAs). This standard allows the same MAC layer to operate on top of one of several PHY layers. IEEE 802.11 MAC includes the mandatory contention-based DCF (Distributed Coordination Function) and the optional polling-based PCF (Point Coordination Function) [1]. Most of today's WLAN devices employ only the DCF because of its simplicity and efficiency for the data transmission process. The DCF employs the CSMA/CA (Carrier-Sense Multiple Access with Collision Avoidance) protocol with binary exponential backoff. The DCF is relatively simple while it enables quick and cheap implementation, which is important for the wide penetration of a new technology.

Different analytical models and simulation studies have been elaborated in the last years to evaluate the 802.11 MAC layer performance. These studies mainly aim at computing the saturation throughput of the MAC layer and focus on its improvement. One of the most promising models has been the so-called Bianchi model [2]. It provides closed-form expressions for the saturation throughput and for the probability that a packet transmission fails due to collision. The modeling of the 802.11 MAC layer is an important issue for the evolution of this technology. One of the major shortcomings in existing models is that the PHY layer conditions are not considered. The existing models for 802.11 assume that all STAs have the same physical conditions at the receiving STA (same power, same coding, ...), so when two or more STAs emit a packet in the same slot time, all their packets are lost, which may not be the case in reality when, for instance, one STA is close to the receiving STA and the other STAs are far from it [3]. This behavior, called the capture effect, can be analyzed by considering the spatial positions of the STAs. In [4] the spatial positions of STAs are considered for the purpose of computing the capacity of wireless networks, but only an ideal model for the MAC layer derived from information theory is used. The main contribution of this paper is considering both PHY and MAC layer protocols to analyze
the performance of the existing IEEE 802.11 standard. Our work reuses the model for the 802.11 MAC layer from [6], and extends it to consider interference from other STAs. We compute, for a given topology, the throughput of any wireless STA using the 802.11 MAC protocol with a specific PHY layer protocol. Without losing the generality of the approach, we only consider in this paper traffic flows sent from the mobile STAs in the direction of the AP. The case of bidirectional traffic is a straightforward extension; we omit it to ease the exposition of our contribution. Further, we assume that all STAs use the Distributed Coordination Function (DCF) of 802.11 and that they always have packets to send (case of saturated sources). We present an evaluation of our approach for 802.11b with data rates equal to 1 and 2 Mbps, and the results indicate that it leads to very accurate results.

3. Importance Of Distributed Coordination Function (DCF)

Two forms of MAC layer have been defined in the IEEE 802.11 standard specification, named Distributed Coordination Function (DCF) and Point Coordination Function (PCF). The DCF protocol uses the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) mechanism and is mandatory, while PCF is defined as an option to support time-bounded delivery of data frames. The DCF protocol in the IEEE 802.11 standard defines how the medium is shared among stations. DCF, which is based on CSMA/CA, consists of a basic access method and an optional channel access method with request-to-send (RTS) and clear-to-send (CTS) exchanged as shown in Fig. 1. When the backoff timer reaches zero, the source transmits the data packet. The ACK is transmitted by the receiver immediately after a period of time called SIFS (Short Inter Frame Space), which is less than DIFS. When a data packet is transmitted, all other stations hearing this transmission adjust their Network Allocation Vector (NAV), which is used for virtual CS at the MAC layer. In the optional RTS/CTS access method, an RTS frame should be transmitted by the source, and the destination should accept the data transmission by sending a CTS frame prior to the transmission of the actual data packet. Note that STAs in the sender's range that hear the RTS packet update their NAVs and defer their transmissions for the duration specified by the RTS. Nodes that overhear the CTS packet update their NAVs and refrain from transmitting. This way, the transmission of the data packet and its corresponding ACK can proceed without interference from other nodes (hidden node problem).

Table 1 shows the main characteristics of the IEEE 802.11a/b/g physical layers. 802.11b radios transmit at 2.4 GHz and send data up to 11 Mbps using Direct Sequence Spread Spectrum (DSSS) modulation, whereas 802.11a radios transmit at 5 GHz and send data up to 54 Mbps using Orthogonal Frequency Division Multiplexing (OFDM) [1]. The IEEE 802.11g standard [1] extends the data rate of IEEE 802.11b to 54 Mbps in an upgraded PHY layer named the extended rate PHY layer (ERP).
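As an illustrative aside (not part of the paper's analytical model), the binary exponential backoff that DCF relies on can be sketched in a few lines of Python. The contention-window bounds below are the usual 802.11b values, and the function names are my own:

```python
import random

CW_MIN, CW_MAX = 31, 1023  # typical 802.11b contention window bounds

def draw_backoff(retries: int, rng: random.Random) -> int:
    """Draw a backoff counter uniformly from [0, CW], where CW starts at
    CW_MIN and doubles after every failed transmission, capped at CW_MAX."""
    cw = min((CW_MIN + 1) * (2 ** retries) - 1, CW_MAX)
    return rng.randint(0, cw)

rng = random.Random(42)
# Contention windows grow 31, 63, 127, 255, 511, 1023 over successive retries.
draws = [draw_backoff(r, rng) for r in range(6)]
```

Doubling the window after each collision is what spreads competing stations' retransmissions apart in time.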
and Start Frame Delimiter (SFD). The PLCP Header contains the following fields: Signal, Service, Length, and CRC. The short PLCP preamble and header may be used to minimize overhead and thus maximize the network data throughput. Note that the short PLCP header uses 2 Mbps with DQPSK modulation, and a transmitter using the short PLCP can interoperate only with receivers capable of receiving this short PLCP format. In this paper we suppose that all stations use the long PPDU format in 802.11b. We evaluate our model in 802.11b where STAs use transmission rates equal to 1 and 2 Mbps. Our model can be employed for all other transmission modes of all standards if the packet error rate is calculated. In this paper, we assume that the noise over the wireless channel is white Gaussian with spectral density equal to N0/2. In our model we define N0 as the power of the thermal noise. The bit error probability for BPSK is:

Pb,BPSK = Q(sqrt(2 Eb / N0))    (2)

and for QPSK (4-QAM) is:

Pb,QPSK = Q(sqrt(2 Eb / N0)) - (1/2) Q^2(sqrt(2 Eb / N0))    (3)

4. Conclusion

There have been various attempts to model and analyze the saturation throughput and delay of the IEEE 802.11 DCF protocol since the standards were proposed. As explained in the introduction, there are different analytical models and simulation studies that analyze the performance of the 802.11 MAC layer. As an example, Foh and Zuckerman present the analysis of the mean packet delay at different throughputs for IEEE 802.11 MAC. Kim and Hou analyze the protocol capacity of IEEE 802.11 MAC with the assumption that the number of active stations having packets ready for transmission is large. They have suggested some extensions to the proposed model to evaluate the packet delay, the packet drop probability and the packet drop time. In our model we have used Bianchi's model and its proposed extension.
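As a quick numerical cross-check of bit-error expressions of the form used in (2) and (3) (this sketch is mine, not from the paper; Q is computed from the complementary error function):

```python
import math

def q_func(x: float) -> float:
    """Gaussian tail probability: Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def ber_bpsk(eb_n0: float) -> float:
    """Pb = Q(sqrt(2 Eb/N0)), as in expression (2)."""
    return q_func(math.sqrt(2 * eb_n0))

def ber_qpsk(eb_n0: float) -> float:
    """Pb = Q(sqrt(2 Eb/N0)) - 0.5 * Q(sqrt(2 Eb/N0))**2, as in expression (3)."""
    qv = q_func(math.sqrt(2 * eb_n0))
    return qv - 0.5 * qv * qv
```

At Eb/N0 = 0 both expressions reduce to combinations of Q(0) = 0.5, and the error probability falls monotonically as the signal-to-noise ratio grows, as expected.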
Authors profile
Bhanu Prakash Battula received a Master's degree in Engineering (Computer Science & Technology) in 2008 from Acharya Nagarjuna University and also received another Master's degree in Computer Applications from Acharya Nagarjuna University. After post-graduation, he has been working as an Asst. Professor in the Department of Computer Science and Engineering at Vignan's Nirula Institute of Technology and Science, Guntur, Andhra Pradesh. He has published papers in international journals. His research interests include Computer Security, Steganalysis and Image Processing.
* Department of Computer Science, University of Petroleum and Energy Studies, Dehradun, India.
Abstract - The paper highlights intelligent Urban Traffic control using a Neuro-Genetic Petrinet. The combination with a genetic algorithm provides dynamic change of weights for faster learning and convergence of the Neuro-Petrinet.
Keywords - Neuro Petrinet, Urban Traffic Systems, Genetic Algorithm.

I. INTRODUCTION

The previously developed models for vehicular studies only considered a limited macro mobility, involving restricted vehicle movements, while little or no attention was paid to micro mobility and its interaction. The research community could not provide a realistic environment [6] for modeling urban traffic which could simulate close to real-time situations. Our paper extends the concept of Li, M and Change's work on agent-oriented urban traffic simulation using interaction agents in the control and management of urban traffic systems. We use the concept of Neuro-Genetic Networks on a self-organizing Petrinet to simulate the traffic condition.

II. LITERATURE SURVEY

The dynamics of the Urban Traffic System [4] was observed by Tzes, Kim and McShane [8], which explains the timing plans of the traffic-controlling junctions. An example of coloured Petrinet modeling of traffic lights was proposed by Jenson [5]. Later on, Darbari [2] and Medhavi also developed traffic light control by Petrinet.

A. Petrinet

Most recently, List and Cetin [7] discussed the use of PNs in modeling traffic signal controls and performed a structural analysis of the control PN model by P-invariants, demonstrating how such a model enforces traffic operations safety. List and Cetin [7] proposed a different colour scheme for each vehicle entering the system; they modelled it by defining appropriate subnets modeling links at the intersections.

III. BASICS OF PETRINET MODEL APPLICATION IN URBAN TRAFFIC MODELING

To start with, we describe a simple pattern of PN using an event relationship diagram. It shows that event e1 can cause event e2 within a time period [I1, I2], where T represents a Transition.

Figure 1 : Simple Petrinet Representation

A. Dynamics of Producer-Consumer Petrinet

The algorithm for the Dynamic Producer-Consumer is given as:
Step 1 : Initialise each of the Producer-Consumer situations (x). Set the pattern rate as 'r'.
Step 2 : Set the control centre such that: Xi := Si
Step 3 : Let the token release rate be given as 1/N, where N is defined as the number of producer-consumer initial states.
Step 4 : The release of tokens is updated as: x (producer - old) = x (producer - new state) + r
Step 5 : Stop when the system has transferred all the tokens and traffic reaches a balancing state.

Assuming the initializing condition to be Xi, after successive training it reaches to 9. The stabilising condition is reached after 'n' iterations, given as:
{xi = t(x1 .. xn) | i in {1..n}}
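One possible reading of Steps 1-5 can be sketched in Python. This is my own illustrative interpretation: the update rule, names, and convergence test are assumptions, since the algorithm in the paper is only outlined:

```python
def producer_consumer(tokens, r, max_iter=200, eps=1e-9):
    """Release tokens at rate 1/N per step (Step 3) and add the pattern
    rate r (Step 4), stopping once the states balance (Step 5)."""
    n = len(tokens)  # N: number of producer-consumer initial states
    x = list(tokens)
    for _ in range(max_iter):
        x = [xi - xi / n + r for xi in x]  # hand off 1/N of each state, add r
        if max(x) - min(x) < eps:
            break
    return x
```

Under this rule every state converges to the fixed point x = N * r, which plays the role of the balanced traffic condition in Step 5.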
Figure 2 : Process Representation of Petrinet with 0/1 Learning rate in N-Dimensions.

We can define the movement of tokens and the 0/1 learning rate by a single recursive equation as:

Si = in(0) (X || out(0)) + in(1) (X || out(1))    (1)

The process graph of the Neural Petrinet Framework represents a bisimilar relationship in recursive mode. The recursion is achieved by using a Genetic Algorithm [1] with a learning rate of modes [n1 .. nn] within a particular time frame. Let the total function be defined as:

nt = f(nt-1, nt-2, nt-3, .., nt-M)    (2)

which predicts the current value of the nth node from past input conditions.

The nodes which learn first will survive, and based on them the traffic control network will converge. The Fitness [3] value (F.V.) is defined as:

V. CONCLUSIONS

The paper presents the dynamic control strategy of an Urban Traffic System by combining a Neuro-genetic approach on Petrinets. The genetic learning method performs rule discovery for a larger system, with rules fed into a conventional system. The main idea of using genetic algorithms with a neural network is to use the genetic algorithm to search for the appropriate weight change in the neural network, which optimizes the learning rate of the entire network.
A good Genetic Algorithm can significantly help the neuro-Petrinet in aligning with the traffic conditions, which otherwise is a very complex issue.

REFERENCES
[1] Baker, B.M. (2003), "A genetic algorithm for the Vehicle Routing Problem", Computers and Operations Research, Vol. 30.
[2] Darbari, M., Medhavi, S., "N-Dimensional Self Organizing Petrinet for Urban Traffic Modeling", IJCSI, Issue 4, No. 2.
[3] Deng, P.S. (2000), "Coupling Genetic Algorithm and Rule Based Systems for Complex Decisions", Expert Systems with Applications, Vol. 19, No. 3.
[4] Grupe, F.H. (1998), "The Applications of Case-Based Reasoning to the Software Development Process", Information and Software Technology, Vol. 40, No. 9.
Shefalika Ghosh Samaddar (3), Arun K. Misra (4)
(1,3,4) The first, second and third authors are thankful to the Information Security Education & Awareness Project (ISEA) of MCIT, Department of Information Technology, Govt. of India, for the partial support to the research conducted.

1.1.1 Host Based Vulnerability Assessment
Vulnerability Assessment is to identify what systems are alive within the network ranges for host-based threats and what services they offer. Identifying the location of the establishment and cataloging its services are the two main elements of Vulnerability Assessment. Assessment of vulnerability may lead to the deletion of a number of viruses, worms and Trojan horses.

A Trojan horse program (also known as a back door program) acts as a stealth server that allows intruders to take control of a remote computer without the owner's knowledge. The Greek mythical Trojan horse is analogous in attributes to what these digital Trojan horses possess. These programs typically masquerade as benign programs and rely on gullible users to install them. Computers that have been taken over by a Trojan horse program are sometimes referred to as zombies. Armies of these zombies can be used to launch crippling attacks against Web sites.

Communication-based vulnerabilities are real-time threats to computer security. They may take the form of physical attacks, pilfered passwords, nosy network neighbors, and viruses, worms, and other hostile programs. A number of manifestations of such vulnerability are seen these days, e.g. Denial of Service (DoS) attacks. A denial-of-service (DoS) attack hogs or overwhelms a system's resources so that it cannot respond to service requests. A DoS attack can be effected by flooding a server with so many simultaneous connection requests that it cannot respond. Another approach would be to transfer huge files to a system's hard drive, exhausting all its storage space. A related attack is the distributed denial-of-service (DDoS) [1].

The Security Threat and the Response attack is also an attack on a network's resources. It is launched from a large number of other host machines. Attack software is installed on these host computers, unbeknownst to their owners, and then activated simultaneously to launch communications to the target machine of such a magnitude as to overwhelm the target machine.

Figure 2 DoS Attack

Ping of Death is another flavour (Figure-1, Figure-2) of DDoS. A Smurf Attack involves using IP spoofing and ICMP to saturate a target network with traffic; it is then equivalent to launching a DoS attack. It consists of three elements: the source site, the bounce site, and the target site. The attacker (the source site) sends a spoofed ping packet to the broadcast address of a large network (the bounce site). This packet, modified by the intruder, contains the address of the target site. This causes the bounce site to broadcast the misinformation to all of the devices on its local network. All of these devices now respond with a reply to the target system, which is then saturated with those replies.

Spam is another malicious formulation in the arena of cyber crime. Responses to spam may lead to huge financial and material loss. Spam has the format of e-mail messages that are pushed to e-mail clients without their solicitation.

2.0 Related Work

The vulnerability assessment process is comprised of four phases, namely discovery, detection, exploitation, and analysis/recommendations [2]. Figure 3 identifies the relationships among the four phases, and the flow of information into the final report.
Flaws) is the other type of attack. Malicious use of the Domain Name Service (DNS) and Internet routing protocols leads to DoS. Many DoS attacks exploit inherent weaknesses in core Internet protocols. This makes them practically impossible to prevent, since the protocols are embedded in the underlying network technology and adopted as standards worldwide. Today, even the best countermeasure software can only provide a limiting effect on the severity of an attack [7]. An ideal solution to DoS will require changes in the security and authentication of these protocols [6].
Figure-4 Capturing a HTTP based mail Password

Figure 6 shows that the username is arun and the password is cracker, which is given next to the secret key. This is also shown in the packet bytes pane on the right hand side of the HEX numbers (Figure 4). Sometimes, when the password of a

Wireshark is able to capture the username and password of a mail user in the same way it does for message websites like www.160by2.com or www.way2sms.com. Figure-6 shows the capturing of a message packet being sent from the message website www.160by2.com. This figure shows that the user whose IP address
is 172.31.132.59, when he logs into the message website, the packet is sent to the destination IP address 172.31.100.29, which captures the HTTP packet; the corresponding information is given in info POST as http://www.160by2.com/logincheck.

GUI. The default tshark output is shown below in figure-9.
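To see why such credentials are readable at all, note that a form login over plain HTTP travels as an unencrypted URL-encoded body. The sketch below (standard library only; the body and field names are hypothetical, not taken from the actual capture) parses such a payload the way an analyst would after copying it from the packet bytes pane:

```python
from urllib.parse import parse_qs

# Hypothetical captured HTTP POST body, as it might appear in the
# packet-bytes pane; the field names are illustrative only.
captured_body = "username=arun&password=cracker"

fields = parse_qs(captured_body)
print(fields["username"][0], fields["password"][0])
```

With HTTPS the same body would be encrypted in transit, which is why the captures above only work against plain-HTTP logins.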
Figure-19 Ping command to check whether a system is alive or not
Figure-21 Shell script preventing DoS
IJCSIInternationalJournalofComputerScienceIssues,Vol.8,Issue3,No.1,May2011 584
ISSN(Online):16940814
www.IJCSI.org
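The prevention script of Figure-21 is not legible in this text. As an illustration only of the general shape such a script takes, the sketch below flags source IPs holding an excessive number of half-open connections; the threshold, helper name and commands are this sketch's assumptions, not the paper's script:

```shell
#!/bin/sh
# Illustrative sketch (not the paper's Figure-21 script): flag source IPs
# that hold more than MAX half-open connections. On a live host the IP
# list would come from something like:
#   netstat -ant | awk '/SYN_RECV/ {print $5}' | cut -d: -f1
MAX=20

# Reads one source IP per line on stdin; prints IPs exceeding the threshold.
flag_flooders() {
    sort | uniq -c | awk -v max="$MAX" '$1 > max {print $2}'
}

# Each flagged IP could then be dropped (requires root), e.g.:
#   iptables -A INPUT -s "$ip" -j DROP
```

The counting is done in user space; the actual blocking decision is delegated to the kernel firewall, which is why such scripts are only a limiting countermeasure, as noted above.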
increased from 1 Hz to 4 Hz, the P-100 latency increases by 4.8 ms. At faster rates the waveforms become less distinct, and stimulation above 8-10 Hz results in a steady-state VEP. The VEP is not influenced by the direction of pattern shift.

Fig 1. Basic VEP: different parts

2. Methods of Visual Evoked Potential
integrate the data obtained from different measurements. Medical imaging registration for data of the same patient taken at different points in time often additionally involves elastic registration to cope with elastic deformations of the imaged body parts. The original image is often referred to as the reference image, and the image to be mapped onto the reference image is referred to as the target image. Image-similarity-based methods are broadly used in medical imaging. A basic image-similarity-based method consists of a transformation model, which is applied to reference image coordinates to locate their corresponding coordinates in the target image space; an image similarity metric, which quantifies the degree of correspondence between features in both image spaces achieved by a given transformation; and an optimization algorithm, which tries to maximize image similarity by changing the transformation parameters.

The choice of an image similarity measure depends on the nature of the images to be registered. Common examples of image similarity measures include cross-correlation, mutual information, mean square difference and ratio image uniformity. Mutual information and its variant, normalized mutual information, are used for registration of multimodality images. Cross-correlation, mean square difference and ratio image uniformity are commonly used for registration of images of the same modality.

4. Discrete wavelet transform (DWT)

If the function being expanded is discrete (i.e., a sequence of numbers), the resulting coefficients are called the discrete wavelet transform (DWT). If f(n) = f(x0 + n·Δx) for some x0, Δx and n = 0, 1, 2, ..., M-1, the wavelet series expansion coefficients for f(x), defined by

c_j0(k) = <f(x), φ_j0,k(x)> = ∫ f(x) φ_j0,k(x) dx    (1)

d_j(k) = <f(x), ψ_j,k(x)> = ∫ f(x) ψ_j,k(x) dx    (2)

become the forward DWT coefficients for the sequence f(n):

W_φ(j0,k) = (1/√M) Σ_n f(n) φ_j0,k(n)    (3)

W_ψ(j,k) = (1/√M) Σ_n f(n) ψ_j,k(n),  for j ≥ j0    (4)

The φ_j0,k(n) and ψ_j,k(n) in these equations are sampled versions of the basis functions φ_j0,k(x) and ψ_j,k(x); that is, φ_j0,k(n) = φ_j0,k(x_s + n·Δx_s) for some x_s and Δx_s, with equally spaced samples over the support of the basis functions. In accordance with these equations, the inverse transform is

f(x) = (1/√M) Σ_k W_φ(j0,k) φ_j0,k(x) + (1/√M) Σ_{j≥j0} Σ_k W_ψ(j,k) ψ_j,k(x)    (5)

Normally we let j0 = 0 and select M to be a power of 2, so that the summations in equations (3) through (5) are performed over n = 0, 1, 2, ..., M-1, j = 0, 1, 2, ..., J-1 and k = 0, 1, 2, ..., 2^j - 1. The W_φ(j0,k) and W_ψ(j,k) in equations (3) to (5) correspond to the c_j0(k) and d_j(k) of the wavelet series expansion. Note that the integrations of the series expansion have been replaced by summations and a 1/√M normalizing factor, reminiscent of the DFT.

Using equations (3) through (5), consider the discrete function of four points in the VEP study, i.e. f(0), f(1), f(2) and f(3), where f(0) is the checkerboard pattern reversal, f(1) is the checkerboard flash, f(2) is the LED Goggles pattern reversal and f(3) is the LED Goggles flash stimulation. These four points are taken as f(0) = 1, f(1) = 4, f(2) = -3 and f(3) = 0. Because M = 4 and J = 2, with j0 = 0, the summations are performed over x = 0, 1, 2, 3, with j = 0, 1 and k = 0 for j = 0, or k = 0, 1 for j = 1.

We will use the Haar scaling and wavelet functions and assume that the four samples of f(x) are distributed over the support of the basis function, which is 1 in width. Substituting the four samples into equation (3), we find that

W_φ(0,0) = (1/2) Σ_n f(n) φ_0,0(n)
         = (1/2)[107.5 ms + 113.1 ms - 113.1 ms + 116.9 ms]
         = (1/2)[224.8 ms]
         = 112.4 ms

Because φ_0,0(n) = 1 for n = 0, 1, 2, 3, note that we have employed uniformly spaced samples of the Haar transformation matrix. Therefore the P-100 latencies of the four-point stimulations of the DWT are uniformly spaced samples of the scaling and wavelet functions used in the computation of the inverse.

The four-point DWT in the VEP P-100 latency measurement is a two-scale decomposition of f(x), i.e. j = {0, 1}. The underlying assumption was that the starting scale j0 was zero, but other starting scales are possible.

5. Experimental procedure

The VEP recordings were performed in a dark and sound-attenuated laboratory room. The subject was asked to sit comfortably in front of the checkerboard pattern at an eye-screen distance of 100 cm. The preferred stimulus for clinical investigation of the visual pathways is a reversal of a black and white checkerboard pattern, as it tends to evoke larger and clearer responses than other patterns.
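The four-point computation above can be checked numerically. The sketch below applies equations (3)-(5) with the sampled Haar basis to the text's example sequence f(0)=1, f(1)=4, f(2)=-3, f(3)=0; the explicit matrix layout is this sketch's assumption, not reproduced from the paper:

```python
import numpy as np

# Sampled Haar basis for M = 4, j0 = 0: rows are phi_{0,0}, psi_{0,0},
# psi_{1,0}, psi_{1,1} evaluated at n = 0..3.
s2 = np.sqrt(2.0)
H = np.array([
    [1.0,  1.0,  1.0,  1.0],   # phi_{0,0}(n)
    [1.0,  1.0, -1.0, -1.0],   # psi_{0,0}(n)
    [ s2,  -s2,  0.0,  0.0],   # psi_{1,0}(n)
    [0.0,  0.0,   s2,  -s2],   # psi_{1,1}(n)
])

f = np.array([1.0, 4.0, -3.0, 0.0])   # f(0)..f(3) from the text

# Forward DWT, eqs. (3)-(4): W = (1/sqrt(M)) * sum_n f(n) * basis(n)
W = H @ f / np.sqrt(4.0)

# Inverse, eq. (5): the same orthogonal basis reconstructs f exactly
f_rec = H.T @ W / np.sqrt(4.0)

print("DWT coefficients:", W)   # W_phi(0,0) = 1, W_psi(0,0) = 4, ...
print("reconstruction:  ", f_rec)
```

The 1/√M factor appears in both directions, matching the normalization of equations (3) through (5).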
The stimulus pattern was a black and white checkerboard displayed on a computer screen. The checks alternate from black/white to white/black at a rate of approximately twice per second. The subject was instructed to gaze at a colored dot at the center of the checkerboard pattern. Every time the pattern alternates, the patient's visual system generates an electrical response, which was recorded using electrodes. Signal acquisition and stimulus presentation were synchronized using a software program. The starting point of the VEP waveform is the stimulus onset. The VEP waveform is recorded over a period of 250 ms. More than 100 epochs were averaged to ensure a clear VEP waveform. For judging reproducibility, the waveform is recorded twice and superimposed.

Fig 2. B/W checkerboard (pattern reversal)
Fig 3. B/W checkerboard (flash)
Fig 4. LED Goggles (pattern reversal)
Fig 5. LED Goggles (flash)

Typical averaged retinal responses for the different stimulations, B/W checkerboard (pattern reversal), B/W checkerboard (flash), LED Goggles (pattern reversal) and LED Goggles (flash), are recorded with the same subject, and the variability of the P-100 latencies and amplitudes is noted in the superimposed waveforms and results. The VEP signal has been labeled to indicate the N75, P100 and N145 marks, the corresponding latencies for the subject being 83.1 ms, 107.5 ms and 175 ms for the B/W pattern reversal checkerboard; 77.5 ms, 113.1 ms and 151.9 ms for the B/W flash checkerboard stimulation; 82.5 ms, 116.9 ms and 155.6 ms for LED Goggles pattern reversal; and 80.0 ms, 113.1 ms and 151.9 ms for LED Goggles flash stimulation.

Table 1: Developed accuracy of P-100 latency measurement

Montage   Tr   N75 (ms)   P100 (ms)   N145 (ms)
Oz-Fz     1               112.4
Oz-Fz     2               112.4
6. Results and conclusion

Finally, all the potential transforms between images are generated, with the correct registration producing an accurate P-100 latency measurement of 112.4 ms with the different types of retinal waves, as shown in the table above.
Lecturer, Computer Science Department, Kumaon University, Shriram Institute of Management & Technology, Kashipur (Udham Singh Nagar), Uttarakhand 244713, India
Associate Professor, Computer Science & Engg. Department, HNB Garhwal University, Srinagar (Garhwal), Uttarakhand 246174, India
3. WiMAX Technology Forecast

Wireline technologies are slow and costly to roll out, even in some parts of developed nations. Cellular technology is often too costly to use, does not deliver true broadband speed, and does not scale to the capacity of an all-IP media-centric network. Therefore it is assumed that, throughout the forecast period, particularly aggressive WiMAX growth [2] will take place in countries such as Brazil, China, India and Russia, and in regions such as the Americas, Middle East/Africa, Eastern Europe and Developing Asia Pacific.

Fig 2: WiMAX Users by Region 2007-2012 (millions; North America, Americas, Asia Pacific, Europe and Africa/Middle East)
Table 2: WiMAX Users by Region (millions) 2007-2012
Users = subscribers adjusted to reflect multiple users per subscription.

Region               2007   2008   2009   2010    2011    2012
North America        2.61   4.03   6.25   9.59    14.79   22.62
Americas             0.66   1.18   2.14   3.92    7.17    12.97
Asia Pacific         1.39   2.84   5.99   12.96   28.17   60.45
Europe               1.35   2.34   4.07   7.08    12.23   21.01
Africa/Middle East   0.30   0.65   1.46   3.32    7.50    16.60
TOTAL                6.32   11.04  19.91  36.88   69.87   133.66

Europe is anticipated to have the largest number of operators, followed by Asia Pacific, Africa/Middle East, the Americas and North America. However, Africa/Middle East is expected to have the highest number of WiMAX operator countries, followed by Europe, the Americas, Asia Pacific and North America.
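The growth implied by Table 2 can be made explicit. The short sketch below computes the compound annual growth rate (CAGR) each region's user forecast implies between 2007 and 2012; the helper names are illustrative, and the figures are taken directly from the table:

```python
# Implied compound annual growth rate (CAGR) per region, from the 2007
# and 2012 user figures (millions) in Table 2.
users_2007_2012 = {
    "North America":      (2.61, 22.62),
    "Americas":           (0.66, 12.97),
    "Asia Pacific":       (1.39, 60.45),
    "Europe":             (1.35, 21.01),
    "Africa/Middle East": (0.30, 16.60),
}

def cagr(start, end, years=5):
    """Compound annual growth rate over the 2007-2012 span."""
    return (end / start) ** (1 / years) - 1

for region, (start, end) in users_2007_2012.items():
    print(f"{region}: {cagr(start, end):.1%} per year")
```

Even the slowest region (North America) implies better than 50% annual growth, which is why the forecast characterizes WiMAX adoption as particularly aggressive.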
4. WiMAX: Country Growth, Operators & Competing Technologies

Fig 4: Average WiMAX Users by Operator & Country 2007-2012

The numbers of WiMAX operators and countries shown in Figure 3 are those in which WiMAX service has commenced. Operators currently deploying but not yet live are also accounted for in the forecasts, along with other operators expected to adopt WiMAX technology in the future.
5. Conclusion
The purpose of this forecast is to provide the WiMAX Forum's prediction of the ecosystem's worldwide growth over the next five years. The forecast covers WiMAX deployments globally and is broken down by major regions: North America, Asia-Pacific, Europe, and Middle East/Africa. It also includes major country or sub-regional breakouts for the USA (United States of America), Canada, Japan, China, Korea, India, the Rest of Asia-Pacific Developed, the Rest of Asia-Pacific Developing, Western Europe, Eastern Europe, Africa and the Middle East.
Assumptions
Worldwide access to Broadband Internet is vital for economic
growth and development. All governments must work to ensure
that their nations are able to realize the benefits associated with
a strong communications infrastructure. Therefore this report
assumes that many countries will adopt WiMAX as a wireless
Broadband Internet technology to facilitate rapid economic
development. It is also assumed that the move to WiMAX, a
technology that is ready for deployment now, will be preferable
to waiting for alternative technologies that may not be available
for three or more years.
The growth of WiMAX technology can be anticipated because we have seen the rapid growth of related technologies, WiMAX user growth, worldwide WiMAX operator growth, and average WiMAX users by operator and country from 2007 to 2012, as well as the competing technologies, shown in Fig 1, Fig 2, Fig 3, Fig 4 and Fig 5 respectively.
So we can conclude that WiMAX operators and products have a vital role in the growth of countries and their operators, providing cost-effective reach to millions of users of traditional voice and broadband data services.
References

2, 3 Electronics Department, Dr. RML Awadh University, Faizabad, Uttar Pradesh, India
These devices have successfully been made in large arrays and solder-bonded to the circuits. Multiple quantum well (MQW) modulators also offer an advantage over other light emitters in terms of signal and clock distribution. Furthermore, the electrical signals can be sampled with short optical pulses to improve the performance of receivers. An MQWM-based link requires that an external beam be brought onto the modulator. This makes it possible to generate and control one master laser beam, which allows centralized clocking of the entire system, and the use of modulators, as described above, allows the retiming of signals, especially if the master laser operates with relatively short optical pulses. Thus the QWM-based approach, besides yielding lower transmitter on-chip power dissipation, can be more conducive to monolithic integration. This was the motivation for simulating an MQWM-based optical interconnect link.

3. Modeling and Simulation methodology

In this section we describe the methodology used for modelling and simulation of the optical interconnect transmitter. The simulated laser diode is an InGaAs/AlGaAs/GaAs quantum-well separate confinement heterostructure. We considered only the internal parasitics, assuming a low-parasitics assembly scheme. The simulated modulator structure is reflective mode (RMQWM).

For simulation of the dynamic response of the MQW laser, a rate equation model has been used [7]. In this model we have not included the effect of carrier dynamics in the quantum wells, yielding the usual set of rate equations, in which ... is the phase of the optical field, I is the injection current, q is the electronic charge, ... is the carrier density in the quantum wells for the reference bias level, and ... is the power output. The physical meaning and values of the various other coefficients can be found in ref [7].

The simulated laser power output was then fed to the modelled integrated surface-normal reflective electroabsorption MQW modulators. Quantum well absorption data for three quantum wells is taken from the literature, for a well width of 95 Å and an Al0.3Ga0.7As barrier thickness of 30 Å. An electroabsorption modulator using the quantum-confined Stark effect is formed by placing an absorbing quantum well region in the intrinsic layer of a pin diode. Doing so creates the typical p-i-n photodiode structure and enables large fields to be placed across the quantum wells without inducing large currents. By applying a static reverse bias across the diode, photogenerated carriers are efficiently swept out of the intrinsic region and the device acts as a photodetector. Varying this bias causes a modulation in the optical absorption, resulting in an optical modulator.

The modulator is characterized by its capacitance, Insertion Loss (IL) and Contrast Ratio (CR). An ideal modulator has minimum optical power loss during the "on" state (IL), and the largest possible optical power ratio between the "on" and "off" states (CR). Typically, there is a trade-off between these parameters for a given value of the ratio between maximum and minimum absorption. The IL/CR relation for a simple RMQW structure in a reverse-biased PIN configuration is given by equation (5), in which ... is a dimensionless efficiency factor, ... the modulator responsivity, ... the input laser power to the modulator, ... the pre-bias voltage and ... the supply voltage, small compared to the static power of the modulator.

4. Model description and Results

Simulation was carried out in two stages. In the first stage the rate equations were implemented in Simulink, as shown in fig 1. The laser power output was then coupled to the external modulator. The Simulink model of the MQWM modulator is shown in fig 2. The simulated laser diode photon density for a 1 ns pulse is shown in Figure 3. The simulated power output response of the MQWM modulator is shown in fig 4. The simulated optical photon density output of the MQWM modulator with ramp input and bias current = 2 mA is shown in fig 5. The minimum interconnect power is observed as a function of bit rate. We further study the change in the minimum interconnect power as a function of the parameter X, which is dictated by bias current. It was observed that the response of the model worsens with increase in bias current. We have not included the effect of pattern jitters and crosstalk. All the simulations were run over a time period that was several orders of magnitude longer than the fixed step size chosen, so that turn-on transient effects that happen near threshold are avoided. All simulations were carried out using the standard 4th-order Runge-Kutta algorithm with a fixed step size.
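The fixed-step Runge-Kutta integration of the laser rate equations described above can be sketched as follows. The parameter values are typical textbook numbers for a quantum-well laser, NOT the coefficients of ref [7], so the output is only illustrative of the turn-on behavior:

```python
# Sketch: single-mode laser-diode rate equations (carrier density N,
# photon density S) integrated with a fixed-step 4th-order Runge-Kutta
# scheme. All parameter values below are illustrative assumptions.
q = 1.602e-19      # electronic charge (C)
V = 1e-16          # active-region volume (m^3)
tau_n = 2e-9       # carrier lifetime (s)
tau_p = 2e-12      # photon lifetime (s)
g0 = 3e-12         # differential gain coefficient (m^3/s)
N0 = 1e24          # transparency carrier density (m^-3)
Gamma = 0.3        # optical confinement factor
beta = 1e-4        # spontaneous-emission coupling factor
eps = 1e-23        # gain-compression factor (m^3)

def derivs(N, S, I):
    """Carrier and photon density rates for injection current I (A)."""
    gain = g0 * (N - N0) / (1 + eps * S)
    dN = I / (q * V) - N / tau_n - gain * S
    dS = Gamma * gain * S - S / tau_p + beta * Gamma * N / tau_n
    return dN, dS

def rk4_step(N, S, I, dt):
    """One classical fixed-step RK4 update of (N, S)."""
    k1N, k1S = derivs(N, S, I)
    k2N, k2S = derivs(N + 0.5*dt*k1N, S + 0.5*dt*k1S, I)
    k3N, k3S = derivs(N + 0.5*dt*k2N, S + 0.5*dt*k2S, I)
    k4N, k4S = derivs(N + dt*k3N, S + dt*k3S, I)
    return (N + dt*(k1N + 2*k2N + 2*k3N + k4N)/6,
            S + dt*(k1S + 2*k2S + 2*k3S + k4S)/6)

N, S = 0.0, 0.0
dt, I = 1e-13, 25e-3      # 0.1 ps step, 25 mA drive (above threshold)
for _ in range(100000):   # 10 ns: long enough to pass turn-on transients
    N, S = rk4_step(N, S, I, dt)
print(f"steady state: N = {N:.3e} m^-3, S = {S:.3e} m^-3")
```

Running well past the turn-on transient before taking readings mirrors the simulation practice described in the text, where the run length is several orders of magnitude longer than the fixed step size.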
Fig. 5: Simulated photon density of the MQWM modulator with ramp input and bias current = 2 mA

5. Conclusions

The work describes a methodology to model, simulate and then optimize the MQWM-based optical interconnect transmitter power output with respect to various parameters, namely contrast ratio, insertion loss and bias current. The methodology presented here is suitable for investigation of both analog and digital modulation performance, but it primarily deals with digital modulation. The modulator was simulated on the MATLAB Simulink tool and the model response was obtained for 1-20 Gbps bit rates. The simulated model can achieve error-free operation under a 16 Gbps data rate. It was observed that the modulator output worsens with increase in bias current. These results are based on simplified cases excluding pattern jitters, crosstalk and the effect of carrier charge density in the multiple quantum wells. However, the effects of pattern jitters and the bandwidth limits of each device will become increasingly important as the density of an interconnect array becomes higher. These are subjects for further study. The model can be further improved by addressing these issues.

Acknowledgments

References

[1] David A. B. Miller, "Rationale and Challenges for Optical Interconnects to Electronic Chips", Proceedings of the IEEE, Vol. 88, No. 6, June 2000.
[2] Daniel S. Chemla, David A. B. Miller, Peter W. Smith, "Room Temperature Excitonic Nonlinear Absorption and Refraction in GaAs/AlGaAs Multiple Quantum Well Structures", IEEE Journal of Quantum Electronics, Vol. QE-20, No. 3, March 1984.
[3] Samuel Palermo, Azita Emami-Neyestanak, and Mark Horowitz, "A 90 nm CMOS 16 Gb/s Transceiver for Optical Interconnects", IEEE Journal of Solid-State Circuits, Vol. 43, No. 5, May 2008.
[4] Azad Naeemi, Reza Sarvari, and James D. Meindl, "Performance Comparison Between Carbon Nanotube and Copper Interconnects for Gigascale Integration", IEEE Electron Device Letters, Vol. 26, No. 2, February 2005.
[5] Y. Liu et al., "Numerical investigation of self-heating effects of oxide-confined vertical-cavity surface-emitting lasers", IEEE J. of Quantum Electron., Vol. 41, No. 1, pp. 15-25, Jan. 2005.
[6] O. Kibar, D. A. A. Blerkon, C. Fan, and S. C. Esener, "Power minimization and technology comparison for digital free-space optoelectronic interconnects", J. Lightw. Tech., Vol. 17, No. 4, pp. 546-555, Apr. 1999.
[7] A. Javro and S. M. Kang, "Transforming Tucker's Linearized Laser Rate Equations to a Form that has a Single Solution Regime", Journal of Lightwave Technology, Vol. 13, No. 9, pp. 1899-1904, September 1995.
[8] A. V. Krishnamoorthy and D. A. B. Miller, "Scaling optoelectronic-VLSI circuits into the 21st century: A technology roadmap", IEEE J. Select. Topics Quantum Electron., Vol. 2, pp. 55-76, Apr. 1996.
[9] Hoyeol Cho, Pawan Kapur, and Krishna C. Saraswat, "Power Comparison Between High-Speed Electrical and Optical Interconnects for Interchip Communication", Journal of Lightwave Technology, Vol. 22, No. 9, September 2004.
[10] J. J. Morikuni, A. Dharchoudhury, Y. Leblebici, and S. M. Kang, "Improvements to the Standard Theory for Photoreceiver Noise", Journal of Lightwave Technology, Vol. 12, No. 4, July 1994.
[11] Kyung-Hoae Koo, Hoyeol Cho, Pawan Kapur, and Krishna C. Saraswat, "Performance Comparisons Between Carbon Nanotubes, Optical, and Cu for Future High-Performance On-Chip Interconnect Applications", IEEE Transactions on Electron Devices, Vol. 54, No. 12, December 2007.
[12] C. L. Schow, J. D. Schaub, R. Li, J. Qi, and J. C. Campbell, "A 1-Gb/s Monolithically Integrated Silicon NMOS Optical Receiver", IEEE Journal of Selected Topics in Quantum Electronics, Vol. 4, No. 6, November/December 1998.
Determination of the Complex Dielectric Permittivity of Industrial Materials of the Adhesive Products for the Modeling of an Electromagnetic Field at the Level of a Glue Joint

Mahmoud Abbas1, Mohammad Ayache2

1 Department of Electronics, Lebanese University, Saida, Lebanon
2 Department of Biomedical, Islamic University of Lebanon, Khalde Highway, Lebanon
1. Introduction

The dielectric heating concerns dielectric bodies; a body that is a bad electrical conductor is generally also a bad conductor of heat. In general, such a body contains polar molecules or polar groups. These charges tend to align with the electric field within the material. In the case where an electric field at low frequency is imposed, alignment can occur with a lag, which represents a loss of electromagnetic energy and thus heating of the material. The choice of the working frequency is regulated to avoid interference with telecommunications; some bands are released for industrial, scientific and medical use (ISM). The interaction of electromagnetic waves and materials transforms electromagnetic energy into thermal energy, which is reflected in both the ionic conductivity and

λm = c / ( f · √( (ε'r/2)·(√(1 + tan²δ) + 1) ) )    (2)

which depends on the physical characteristics of the local environment and the frequency. δ is the loss angle and is given by:

tan δ = ε''r / ε'r    (3)
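Equation (3) is simple to evaluate. As a worked illustration, the sketch below computes the loss tangent and loss angle from a complex relative permittivity; the permittivity values are hypothetical, not measured data from this work:

```python
import math

def loss_tangent(eps_r_real, eps_r_imag):
    """Loss tangent of a dielectric, eq. (3): tan(delta) = eps''_r / eps'_r."""
    return eps_r_imag / eps_r_real

def loss_angle_deg(eps_r_real, eps_r_imag):
    """Loss angle delta, in degrees."""
    return math.degrees(math.atan(loss_tangent(eps_r_real, eps_r_imag)))

# Hypothetical adhesive at an ISM frequency such as 2.45 GHz
eps_real, eps_imag = 4.0, 0.4
print("tan(delta) =", loss_tangent(eps_real, eps_imag))
print("delta (deg) =", round(loss_angle_deg(eps_real, eps_imag), 2))
```

A larger ε''r at the glue line relative to the surrounding parts is what concentrates the microwave heating in the joint, as the conclusion below observes.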
Fig 4: Evolution of the temperature through the glue joint and the BMC

ρ·Cp·∂T/∂t - div( λ·grad T ) = Pd    (6)

σ: electrical conductivity
ρ: bulk density of the material
Pd: power absorbed by the material (W·cm-3)

Fig 5: Cavity with vacuum, mode TE013, slice of the electric field in the yoz plane in 3D

4. Conclusion

By studying the results produced through the mechanical dimensioning of the applicator, we conclude that the coupled modeling-experimentation approach constitutes a solid and effective basis for apprehending the problems of bonding under microwaves. We also verify the strong absorption of energy at the level of the glue seals (attenuated electric field), and that microwaves can polymerize the adhesives well, with reduced time and low energy consumption, without heating the pasted parts.

These measurement results on the dielectric parameters will give us numerical data and yield the thermal parameters using Maxwell's equations and the heat equation.
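The heat equation (6) can be stepped explicitly in one dimension as a sketch of how such thermal parameters might be computed; all material constants and the absorbed-power profile below are illustrative assumptions, not the paper's measured values:

```python
# Explicit 1D finite-difference sketch of eq. (6):
#   rho*Cp*dT/dt - div(lambda*grad T) = Pd
# for a glue line heated by absorbed microwave power Pd.
rho, cp, lam = 1200.0, 1500.0, 0.3   # kg/m^3, J/(kg.K), W/(m.K) (illustrative)
n, dx, dt = 51, 1e-3, 0.5            # grid points, 1 mm spacing, 0.5 s step
T = [20.0] * n                       # initial temperature (deg C)

def step(T, pd_profile):
    """One explicit Euler step; boundary nodes held at 20 C."""
    Tn = T[:]
    for i in range(1, n - 1):
        lap = (T[i+1] - 2*T[i] + T[i-1]) / dx**2   # discrete div(grad T)
        Tn[i] = T[i] + dt * (lam * lap + pd_profile[i]) / (rho * cp)
    return Tn

# Absorbed power Pd (W/m^3) concentrated in the central "glue" region,
# mimicking the strong absorption at the glue seal noted above.
pd = [1e6 if abs(i - n // 2) <= 2 else 0.0 for i in range(n)]
for _ in range(60):                  # 30 s of heating
    T = step(T, pd)
print(f"glue-line temperature after 30 s: {T[n // 2]:.1f} C")
```

The chosen step respects the explicit stability limit dt < dx²·ρ·Cp/(2λ), which is why the simple Euler update is sufficient for this sketch.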
Mahmoud Abbas received the Ph.D. degree in Electronics from the University of Bordeaux, France, in 1995. His research interests include modeling and optimization of microwave devices and electronic circuits.
4 Department of Computer Science & Engineering, Berhampur University, Berhampur, Odisha, India
Abstract

The efficient utilization of node energy in wireless sensor networks has been studied because sensor nodes operate with limited battery power. To extend the lifetime of wireless sensor networks, we reduce the energy consumption of the overall network while keeping the power use of all sensor nodes balanced. Since a large number of sensor nodes are densely deployed and interoperate in a wireless sensor network, the lifetime extension of a sensor network is maintained by keeping many sensor nodes alive. In this paper, we present a power-aware routing protocol for wireless sensor networks that increases network lifetime without degrading network performance. The proposed protocol is designed to avoid traffic congestion on specific nodes at data transfer and to distribute node power consumption widely, to increase the lifetime of the network. The performance of the proposed protocol has been examined and evaluated with the NS-2 simulator in terms of network lifetime and end-to-end delay.

Keywords: wireless sensor networks, power aware routing protocol, NS-2

1. Introduction

A wireless sensor network is one of the ad hoc wireless telecommunication networks, deployed over a wide area with tiny low-powered smart sensor nodes. An essential element in this environment, the wireless sensor network can be utilized in various information and telecommunication applications. The sensor nodes are small smart devices with wireless communication capability, which collect information from light, sound, temperature, motion, etc., process the sensed information and transfer it to other nodes.

A wireless sensor network is typically made of many sensor nodes for sensing accuracy and scalability of the sensing areas. In such a large-scale networking environment, among the most important networking factors are the self-organizing capability for adaptation to dynamic situation changes and the interoperating capability between sensor
nodes [1]. Many studies have shown that there are a variety of sensors used for gathering sensing information and efficiently transferring the information to the sink nodes.

The major issues of such studies are protocol design with regard to battery energy efficiency, localization schemes, synchronization, and data aggregation and security technologies for wireless sensor networks. In particular, many researchers take great interest in the routing protocols of the network layer, which consider self-organization capabilities, limited battery power, and data aggregation schemes [2, 3].

A wireless sensor network is densely deployed with a large number of sensor nodes, each of which operates with limited battery power while working with the self-organizing capability in the multi-hop environment. Since each node in the network plays both terminal-node and routing-node roles, a node cannot participate in the network if its battery power runs out. The increase of such dead nodes generates many network partitions, and consequently normal communication as a sensor network becomes impossible. Thus, an important research issue is the development of efficient battery-power management to increase the life cycle of the wireless sensor network [4].

In this paper, we propose an efficient energy-aware routing protocol, based upon the on-demand ad hoc routing protocol AODV [5, 6], which determines a proper path with consideration of node residual battery powers. The proposed protocol aims to extend the lifetime of the overall sensor network by avoiding the unbalanced exhaustion of node battery powers that occurs when traffic congestion arises on specific nodes participating in data transfer.

In section 2 of this paper, we describe the well-known AODV routing protocol and show some difficulties in adapting the protocol for wireless sensor networks. In section 3, we propose an efficient routing protocol which considers the node residual battery power while extending the lifetime of the network. Section 4 discusses the NS-2 simulation performance analysis of the routing protocols along with final conclusions and future studies.

2. Related Study and Problems

The AODV (Ad hoc On-demand Distance Vector) protocol is an on-demand routing protocol, which accomplishes the route discovery whenever a data transfer is requested between nodes. The AODV routing protocol searches for a new route only by request of source nodes. When a node requests a route to a destination node, it initiates a route discovery process among network nodes. The protocol can greatly reduce the number of broadcasts required for routing search processes when compared to the DSDV (Destination Sequenced Distance Vector) routing protocol, which is known to discover the optimum route between source and destination with path information of all nodes. Additionally, since each node in the DSDV routing protocol maintains a routing table whose data includes complete route information, the AODV protocol greatly improves some drawbacks of the DSR (Dynamic Source
Routing) protocol, such as the overhead incurred at data transfer.

Once a route is discovered in the AODV routing protocol, the route will be maintained in a table until it is no longer used. Each node in the AODV protocol contains a sequence number, which increases by one when the location of a neighbor node changes. The number can be used to determine the most recent route at the routing discovery.

Figure-1 (fig-2). Figure 2: A route-establishing flow between source and destination

3. Problem Formulation

3.1. Proposed Routing Protocol

In this paper, we describe a routing protocol which considers the residual battery power of nodes during the routing discovery and determines efficient routes between nodes.

Figure 3.1: The RREQ message format for our proposed protocol

To find a route to a destination node, a source node floods a RREQ packet to the network. When neighbor nodes
receive the RREQ packet, they update the Min-RE value and rebroadcast the packet to the next nodes, until the packet arrives at the destination node. When an intermediate node receives a RREQ message, it increases the hop count by one and replaces the value of the Min-RE field with the minimum energy value of the route. In other words, Min-RE becomes the energy value of the node if Min-RE is greater than the node's own energy value; otherwise Min-RE is unchanged.

Although intermediate nodes have route information to the destination node, they keep forwarding the RREQ message toward the destination, because they have no information about the residual energy of the other nodes on the route. When the destination node receives the first RREQ message, it triggers the data collection timer and receives all RREQ messages forwarded through other routes until the timer expires. After the destination node completes route information collection, it determines an optimum route using the formula shown in 3.2 and then sends a RREP message to the source node by unicasting. When the source node receives the RREP message, a route is established and data transfer starts. Such route processes are performed periodically, even though the node topology does not change, to keep node energy consumption balanced. That is, the periodic route discovery excludes nodes having low residual energy from the routing path and greatly reduces network partition.

3.3. Determination of routing

The optimum route is determined by using the value described in formula (1). The destination node calculates this value for all received route information and chooses the route that has the largest value. That is, the proposed protocol collects routes whose minimum residual energy is relatively large and whose hop count is small, and then determines a proper route among them, which consumes the minimum network energy compared to any other route.

Here Min-RE is the minimum residual energy on the route, No-Hops is the hop count of the route between source and destination, and k is the weight coefficient for the hop count. The energy consumption of one hop in the network is small, where one hop means a data transfer from a node to the next node. The weight coefficient k is used to adjust the difference between Min-RE and No-Hops in the simulation.

3.4. The analysis of routing protocols

To understand the operation of the proposed protocol, we consider three different routing protocols for comparison:

Case 1: Choose the route with the minimum hop count between source and destination (the AODV routing protocol).

Case 2: Choose the route with the largest minimum residual energy (the Max-Min Energy (Min-ER) routing protocol).

Case 3: Choose a route with a large minimum residual energy and a low hop count, i.e., with the longest network lifetime (our proposed routing protocol).
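The three selection rules can be sketched as follows. Since formula (1) is not reproduced in this excerpt, the Case 3 score is assumed to be Min-RE / (k · No-Hops), which is consistent with the worked example in the next section (route <S-C-E-I-D> scoring 1.25 for Min-RE = 5, 4 hops, and k = 1); the candidate list and its energy values are illustrative.

```python
# Candidate routes as (path, min_residual_energy, hop_count).
# Values are illustrative, chosen to mirror the paper's Figure 4 example.
routes = [
    (["S", "B", "J", "D"], 2, 3),                      # shortest path
    (["S", "A", "K", "F", "L", "H", "G", "D"], 6, 7),  # largest Min-RE
    (["S", "C", "E", "I", "D"], 5, 4),                 # proposed protocol's pick
]

K = 1.0  # weight coefficient k for the hop count

def case1(routes):
    """AODV: route with the minimum hop count."""
    return min(routes, key=lambda r: r[2])

def case2(routes):
    """Min-ER: route with the largest minimum residual energy."""
    return max(routes, key=lambda r: r[1])

def case3(routes):
    """Proposed: largest Min-RE / (k * No-Hops) score (assumed form of formula (1))."""
    return max(routes, key=lambda r: r[1] / (K * r[2]))

print(case1(routes)[0])  # ['S', 'B', 'J', 'D']
print(case2(routes)[0])  # ['S', 'A', 'K', 'F', 'L', 'H', 'G', 'D']
print(case3(routes)[0], case3(routes)[1] / (K * case3(routes)[2]))  # ['S', 'C', 'E', 'I', 'D'] 1.25
```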
Consider the network illustrated in Figure 4. Here we consider a simple routing example to set up a route from source node S to destination node D. The number written on a node represents the value of its residual energy. We consider three different cases of routes. Since Case 1 considers only the minimum hop count, it selects route <S-B-J-D>, which has a hop count of 3. In Case 2, route <S-A-K-F-L-H-G-D>, which has a Min-RE of 6, is chosen, because this route has the largest minimum residual energy among the routes. Our proposed model computes the value by using formula (1) and selects the route with the largest value. Thus Case 3 selects route <S-C-E-I-D>, which has the largest value of 1.25.

Figure 4. A sample network for establishment of routing paths.

Case 1 selects the shortest path without considering the residual energy of nodes, which is the same as the AODV routing algorithm. This case does not sustain a long lifetime in the network, as described in Section 2. Case 2 selects a route with the largest minimum residual energy to extend the network lifetime, but it has a serious problem in terms of the hop count. Case 3 improves the drawbacks of Case 1 and Case 2 by considering both residual energy and hop count. It extends the network lifetime by arranging for almost all nodes to take part in data transfer. The proposed protocol also selects a route with the longest lifetime in the network without performance degradation such as delay time and node energy consumption.

4. Performance Evaluation

The performance of the routing protocols is evaluated with the NS-2 simulator [7]. Our proposed protocol is compared to the other two routing protocols (Case 1 and Case 2) in terms of the average end-to-end delay and the network lifetime.

4.1. Simulation Environment

In this simulation, our experiment model performed on 100 nodes, which were randomly deployed in a 500 × 500 square meter area. We assume that all nodes have no mobility, since the nodes are fixed in most wireless sensor network applications. Simulations are performed for 60 seconds. We set the propagation model of the wireless sensor network to the two-ray ground reflection model and set the maximum transmission range of nodes to 100 meters. The MAC protocol is set to IEEE 802.11 and the channel bandwidth is set to 1 Mbps.

Each sensor node in the experimental network is assumed to have an initial energy level of 7 Joules. A node consumes 600 mW on packet transmission and 300 mW on packet reception. The traffic model used is a UDP/CBR traffic model. The data packet size is set to 512 bytes, and the traffic rate varies over 2, 3, 4, 5, 6, 7, 8, 9, and 10 packets/sec to compare performance depending on the traffic load. In this simulation, the weight coefficient k is calculated based on the traffic model, bandwidth, and energy consumption of a node.
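The per-link figures quoted in the next paragraph follow directly from the parameters above (512-byte packets, a 1 Mbps channel, 600 mW transmit power, 300 mW receive power). A quick check, assuming the per-hop energy figure is the sum of one node's transmit energy and the next node's receive energy over one packet time:

```python
PACKET_BYTES = 512
BANDWIDTH_BPS = 1_000_000  # 1 Mbps channel
TX_POWER_W = 0.600         # 600 mW on packet transmission
RX_POWER_W = 0.300         # 300 mW on packet reception

# Packet transmission time per link: bits divided by channel bandwidth.
t_link = PACKET_BYTES * 8 / BANDWIDTH_BPS
print(t_link)  # 0.004096 (seconds)

# Per-hop energy: one node transmits and the next receives for t_link.
e_hop = (TX_POWER_W + RX_POWER_W) * t_link
print(round(e_hop, 4))  # 0.0037 (Joule)
```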
Our simulation model uses a sensor network with a bandwidth of 1 Mbps and a packet size of 512 bytes. Thus, the packet transmission time per link is calculated as about 0.004096 seconds, and the per-hop node energy consumption for our simulation model is about 0.0037 Joule.

4.2. Simulation Results

The major performance metrics of a wireless sensor network are the end-to-end delay (or throughput) and the network lifetime. In order to compare the network lifetime of the three routing protocols, we measured the number of exhausted-energy nodes every second for 60 seconds. Figure 5 illustrates the number of exhausted nodes of each model over the simulation time. The vertical axis represents the number of exhausted-energy nodes in the network. An increase in exhausted-energy nodes may cause a network partition that makes network functions impossible. The exhausted-energy nodes in AODV (Case 1), Min-ER (Case 2), and our protocol start appearing at 35, 42, and 47 seconds, respectively. The number in these protocols saturates at 80% of the nodes at 45, 48, and 55 seconds, respectively. As shown in Figure 5, our proposed protocol has a longer lifetime duration than the other protocols. In particular, 60% of the nodes in our protocol work normally at the elapsed time of 55 seconds, compared to 20% in the other protocols. This result shows that our routing protocol properly leads to balanced energy consumption of sensor nodes.

Figure 5. Comparison of the number of exhausted-energy nodes.

Figure 6 gives the average end-to-end delay of all three protocols with respect to traffic load. The AODV protocol has the minimum delay and Min-ER has the maximum delay. The delay of our protocol was a little higher than that of AODV; our protocol has a relatively good delay characteristic without performance degradation compared to AODV.

Figure 6. End-to-end delay for traffic rate.

Based upon the simulation results, we confirmed that our proposed protocol can control the residual node energy and the hop count in a wireless sensor network and effectively extend the network lifetime without performance degradation.
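The lifetime bookkeeping used above (counting exhausted-energy nodes at each simulated second) can be sketched as follows. The node energies and per-second drain rates are made-up sample values; the paper's actual traces come from NS-2.

```python
# Illustrative lifetime bookkeeping: a node is "exhausted" once its residual
# energy reaches zero. Drain rates are hypothetical, not from the NS-2 scripts.
def exhausted_per_second(initial_energy, drain_per_sec, seconds):
    """Return, for each second t = 1..seconds, how many nodes have run out of energy."""
    counts = []
    for t in range(1, seconds + 1):
        residual = [e - d * t for e, d in zip(initial_energy, drain_per_sec)]
        counts.append(sum(1 for r in residual if r <= 0))
    return counts

# Three sample nodes starting at 7 J (the paper's initial energy level) with
# different drains; the most heavily loaded node exhausts first, at t = 28 s.
counts = exhausted_per_second([7.0, 7.0, 7.0], [0.25, 0.1, 0.05], seconds=60)
print(counts[26], counts[27])  # 0 1
```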
5. Conclusions

In this work, we proposed a power-aware routing protocol which improves the lifetime of sensor networks. The protocol considers both the hop count and the residual energy of nodes in the network. Based upon the NS-2 simulation, the protocol has been verified with very good performance in network lifetime and end-to-end delay. If we used a simulation model with a large number of nodes (1000 or more), our protocol would make the network lifetime much longer compared to the AODV and Min-ER protocols. Consequently, our proposed protocol can effectively extend the network lifetime without other performance degradation.

Applications in wireless sensor networks may require different performance metrics: some applications are focused on the lifetime of the network, others on delay. Efficient routing mechanisms tailored to specific applications may be needed in further studies.

References

[1] Ian F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, "A Survey on Sensor Networks," IEEE Communications Magazine, Vol. 40, Issue 8, pp. 102-114, Aug. 2002.

[2] K. Akkaya and M. Younis, "A Survey of Routing Protocols in Wireless Sensor Networks," Elsevier Ad Hoc Networks Journal, Vol. 3, No. 3, pp. 325-349, 2005.

[3] Q. Jiang and D. Manivannan, "Routing Protocols for Sensor Networks," Proceedings of CCNC 2004, pp. 93-98, Jan. 2004.

[4] Suresh Singh and Mike Woo, "Power-Aware Routing in Mobile Ad Hoc Networks," Proceedings of the 4th Annual ACM/IEEE International Conference on Mobile Computing and Networking, Dallas, Texas, pp. 181-190, 1998.

[5] Charles E. Perkins and Elizabeth M. Royer, "Ad hoc On-Demand Distance Vector Routing," Proceedings of the 2nd IEEE Workshop on Mobile Computing Systems and Applications, New Orleans, LA, pp. 90-100, February 1999.

[6] Charles E. Perkins, "Ad hoc On-Demand Distance Vector (AODV) Routing," RFC 3561, IETF MANET Working Group, July 2003.

[7] Information Sciences Institute, "The Network Simulator ns-2," http://www.isi.edu/nsnam/ns/, University of Southern California.

[8] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, "Wireless Sensor Networks: A Survey," Computer Networks, Vol. 38, No. 4, March 2002, pp. 393-422. doi:10.1016/S1389-1286(01)00302-4

[9] X. Zhao, K. Mao, S. Cai, and Q. Chen, "Data-Centric Routing Mechanism Using Hash-Value in Wireless Sensor Network," Wireless Sensor Network, Vol. 2, No. 9, 2010, pp. 703-709. doi:10.4236/wsn.2010.29086

BIOGRAPHY

First Author: Rajesh Kumar Sahoo is presently working as Assistant Professor at the Ajay Binay Institute of Technology, Cuttack, Orissa, India. He acquired his M.Tech degree from KIIT University, Bhubaneswar, Orissa, India. He is a research student of Berhampur University, Berhampur. He has contributed more than two papers to
Journals and Proceedings. He has written two books, Computer Architecture and Organization and Computer Architecture and Organization-II. His areas of interest are Software Engineering, Object-Oriented Systems, Sensor Networks, Computer Architecture, and Compiler Design.

Second Author:

Third Author: Dr. Durga Prasad Mohapatra studied for his M.Tech at the National Institute of Technology, Rourkela, India. He received his Ph.D. from the Indian Institute of Technology, Kharagpur, India. Currently, he is working as Associate Professor at the National Institute of Technology, Rourkela. His special fields of interest include Software Engineering, Discrete Mathematical Structures, Slicing, Object-Oriented Programming, Real-Time Systems, and Distributed Computing.

Fourth Author: Dr. Manas Ranjan Patra holds a Ph.D. degree in Computer Science from the Central University of Hyderabad. Currently, he heads the Department of Computer Science at Berhampur University. He is a life member of CSI, ISTE & OITS, and a Fellow of ACEEE. His special fields of interest are Intelligent Agents, Service-Oriented System Modeling, Data Mining, and Network Intrusion Detection.
IJCSI CALL FOR PAPERS SEPTEMBER 2011 ISSUE
Volume 8, Issue 5
The topics suggested by this issue can be discussed in terms of concepts, surveys, state of the
art, research, standards, implementations, running experiments, applications, and industrial
case studies. Authors are invited to submit complete unpublished papers, which are not under
review in any other conference or journal, in the following (but not limited to) topic areas.
See the authors' guide for manuscript preparation and submission guidelines.
Accepted papers will be published online and indexed by Google Scholar, Cornell
University Library, DBLP, ScientificCommons, CiteSeerX, Bielefeld Academic Search
Engine (BASE), SCIRUS, EBSCO, ProQuest and more.
All submitted papers will be judged based on their quality by the technical committee and
reviewers. Papers that describe on-going research and experimentation are encouraged.
All paper submissions will be handled electronically, and detailed instructions on the submission
procedure are available on the IJCSI website (www.IJCSI.org).
IJCSI
The International Journal of Computer Science Issues (IJCSI) is a well-established and notable venue
for publishing high quality research papers, as recognized by various universities and international
professional bodies. IJCSI is a refereed open access international journal for publishing scientific
papers in all areas of computer science research. The purpose of establishing IJCSI is to assist in the
development of science through fast publication and storage of the materials and results of
scientific research, and through representation of the scientific conception of society.
It also provides a venue for researchers, students and professionals to submit ongoing research and
developments in these areas. Authors are encouraged to contribute to the journal by submitting
articles that illustrate new research results, projects, surveying works and industrial experiences that
describe significant advances in the field of computer science.
Indexing of IJCSI
1. Google Scholar
2. Bielefeld Academic Search Engine (BASE)
3. CiteSeerX
4. SCIRUS
5. Docstoc
6. Scribd
7. Cornell's University Library
8. SciRate
9. ScientificCommons
10. DBLP
11. EBSCO
12. ProQuest
IJCSI PUBLICATION
www.IJCSI.org